Java Garbage collection causing report server slowdowns

jmwhitehead · 10 Sep 2009 06:51:34 AM

Our current Cognos installation is on two Sun Solaris zones running on an M4000 server. The first zone has the gateway and an application tier installed, and the second has the content manager and another application tier.

In general the CPU usage of the Cognos Java process on either zone is below 1%, except roughly once an hour when garbage collection occurs. Then CPU usage rises to around 5% and the report service becomes slow for end users who are logging in or navigating around Cognos Connection. When the garbage collection finishes, the portal is a lot quicker and CPU usage returns to its sub-1% level.

Has anyone else noticed this type of behaviour?

I've followed the instructions in the Cognos Java tuning doc to turn on GC logging, which is how I identified the issue, but it doesn't offer any advice on how to speed up the GC process or make it happen less frequently.

I did reduce the memory allocated from 1152 to 768, but apart from making the GC more frequent it didn't alleviate the service slowdowns. I'm now thinking I should go the other way and make it a lot bigger, perhaps 2048, to see if that reduces the frequency of the GC even if it doesn't remove the slowdowns when it happens.

Also in the GC logs I get a lot of entries like this :-

141161.002: [GC 141161.002: [ParNew (promotion failed): 195665K->195665K(196480K), 0.0049551 secs]141161.007: [CMS (concurrent mode failure): 549611K->550081K(589824K), 1.1084267 secs] 745261K->550081K(786304K), 1.1136433 secs]

when the system is experiencing its higher CPU usage and slower portal performance.

Jonathan

kolonell · 10 Sep 2009 03:25:16 PM

Quote
In general the CPU usage of the Cognos Java process on either zone is below 1%, except roughly once an hour when garbage collection occurs. Then CPU usage rises to around 5% and the report service becomes slow for end users who are logging in or navigating around Cognos Connection. When the garbage collection finishes, the portal is a lot quicker and CPU usage returns to its sub-1% level.

That looks like a full garbage collection is occurring. A full collection almost freezes all other threads in the Java process, hence the freeze in performance.

Quote
141161.002: [GC 141161.002: [ParNew (promotion failed): 195665K->195665K(196480K), 0.0049551 secs]
141161.007: [CMS (concurrent mode failure): 549611K->550081K(589824K), 1.1084267 secs]
745261K->550081K(786304K), 1.1136433 secs]

The "promotion failed" entry means the collector can't promote objects from the new generation to the tenured generation (lack of space, fragmentation, ...). When that occurs, the collection falls back to a full collection.
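To make the numbers in that entry easier to read, here is a minimal sketch that pulls the per-generation figures out of one log line. The class `GcLogEntry` and its regular expression are hypothetical helpers written for this thread, not part of Cognos or the JDK, and the pattern only assumes the log layout shown in the quoted entry; other JVM versions or logging flags may format lines differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Cognos or the JDK): extracts the
// per-generation segments from one GC log entry like the one quoted above.
final class GcLogEntry {

    static final class Segment {
        final String name;     // collector label, e.g. "ParNew" or "CMS"
        final String note;     // text in parentheses, e.g. "promotion failed"; "" if absent
        final long beforeK;    // occupancy before the collection, in KB
        final long afterK;     // occupancy after the collection, in KB
        final long capacityK;  // size of the generation, in KB
        final double seconds;  // time reported for this segment

        Segment(String name, String note, long beforeK, long afterK,
                long capacityK, double seconds) {
            this.name = name;
            this.note = note;
            this.beforeK = beforeK;
            this.afterK = afterK;
            this.capacityK = capacityK;
            this.seconds = seconds;
        }

        // Fraction of the generation still occupied after the collection.
        double occupancyAfter() {
            return (double) afterK / capacityK;
        }
    }

    // Matches "[Name (note): 123K->456K(789K), 0.123 secs]"; the note is optional.
    private static final Pattern SEGMENT = Pattern.compile(
            "\\[(\\w+)(?: \\(([^)]*)\\))?: (\\d+)K->(\\d+)K\\((\\d+)K\\), ([\\d.]+) secs\\]");

    static List<Segment> parse(String entry) {
        List<Segment> segments = new ArrayList<>();
        Matcher m = SEGMENT.matcher(entry);
        while (m.find()) {
            String note = m.group(2) == null ? "" : m.group(2);
            segments.add(new Segment(m.group(1), note,
                    Long.parseLong(m.group(3)), Long.parseLong(m.group(4)),
                    Long.parseLong(m.group(5)), Double.parseDouble(m.group(6))));
        }
        return segments;
    }

    public static void main(String[] args) {
        // The entry posted earlier in this thread.
        String entry = "141161.002: [GC 141161.002: [ParNew (promotion failed): "
                + "195665K->195665K(196480K), 0.0049551 secs]141161.007: "
                + "[CMS (concurrent mode failure): 549611K->550081K(589824K), "
                + "1.1084267 secs] 745261K->550081K(786304K), 1.1136433 secs]";
        for (Segment s : parse(entry)) {
            System.out.printf("%s (%s): %dK -> %dK of %dK, %.0f%% full afterwards, %.3f s%n",
                    s.name, s.note, s.beforeK, s.afterK, s.capacityK,
                    100.0 * s.occupancyAfter(), s.seconds);
        }
    }
}
```

Run against the entry from this thread, the CMS segment shows the tenured generation still about 93% full after a collection of roughly 1.1 seconds, slightly fuller than before it started, which fits the picture of a full collection that reclaims almost nothing.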

Try increasing the max heap to 1500M (-Xmx) and the initial heap as well (-Xms).
How much memory do the zones have for themselves (not shared)?

jmwhitehead · 14 Sep 2009 08:03:51 AM

Hi Kolonell,

Thanks for your reply. I had thought I might increase the heap size to something >= 1.5 GB; as it's on a production machine I can only tweak it during off hours, so I might try that this evening. One question: I know I can change the -Xmx via cogconfig, but there is no parameter entry in my bootstrap_solaris.xml file for the -Xms setting. Is this something I should be adding? I notice that it is there as a parameter in the cbs_cnfgtest_solaris.xml file.

Both zones have 4 GB allocated each.

I have recently become aware of a tool called jvisualvm and using this I can see that the poor performance is almost certainly caused during periods of heavy garbage collection activity.

kolonell · 14 Sep 2009 10:13:36 AM

You can just add that as a <param></param> switch. If it is in the cfgtest_solaris.xml file, then you can just copy the option from there and set it to 1500 (or something similar).
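One way to sanity-check that heap switches are being picked up is to print what a JVM reports about itself. The sketch below is only an illustration: `HeapSettings` is a hypothetical class name, and the program describes the JVM it is run in (for example one started by hand with the same -Xms/-Xmx values), not the running Cognos process. It uses the standard `Runtime` and `RuntimeMXBean` calls.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: prints the heap limits and startup flags of the JVM
// it runs in. It does not inspect the Cognos process itself.
final class HeapSettings {

    // Converts a byte count to whole megabytes.
    static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    static String summary() {
        Runtime rt = Runtime.getRuntime();
        // Flags passed to this JVM at startup, e.g. -Xms... and -Xmx... if set.
        List<String> flags = ManagementFactory.getRuntimeMXBean().getInputArguments();
        return "max heap MB=" + toMb(rt.maxMemory())
                + ", committed heap MB=" + toMb(rt.totalMemory())
                + ", JVM flags=" + flags;
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

The maximum the JVM reports may come out slightly below the -Xmx value, so treat the figure as approximate.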

Quote
I have recently become aware of a tool called jvisualvm and using this I can see that the poor performance is almost certainly caused during periods of heavy garbage collection activity.

If that is the case solving this could be a very tricky one .. what is the memory usage of the Java process when the slow down occurs ?

jmwhitehead · 15 Sep 2009 04:45:12 AM

I've tweaked the memory parameter. I did set it to 2048 originally, but that seemed to cause the application tier Java process to crash and restart, so I've reduced it to 1536 to see if it stays alive at that setting.

When I first encountered this issue I opened an SR with IBM support, and they sent me a link to a doc that showed me how to set up GC logging. One of the parameters it says to set is the -Xingc switch. I didn't know what this was, but I've since done quite a bit of reading on Java GC, and it seems to me that on a multi-processor system I shouldn't be setting this. Or am I wrong on this? (I could very well be, as I'm a relative novice at JVM tuning.)

kolonell · 15 Sep 2009 05:55:49 AM

Quote
one of the parameters in this it says to set is the -Xingc switch,

Never heard of that switch before (and neither has Google, apparently). Can you forward the link? One can never know enough ;-)

jmwhitehead · 16 Sep 2009 07:01:58 AM

This is the link to the doc on the IBM support site :-

http://download.boulder.ibm.com/ibmdl/pub/software/dw/dm/cognos/performance/cognos_specific/java_garbage_collection.pdf

and here's a Sun doc about tuning GC :-

http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.htm

Since yesterday I've taken the -Xincgc param out of the bootstrap files, and the JVMs are now using the throughput GC instead of the concurrent low-pause collector that the parameter activates. With that change, and with the VM memory size increased to 1536 MB, everything seems to be going okay at the moment.
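For anyone wanting to confirm which collectors a given set of switches actually selects, a small sketch along these lines can help. `GcReport` is a hypothetical class name; it uses the standard `GarbageCollectorMXBean` interface and reports only on the JVM it runs in, so the idea is to start it by hand with the flags under test. The collector names it prints depend on the JVM version and flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the garbage collectors active in the JVM it
// runs in, with how often each has run and the total time spent.
final class GcReport {

    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means "undefined".
            lines.add(gc.getName() + ": collections=" + gc.getCollectionCount()
                    + ", total ms=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```

Comparing the output with and without a switch shows whether the switch changes the collector, and the cumulative counts and times give a rough view of how much work each collector is doing.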

COGNOiSe.com - The IBM Cognos Community