BIBusTKServerMain Memory Issues

Started by juetron, 05 Apr 2007 06:43:10 AM


juetron

Cognos Version : 8.1.209.90
OS : AIX 5.2
App Server : WebSphere 5.1
Env Details : Distributed. 8 servers.

We are seeing some unusual behaviour regarding the BIBusTKServerMain processes on our Production environment. The evidence suggests that the BIBusTKServerMain processes are leaking memory. The first symptom is that users begin to receive errors like the following:

xxx.x.xx.xx:9310        446654  2007-04-05 10:53:03.967 +0      86579Df63274f9B0101124BF65A8fAB4915DC0DC        wwClCsd28CC24lhywldsqlwwh9GMqMjq4jy94y22        98yqCMj9Gwhdlqldwwv4lM2M8lsldlC4sqjMyl4w                515     RSVP    1365    1       Audit.RTUsage.RSVP      Response                        Failure RQP-DEF-0177 An error occurred while performing operation 'sqlOpenResult' status='-28'.UDA-SQL-0114 The cursor supplied to the operation "APICursor::OpenResult" is inactive.UDA-SOR-0001 Unable to allocate memory.

Note the PID of 446654. If you review the Cognos server log, you will see that this single BIBus process is the source of all the error messages. If you list the processes running on the server, you may find that the problem BIBus process is the one that has been running the longest:

wasadmin  323696  999634   0 12:53:29      -  0:43 /usr/cognos/LASCOGRS2UK/bin/BIBusTKServerMain threads=10 camssl=false COG_ROOT=/usr/cognos/LASCOGRS2UK cam=true idleTimeLimitSec=900
wasadmin  446654 1376322   0 21:09:23      - 168:38 /usr/cognos/LASCOGRS1UK/bin/BIBusTKServerMain threads=10 camssl=false COG_ROOT=/usr/cognos/LASCOGRS1UK cam=true idleTimeLimitSec=900
wasadmin  561210 1376322   3 04:43:11      - 69:48 /usr/cognos/LASCOGRS1UK/bin/BIBusTKServerMain threads=10 camssl=false COG_ROOT=/usr/cognos/LASCOGRS1UK cam=true idleTimeLimitSec=900
wasadmin  901142 1376322   0 12:49:28      -  1:09 /usr/cognos/LASCOGRS1UK/bin/BIBusTKServerMain threads=10 camssl=false COG_ROOT=/usr/cognos/LASCOGRS1UK cam=true idleTimeLimitSec=900
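The log review described above can be sketched in a few lines of Python. This is only a rough sketch: it assumes that the PID is the second whitespace-separated field of each log line (as in the sample line earlier in this post) and that the memory failure can be recognised by the UDA-SOR-0001 code. The sample lines are hypothetical and heavily shortened; they are not real log output.

```python
from collections import Counter

def count_errors_by_pid(lines, marker="UDA-SOR-0001"):
    """Count log lines containing `marker`, grouped by PID.

    Assumption: the PID is the second whitespace-separated field.
    """
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            counts[fields[1]] += 1
    return counts

# Hypothetical, shortened sample lines for illustration only.
SAMPLE_LINES = [
    "xxx.x.xx.xx:9310 446654 2007-04-05 10:53:03.967 +0 Failure UDA-SOR-0001 Unable to allocate memory.",
    "xxx.x.xx.xx:9310 446654 2007-04-05 10:53:09.120 +0 Failure UDA-SOR-0001 Unable to allocate memory.",
    "xxx.x.xx.xx:9310 561210 2007-04-05 10:53:10.455 +0 Success",
]

if __name__ == "__main__":
    for pid, count in count_errors_by_pid(SAMPLE_LINES).most_common():
        print(pid, count)
```

If one PID dominates the tally, that is the process to check in the process listing.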

And finally, if you run a ps aux xxxxxx command against the BIBus process PID, you will find that it is using a large amount of memory, far more than the other BIBus processes:

wasadmin  323696  0.0  1.0 406684 138060      - A    12:53:29  0:43 /usr/cognos/LAS
wasadmin  446654  5.1 13.0 2484512 1918664      - A    21:09:23 168:12 /usr/cognos/LAS
wasadmin  561210  4.7  7.0 1106424 1036456      - A    04:43:11 68:55 /usr/cognos/LAS
wasadmin  901142  0.0  1.0 345220 116508      - A    12:49:28  1:09 /usr/cognos/LAS

In this case it is displayed as a percentage. 13% of the available memory is approximately 2.4 GB. We have seen in other posts that C++ processes cannot occupy more than 2.4 GB of memory, though that depends on the OS. That would be consistent with the problem BIBus process refusing to respond to any further user requests.
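To make the comparison easier to repeat, here is a minimal Python sketch that parses the ps aux lines above and picks out the largest process. It assumes the columns are USER, PID, %CPU, %MEM, SZ, RSS, TTY, STAT, STIME, TIME, COMMAND, and that SZ and RSS are reported in 1 KB units; check both against your own ps output before relying on it. Under those assumptions the SZ value of 2484512 for PID 446654 works out to roughly 2.37 GiB, which matches the 2.4 GB figure.

```python
KB_PER_GIB = 1024 * 1024

# The four ps aux lines quoted in this post.
PS_LINES = [
    "wasadmin  323696  0.0  1.0 406684 138060      - A    12:53:29  0:43 /usr/cognos/LAS",
    "wasadmin  446654  5.1 13.0 2484512 1918664      - A    21:09:23 168:12 /usr/cognos/LAS",
    "wasadmin  561210  4.7  7.0 1106424 1036456      - A    04:43:11 68:55 /usr/cognos/LAS",
    "wasadmin  901142  0.0  1.0 345220 116508      - A    12:49:28  1:09 /usr/cognos/LAS",
]

def parse_ps_line(line):
    """Extract PID, %MEM, SZ and RSS from one ps aux line.

    Assumption: SZ is field 5 and RSS is field 6, both in KB.
    """
    fields = line.split()
    return {
        "pid": fields[1],
        "pct_mem": float(fields[3]),
        "sz_gib": int(fields[4]) / KB_PER_GIB,
        "rss_gib": int(fields[5]) / KB_PER_GIB,
    }

def largest_by_size(lines):
    """Return the parsed entry with the largest SZ value."""
    return max((parse_ps_line(line) for line in lines), key=lambda p: p["sz_gib"])

if __name__ == "__main__":
    for proc in sorted(map(parse_ps_line, PS_LINES), key=lambda p: p["sz_gib"], reverse=True):
        print(f"{proc['pid']}  %MEM={proc['pct_mem']:.1f}  SZ={proc['sz_gib']:.2f} GiB  RSS={proc['rss_gib']:.2f} GiB")
```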

The only way to stop the errors coming out is to restart the JVM or kill the individual BIBus process. Cognos Support are engaged but have yet to find a resolution.

Has anyone else experienced this issue, or is anyone experiencing it now?






COGNOiSe administrator

Try to create two Environment Variables in Server Administration:

DISP.BatchProcessUseLimit
DISP.InteractiveProcessUseLimit

Set them to 10 or 20, to recycle a dispatcher every N requests. Let me know if it helped to resolve the issue.

juetron

Thanks for the suggestion. So these two properties control how many times a connection can be retrieved from a report server process, which roughly translates to how many 'executes' and 'page up/downs' can be done by the process. A complete report execution is counted as 1 request. Other, simpler requests like page down, get output, etc. are also counted as 1 request.

Once the use limit has been reached, no more non-affine requests will be routed to the process. When the idle limit is exceeded for this expired process, it will be destroyed. The process is destroyed based on the idle process check interval located in the reportservice.xml file.

Based on this information :

a) Do these settings need to be set on each Report Server?
b) Our daily usage peak runs at somewhere between 800 and 1200 report server requests per minute. If we force the number of requests each report server process can handle to only 10 or 20, what will be the impact on server resources in terms of having to stop/start far more BIBus processes? If I'm understanding this correctly, at a level of 10, that could mean between 80 and 120 BIBus processes being stopped/started each minute.
c) I'm assuming we would need to restart the individual report service to pick up any changes. In our distributed environment that has to be performed at the JVM level rather than the Admin console (you can't stop individual services via the admin console in a distributed AIX env).
d) Perhaps it would be more suitable to put the setting at a higher value and decrease it over time whilst monitoring the environment? I think Cognos recommend a value of 500.
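The arithmetic behind point b) can be sketched as follows. This is a rough estimate only: it assumes every request counts as exactly 1 against the use limit and that load is spread evenly, so the number of processes retired per minute is simply requests per minute divided by the limit. The function and variable names are made up for illustration.

```python
def recycles_per_minute(requests_per_minute, use_limit):
    """Estimate how many BIBus processes reach their use limit each minute."""
    if use_limit <= 0:
        raise ValueError("use_limit must be positive")
    return requests_per_minute / use_limit

PEAK_RANGE = (800, 1200)                  # requests per minute at daily peak
CANDIDATE_LIMITS = (10, 20, 100, 250, 500)

def recycle_table(peak_range=PEAK_RANGE, limits=CANDIDATE_LIMITS):
    """Map each candidate limit to its (low, high) recycles-per-minute estimate."""
    return {
        limit: tuple(recycles_per_minute(rate, limit) for rate in peak_range)
        for limit in limits
    }

if __name__ == "__main__":
    for limit, (low, high) in recycle_table().items():
        print(f"limit={limit}: {low:g} to {high:g} processes per minute")
```

On these assumptions a limit of 10 gives the 80 to 120 restarts per minute mentioned above, while a limit of 500 brings that down to about 2 per minute.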

COGNOiSe administrator

If you group your servers in folders, then you only need to change it once and it will affect all servers in that folder equally. In your scenario, 10 might be too low; try 100 or 250 instead. Find your sweet spot where performance is acceptable and no more errors occur.

Are you saying that in AIX you cannot restart a service through Server Administration? Bummer!

juetron

We do have another call open with Cognos regarding restarting services in the Admin console. In our configuration (which has been approved by Cognos), if you attempt to stop, for example, a report service, the entire environment becomes unresponsive. Using one of our monitoring tools, you can see that the WebSphere threads suddenly increase and hit the maximum number of permitted threads for the server. Everything hangs and you have to restart at the JVM level to recover the service. Most annoying....

Will look into those settings and post the results. Cheers.


COGNOiSe administrator

Keep us posted if you can. Bugs like these cause the most sleepless nights!