System flooding with Failed Requests on Presentation Service

Started by paulpeetersnl, 21 Feb 2019 03:54:37 AM


paulpeetersnl

In August 2018 we migrated our 5 Cognos environments from 10.2.1 to 11.0.7. At the moment we are on 11.0.11.
Since this migration we have encountered the same issue on all our environments:
At an arbitrary moment there is a sudden rise in Failed Requests on the Presentation service, which leads to up to 200,000 failed requests per hour, 24/7.
As a result, at some point the rate of Failed Requests doubles and floods the CPU of the Content Manager server. The only way to stop this is to stop Cognos and restart the system.

Since September IBM has been trying to solve this problem, but so far they are stuck on our hardware and the question of whether we are allocating enough resources. To us, this isn't the real issue.

Our environment looks like this: Content Manager and dispatcher 1 on one server, and dispatcher 2 on a different server.

The issue itself:
We can't find any correlation between the issue and a user, a report, a package, a data source, a dynamic cube, an environment, a certain time, or anything else.

The only correlation we found, so far:
When this issue occurs, we find the following in the PresentationService metrics:
The likely start of the issue is: current time minus 'Response Time High Watermark'

At this exact moment we find an error in cogserver.log which includes: '..Failure   RSV-SRV-0066 A soap fault has been returned..  '
This happens EVERY time the issue occurs, dozens of times so far.

Furthermore: the SESSIONID in the above error in cogserver.log belongs to a SESSION which expired long before (sometimes a few hours, sometimes almost a full day).
This also happens EVERY time: an expired SESSION which seems to come alive again.
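For anyone who wants to check for the same pattern, here is a rough Python sketch of the correlation described above. It is only an illustration under assumptions: the `LogEntry` structure, the `session_expiry` lookup and the one-minute window are hypothetical, and parsing cogserver.log into these entries is left out because the exact log layout is not shown here.

```python
from datetime import datetime, timedelta
from typing import Dict, List, NamedTuple, Optional, Tuple


class LogEntry(NamedTuple):
    """Hypothetical, already-parsed cogserver.log line."""
    timestamp: datetime
    session_id: str
    message: str


def estimate_issue_start(now: datetime, high_watermark: timedelta) -> datetime:
    """Likely start of the issue: current time minus the high watermark."""
    return now - high_watermark


def find_trigger_entries(
    entries: List[LogEntry],
    start: datetime,
    session_expiry: Dict[str, datetime],
    window: timedelta = timedelta(minutes=1),  # assumed tolerance
) -> List[Tuple[LogEntry, Optional[timedelta]]]:
    """Return RSV-SRV-0066 entries near `start`, each paired with how long
    its session had already been expired (None if unknown or not expired)."""
    hits = []
    for entry in entries:
        if "RSV-SRV-0066" not in entry.message:
            continue
        if abs(entry.timestamp - start) > window:
            continue
        expired_at = session_expiry.get(entry.session_id)
        expired_for = None
        if expired_at is not None and expired_at < entry.timestamp:
            expired_for = entry.timestamp - expired_at
        hits.append((entry, expired_for))
    return hits
```

If the pattern holds, every flood should produce at least one hit whose session had already been expired for hours at the time of the error.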

We also noticed quite a number of failed logons on a SESSIONID shortly after a successful logon on the same SESSIONID.

Does this ring a bell with anyone?




rd152343

Hi,

Are you using custom authentication through CAP code?
We were facing a similar issue. We had to change our Dispatcher URIs for the Gateway to point to bi/v1/disp and not to p2pd/servlet.
We also changed the Dispatcher URI for external applications.
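A quick way to audit this is to list every dispatcher URI you have configured and flag the ones still using the old path. The Python sketch below is only an illustration: the two path fragments come from this post, while the function names and the example host are made up.

```python
from typing import List
from urllib.parse import urlsplit

LEGACY_PATH = "/p2pd/servlet"  # old-style dispatcher path
CURRENT_PATH = "/bi/v1/disp"   # path we switched to


def classify_dispatcher_uri(uri: str) -> str:
    """Classify a dispatcher URI as 'legacy', 'current' or 'unknown'."""
    path = urlsplit(uri).path
    if path.startswith(LEGACY_PATH):
        return "legacy"
    if path.startswith(CURRENT_PATH):
        return "current"
    return "unknown"


def find_legacy_uris(uris: List[str]) -> List[str]:
    """Return the URIs that still point at the old servlet path."""
    return [u for u in uris if classify_dispatcher_uri(u) == "legacy"]


# Hypothetical host and port, for illustration only.
configured = [
    "http://cognos-host:9300/p2pd/servlet/dispatch",
    "http://cognos-host:9300/bi/v1/disp",
]
for uri in find_legacy_uris(configured):
    print("still legacy:", uri)
```

Remember to include the URIs used by external applications in the list, not only the Gateway ones.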

Please let me know when you are able to resolve the issue.

Thanks!

paulpeetersnl

Hi,

Thanks, we will discuss this with our technical Cognos support.

Did you open a call about this with IBM?

Thanks so far, I will keep you posted.

Paul Peeters

paulpeetersnl

One more question

What do you mean by CAP code?

Thx, regards Paul Peeters

MFGF

Quote from: paulpeetersnl on 05 Mar 2019 09:41:27 AM
One more question

What do you mean by CAP code?

Thx, regards Paul Peeters

I think it stands for Custom Authentication Provider. I have also heard it referred to as CJAP where the provider code is written in Java.

Cheers!

MF.
Meep!

rd152343

Hi Paul Peeters,

Yes, CAP stands for Custom Authentication Provider.

Our server team had not updated the Gateway/Dispatcher URI configuration, even though IBM changed the URI references after 11.0.3. As a result, there was a heavy load on Content Manager, with a lot of requests failing.

Once we modified the URIs, the issue went away.

Please do keep us posted on your findings.

Thanks!

paulpeetersnl

Update:

Since March 13 we have been on Cognos 11.0.13.1. Since this update we haven't encountered any issues with failed requests so far.

It is too soon to be absolutely sure, but we are hopeful.

With regards, Paul Peeters

gohabsgo

@paulpeetersnl, was one of the symptoms that users were not able to log in successfully while this was happening?

We're seeing some weird things with CJAP on 11.0.6 where Cognos loses its reference to the user (they show as unavailable) and some users cannot log in successfully. Once we cycle the service, it re-syncs itself and users can get in with no issues.

If we run a consistency check during this odd period, the CM reports the users as orphaned (even though they're not).