Java heap size error related to search index (v11.1.5)

Started by yphilogene, 08 Jul 2020 12:30:15 AM

yphilogene

As I spent several depressing nights on this issue, I would like to share the solution now that it's solved.

We migrated from Cognos 10 to Cognos 11(.1.5) some weeks ago. After 1 week, we started to notice some instability. After 2 weeks, it was impossible to start the service.

When we tried to start the service, the java.exe process kept taking more and more memory until a javacore log file was created indicating a "Java heap size" error, along with a core DMP file (a snapshot of the Java heap at the moment of failure).

After some investigation with the help of IBM Cognos Support (always log a ticket with Support, by the way), we came to understand that there was an issue with the "search" index.

During the boot of the IBM Cognos service, the java.exe process was consuming a huge amount of memory while trying to "build" or "sync" the Search index files.

We applied the following procedure:
1) Stop Cognos <-- This has to be done as we are going to delete some files

Check that there isn't any java.exe process running anymore.
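
For anyone who prefers to script the java.exe check rather than watch Task Manager, here is a minimal sketch. It assumes a Windows server (the post mentions java.exe) and relies on the standard `tasklist` command with CSV output; the function names are hypothetical, not part of Cognos.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(output, image_name="java.exe"):
    """Return the PIDs found in `tasklist /FO CSV /NH` output for image_name."""
    pids = []
    for row in csv.reader(io.StringIO(output)):
        # Expected columns: image name, PID, session name, session number, memory usage.
        # The "INFO: No tasks are running..." message yields a single field and is skipped.
        if len(row) >= 2 and row[0].lower() == image_name.lower() and row[1].isdigit():
            pids.append(int(row[1]))
    return pids


def running_java_pids():
    """Windows only: ask tasklist which java.exe processes are still alive."""
    result = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq java.exe", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_tasklist_csv(result.stdout)
```

If `running_java_pids()` returns an empty list, no java.exe is left. Keep in mind that other Java applications on the same server would also show up here, so a non-empty list is not necessarily Cognos.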

2) Backup/zip and delete the content of some folders:
- Folder <Cognos>/data/search <--- This folder contains search index files
- Folder <Cognos>/temp <--- Not really related to the issue but let's start clean
- Folder <Cognos>/logs <-- Not really related to the issue but will help to follow the steps during the start of the service

The backup/zip of these folders is just a precaution. Once they are backed up, you can delete the content of the mentioned folders.
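
The backup-then-delete step above can be sketched in a few lines of Python. This is only an illustration: the helper name `backup_and_clear` and the paths in the usage note are hypothetical, and `<Cognos>` stays a placeholder for your own install location. Cognos must already be stopped.

```python
import shutil
from pathlib import Path


def backup_and_clear(folder, backup_dir):
    """Zip `folder` into `backup_dir`, then delete the folder's content.

    The folder itself is kept. `backup_dir` must NOT be inside `folder`,
    otherwise the archive would be deleted together with the content.
    """
    folder = Path(folder)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # Creates <backup_dir>/<folder name>.zip containing everything under `folder`.
    archive = shutil.make_archive(str(backup_dir / folder.name), "zip", root_dir=str(folder))

    # Remove the content only after the archive has been written.
    for entry in folder.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return Path(archive)
```

Usage would look like `backup_and_clear(r"<Cognos>\data\search", r"D:\cognos_backup")`, repeated for the temp and logs folders.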

3) Open Cognos Configuration and set the maximum JVM memory for WebSphere Liberty Profile so that it has enough memory to recreate the Search index. The default value in Cognos 11 is 4 GB or 8 GB, but if you have enough RAM on your server, do not hesitate to set a higher value. For example, I had 32 GB of RAM on my server, so IBM Cognos Support asked me to set 24 GB (24*1024 = 24576 MB).
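
As a quick sanity check on the number to type in, here is a small sketch of the conversion. It assumes the field in Cognos Configuration expects megabytes, which is what the "*1024" in the example suggests; `heap_setting_mb` is a hypothetical helper name.

```python
def heap_setting_mb(heap_gb, server_ram_gb):
    """Convert a heap size in GB to MB, refusing values that do not fit in RAM."""
    if heap_gb >= server_ram_gb:
        # The JVM heap must leave room for the OS and other processes.
        raise ValueError("heap size must be smaller than the server's physical RAM")
    return heap_gb * 1024


# 24 GB of heap on a 32 GB server, as in the example above.
print(heap_setting_mb(24, 32))  # prints 24576
```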

Save the configuration, then start the service.

You should see the java.exe process starting to use more and more memory.

In cognosserver.log you should at some point see the start of the index sync. At this point, check the folder <Cognos>/data/search <--- There should be a subfolder called "collection" and the size of this folder should start to grow.
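
To watch the folder grow without refreshing Explorer, a minimal sketch like the one below can be used. The function name is hypothetical and the path is whatever your `<Cognos>/data/search` folder resolves to.

```python
import os


def folder_size_bytes(path):
    """Return the total size in bytes of all files under `path`, recursively."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # A file may disappear or be locked while the index is being written.
                pass
    return total
```

Calling it every minute or so and comparing the results shows whether the index is still being built. A size that stops changing is only a hint, so the log remains the reference for knowing when indexing has finished.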

The Search index/sync may take some time <--- Don't panic

It may even continue after the IBM Cognos service has started <-- Wait until you read in cognosserver.log that the index creation is finished (it should say how many objects were indexed and how much time it took).

Of course, check that java.exe managed to create the index without reaching the maximum JVM memory for WebSphere Liberty Profile. For example, I set it to 24 GB, but in the end the creation of the index "only needed" up to 11 GB.

Once the Search index is created, the problem is solved (well, at least this one).

You can then stop Cognos again, set a smaller value for the maximum JVM memory for WLP (WebSphere Liberty Profile), save and start Cognos again.

This time, Cognos won't try to create the Search index again as it already exists.


So, in summary, this process consisted of recreating the Search index.

As for why I ended up with this issue, it's hard to tell. Maybe it's because I did several content store imports during the migration project and never cleaned the data/search folder. Maybe some old index files were disturbing the sync and causing a Java memory leak.



Please also note that an advanced Content Manager setting exists to "control" the tempo of indexing a little (for new reports/objects created). Setting this property may help, but it didn't solve my issue. I had to delete the content of the data/search folder and have the index recreated.

I hope this post will save some of you from spending long nights on this kind of issue.