Memory leaking in Cognos 8.3 BI

You know why I’ve started this blog in the first place? To get advice, of course. And today DesiCresnet shared a wonderful piece of information concerning memory troubles in BI 8.3.

Those who aren’t struck by the occasional CAM-AAA-0071 “An internal error occurred”, which stops the whole server, can skip this post.
This error was discussed at:

http://www.ibm.com/developerworks/forums/thread.jspa?threadID=244357

http://www.cognoise.com/community/index.php?topic=4971.0

The story goes like this:
- java.exe (the process that powers BI) grows to the point where it cannot obtain contiguous memory from the server, and crashes
- Cognos BI doesn’t recognize this error, so it throws a generic CAM-AAA-0071
- Cognos BI is inaccessible until restart

This error occurs more often in heavy-usage environments. In one of our projects, with lots of event-triggered admin links and reports, it took only a couple of hours before it struck.

The solution is not complete, but as I see it now:
1) Put the latest SP on BI (or a special hotfix, if one is available)
2) Lower the memory available to the Cognos BI service (to 768 MB)
3) Modify the HeapDecommitFreeBlockThreshold registry property
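
A minimal sketch of what step 3 could look like, written as a small Python helper that only builds the text of a .reg file (it does not touch the registry itself). The key path and the 0x00040000 value are assumptions taken from general Windows guidance on this setting, not from this post or the hotfix notes; the helper name build_reg_file is made up for illustration. Check the actual value with support before importing anything.

```python
# Sketch: build the text of a .reg file for the HeapDecommitFreeBlockThreshold change.
# ASSUMPTION: the key path and the 0x00040000 value are taken from general
# Windows guidance, not from this post; confirm both before using them.

KEY_PATH = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager"
VALUE_NAME = "HeapDecommitFreeBlockThreshold"
ASSUMED_THRESHOLD = 0x00040000  # placeholder value, in bytes


def build_reg_file(threshold: int = ASSUMED_THRESHOLD) -> str:
    """Return .reg file text that sets the value as a REG_DWORD."""
    if not 0 <= threshold <= 0xFFFFFFFF:
        raise ValueError("threshold must fit in a 32-bit DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY_PATH}]",
        # .reg files store DWORDs as 8 hexadecimal digits
        f'"{VALUE_NAME}"=dword:{threshold:08x}',
        "",
    ]
    # .reg files conventionally use Windows line endings
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file())
```

The output can be saved to a file and reviewed by hand before it is imported on the server. Generating the text rather than writing to the registry directly keeps the change easy to inspect and to roll back.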

It would be interesting to know whether 8.4 is prone to this error as well.

PS: I’m a certified Cognos Technical Specialist now, as well. Need to get first-line support partnership, you know )

  • DesiCresnet

    BI 8.2 also has this issue. The same hotfix we put in for 8.3 worked for 8.2, and it only needs to be installed on the Content Managers. Vanilla 8.4 has this issue too.
    The cause is the CAMAAA component, which makes JNI calls to LDAP. The issue is still not completely gone. The heap setting and the patch delay the crash to about 1 week instead of 1 hour at peak concurrent usage.

    Regarding reducing the setting to 768 MB: be careful. It may make the system unstable. Please turn on garbage collection logging and analyze the heap (using IBM Garbage collector). Leave it on for two to three weeks and see whether you are actually making use of the 768 MB. Accordingly, you can increase or reduce the heap. Remember, you are delaying the leak, not correcting it. Along the same lines, we have seen issues with the Webservices server (on a role-based architecture). Content Manager and Planning Webservices are serious weak links. We are surviving by just putting on band-aids.

    Congrats on your certification. I finished BI Modeler. About to finish the Planning ones.

    BTW, did you ever try decrypting those XMLBLOBS in the Planning tables?
    I had success with PeopleSoft tables, decrypting to BASE64. I am not sure about the encryption mechanisms in the Planning tables.

  • DesiCresnet

    Forgot to mention a 4th point to band-aid the situation:
    http://www-01.ibm.com/support/docview.wss?uid=swg21341959

  • Sachin J. Kulkarni

    We faced this CAM error as well. We are using Access Manager and think it may be a possible suspect for the failure. One of the reasons is that several hundred logons are generated for users in the audit tables.

    Can you tell us whether you were using Access Mgr or not?

  • ykud

    Hi Sachin.
    We weren’t using Access Manager.

  • Sachin J. Kulkarni

    I am not sure if you have auditing turned on in your environment. If you do, do you see several hundred logons for users? If you actively use it, maybe check for your own ID. For an active user who runs macros/events, we see about 2k to 3k logons in a 4 to 8 hr window! And each logon comes almost a minute after the previous one.

  • DesiCresnet

    Sorry Sachin… I was away in India on vacation. Those 2K to 3K logons are for Scheduler credentials… no harm. When you run jobs through macros, those get logged.