
Cube Scalability

Started by ComplexQry, 15 Jul 2010 03:08:37 PM


ComplexQry

I have a cube and I am concerned about how well it is going to scale.  The requirement is to hold 5 years of data, and each month has ~45 million records.  Do the math: we should be around 2.7 billion records.  I anticipate a reasonable number of categories.  For one month it was around 100,000, but this number shouldn't grow drastically outside of the time dimension categories.
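The volume arithmetic above checks out; a quick back-of-envelope sketch:

```python
# Back-of-envelope check of the 5-year volume estimate
months = 5 * 12                      # 60 monthly loads
rows_per_month = 45_000_000          # ~45 million records per month
total_rows = months * rows_per_month
print(f"{total_rows:,}")             # 2,700,000,000 -- ~2.7 billion records
```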

I guess I am looking for opinions of how this should scale once it is a full 5 years of data.  The plan is to incrementally build.

Also there is the potential to partition this cube by 13 geographic regions.  What would the downside be to doing this?

twlarsen

Simple solution...don't use cognos for that amount of records ;)

How many dimensions are you going to be using?

What's the largest cube your organization uses right now?

ComplexQry

12 dimensions.  Many of those are simple one-level dimensions.  I am really not concerned about the category counts, more so the volume of data.  I would anticipate 500k categories at most.  If not Cognos, then what?  Would TM1 be a better fit for this?

ComplexQry

Ok, the project for this effort is kicking off soon...  And I know it will push PowerPlay to its limit.  So I need some help with combating a cube that exceeds 2Gb.  I have read in some posts that this isn't really a hard limit.  But assuming I make the query as efficient as possible and still hit that 2Gb with ease, what other alternatives are there besides reducing the required 5 years of data?  Again, category counts are good; record volume is the thorn.

RobsWalker68

Hi

The 2Gb limit per .mdc is easy enough to solve: you just need to change the MultiFileCubeThreshold setting in the cogtr.xml file to enable creation of multiple local PowerCube files.
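For reference, a hypothetical cogtr.xml fragment. The element layout and the threshold value below are assumptions, not confirmed syntax; check the Transformer documentation for your version before editing the file:

```xml
<!-- Hypothetical sketch only: exact element names vary by Transformer version -->
<Preferences>
  <!-- Category count above which Transformer builds the cube as
       multiple .mdc files rather than a single file that can hit 2Gb -->
  <Preference Name="MultiFileCubeThreshold" Value="400000"/>
</Preferences>
```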

However, you are still going to have issues with the volume of data, and potentially the category limit, which used to be around 2 million; you will need to check that out further.

Personally, from experience, even if you get the cube to build with five years' worth of data, the performance from a user perspective of a disk-based engine such as Transformer is going to be woeful.  You may need to look at other products such as Cognos TM1 that could potentially handle that volume of data.

Presumably the 45 million rows per month are transactional records.  Have you looked at pre-summarising in the database before handing off to Transformer, and then using Report Studio to provide access to the detailed transactions?
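The pre-summarising idea is just rolling the transactional fact table up to the grain the cube actually needs before Transformer ever reads it. A minimal sketch using SQLite, with a hypothetical fact table name and columns (your real schema and database will differ):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical fact table at transaction grain
cur.execute("CREATE TABLE sales_fact (month TEXT, region TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [("2010-06", "EMEA", 100.0),
     ("2010-06", "EMEA", 50.0),
     ("2010-06", "APAC", 25.0)],
)

# Pre-summarise to month/region grain; Transformer reads this table instead,
# while Report Studio can still reach sales_fact for transaction detail
cur.execute("""
    CREATE TABLE sales_summary AS
    SELECT month, region, SUM(amount) AS amount, COUNT(*) AS txn_count
    FROM sales_fact
    GROUP BY month, region
""")

rows = cur.execute(
    "SELECT month, region, amount FROM sales_summary ORDER BY region"
).fetchall()
print(rows)  # [('2010-06', 'APAC', 25.0), ('2010-06', 'EMEA', 150.0)]
```

Three transaction rows collapse to two summary rows here; at 45 million rows a month the same rollup is what keeps the Transformer read phase tractable.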

You could create mini cubes for your geographical regions although you need to consider how you would present a company wide picture if need be.

Kind Regards


Rob



ComplexQry

Thanks for the advice.

My current state is that I am using time-based partitioning by year, so I have 5 cubes, one for each year.  A full year comes out to about 1.4GB and about 50k categories, so I am very satisfied with the category count.  I will need to conduct some performance testing of using this data across years.  The typical question would be "This is my number today, how was it in previous years?"  So I can see heavy analysis in the current-year cube, with occasional referring back to previous years.  I am quite a bit more optimistic today at least.  :)

nmcdermaid

It sounds like you have the cube side of it totally in hand.

Assuming you don't have lots of new categories, record volumes only add to build time.

There's a document floating around called 'Transformer in production'. Most of what it says is:

-have a dedicated cube build server (i.e. separate to the database, and most importantly have its disks separate to the database)
-have different physical disks for the various temp stages of the cube build
-partition your cubes and only build cubes with changed data (unfortunately that also means if you get a new account category you need to build all cubes)
-pre-summarise in the DB where possible