Large cubes

Started by etlmaster, 24 Oct 2006 12:55:25 PM


etlmaster

Does anyone out there - besides me - have experience with large cubes (approaching 2GB)?  If so, how did you structure the cube(s) to balance the wonderful combination of build time and cube performance?

Opher

I have five or six cubes that are larger than 2GB in total. Generally, I have left the trade-off at the default: take longer to build so that access time is reduced. I then schedule the builds to occur late Saturday night and just let it crunch away.

Good luck,
Opher
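
A weekend build like the one Opher describes can be driven by a small script fired from Windows Task Scheduler. Here is a minimal sketch in Python; the cogtr.exe executable name and its -n (build without UI) flag are assumptions, so verify the command line against your Transformer version's documentation:

```python
# Minimal sketch of a scheduled weekend build, assuming Transformer can be
# driven from the command line. "cogtr.exe" and the "-n" flag are
# assumptions -- check the docs for your Transformer version.
import subprocess
from pathlib import Path

MODEL_DIR = Path(r"C:\cognos\models")  # placeholder model directory

for model in sorted(MODEL_DIR.glob("*.mdl")):
    # Build each cube in sequence; the Saturday-night timing comes from
    # scheduling this script in Windows Task Scheduler, not from the code.
    result = subprocess.run(["cogtr.exe", "-n", str(model)])
    print(f"{model.name}: exit code {result.returncode}")
```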


Darek

There is really not much you can do to balance. Various attempts have been made at time-partitioned cubes by my business partner, Andrew Rachmiel, whose article about building 64 time periods in parallel you may find on the SupportLink magazine website.

At a large Midwest bottler of refreshing drinks, we helped design and implement a process that builds 100+ cubes nightly, ranging in size from 250MB to over 4GB. It takes us 6 hours to execute 24 builds in parallel, a mix of direct DB builds and custom flat-file extracts, to ease the load on an Oracle DB.
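
A parallel nightly run like that can be capped at 24 concurrent builds with a simple worker pool. A sketch, with placeholder model names and the same assumed cogtr.exe invocation as above, not the actual customer setup:

```python
# Sketch of capping concurrent cube builds at 24 workers, as in the
# nightly process described above. Model names and the cogtr.exe
# command line are placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed

MODELS = [f"model_{i:03d}.mdl" for i in range(100)]  # placeholder list

def build(model):
    proc = subprocess.run(["cogtr.exe", "-n", model])
    return model, proc.returncode

with ThreadPoolExecutor(max_workers=24) as pool:
    futures = [pool.submit(build, m) for m in MODELS]
    for future in as_completed(futures):
        model, rc = future.result()
        status = "ok" if rc == 0 else f"failed ({rc})"
        print(f"{model}: {status}")
```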

At another customer, we've built cubes that are 10GB+ with decent performance, which was only achievable by leveraging manual cube partitioning.

My advice to you is: throw some decent hardware at the problem, mainly fast CPUs and SAS drives behind a decent controller. And don't use blades, as they do not perform well. Use flat files, data marts, or materialized views where it makes sense.
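
The flat-file route can be as simple as dumping the fact query to a delimited file once per night, so Transformer reads the file instead of hitting Oracle. A sketch, assuming the cx_Oracle driver; credentials, DSN, and the query are placeholders:

```python
# Sketch of a flat-file extract to take load off the database: run the
# fact query once, stream it to CSV, and point Transformer's data source
# at the file. Connection details and the query are placeholders.
import csv
import cx_Oracle  # assumes the cx_Oracle driver is installed

conn = cx_Oracle.connect("user", "password", "orcl")  # placeholder DSN
cur = conn.cursor()
cur.execute("SELECT * FROM sales_fact")               # placeholder query

with open("sales_fact.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(col[0] for col in cur.description)  # column headers
    while True:
        rows = cur.fetchmany(50_000)  # stream in chunks to bound memory
        if not rows:
            break
        writer.writerows(rows)

conn.close()
```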

Skidly


We are getting a new Transformer box.
I suggested what Darek suggested.

Stand-alone box, dual-core CPU, SAS drives, decent controller, etc.

Now I am being asked to provide details as to why we can't go with:
a Blade server, dedicated drives (no SAN), dual-core.

Basically a beefed-up Blade instead of a cheaper Blade or a stand-alone server.

Any insight as to why we shouldn't go with a Blade would be extremely helpful.
I realize this could be too broad of a question.

Thanks,
Chad Hult

COGNOiSe administrator

Yes, Chad. It is a broad question, since not all blade servers are equal. I myself don't like them: in my opinion, every component that has to be stuffed into a blade, like disks, RAM, controllers, etc., will never be as fast as a dedicated equivalent.

The IT industry seems to go through waves: "let's virtualize everything!" didn't fly; "let's outsource everything!" didn't work either; now it's "let's blade everything" ... we have yet to see whether this technology has survival skills.

ComplexQry

I am not familiar with manual partitioning, but I have a cube that is nearly 2GB in size. I think our real issue, though, is the number of categories we have: we have nearly 30 dimensions. This cube basically serves as an "easy" way to get at the data in our datamart. We have tried to create a few PPX reports and they continue to bomb: nesting a dimension with 10k categories against another dimension with 1.5k categories, plus a couple more. Yes, I could create another dimension with that hierarchy, but there are countless possibilities for that... I need to make sure that I have done everything possible to optimize the cube.
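
For scale, the arithmetic on that nesting explains the bombing reports: the potential crosstab size is the product of the nested category counts. A quick check in Python, reading the post's "1.5" as 1.5k categories (an assumption):

```python
# Potential crosstab cells for the nesting described above: the product
# of the nested category counts, before any further nesting is added.
dim_a = 10_000  # categories in the first nested dimension
dim_b = 1_500   # categories in the second (assuming "1.5" means 1.5k)
print(f"{dim_a * dim_b:,} potential cells")  # 15,000,000
```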

So...

Where can I go to learn about Manual Partitioning?
What other things can be done to increase performance?

COGNOiSe administrator

Sounds like some of your users treat PP as a great source for Excel reporting ... To quote Donald Duck: "Oh, phooey!"

There is very little guidance available for manual partitioning. You have to look at which levels or categories are used the most, as well as at the current partitioning model, and make sure that your categories are evenly distributed across all partitions. Other options would be Consolidate with Sort, Enable Crosstab Caching, a larger Read Cache Size, or time-based partitioning.
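
That "evenly distributed" check can be prototyped by greedily assigning the biggest category counts to the emptiest partition and comparing the totals. A sketch with invented counts, not figures from a real model:

```python
# Sketch of balancing category counts across partitions: greedily assign
# the largest counts to the currently smallest partition. The counts are
# invented for illustration.
import heapq

category_counts = [9800, 7200, 4100, 3900, 2500, 1200, 800]
NUM_PARTITIONS = 4

heap = [(0, i) for i in range(NUM_PARTITIONS)]  # (total, partition index)
assignments = {i: [] for i in range(NUM_PARTITIONS)}

for count in sorted(category_counts, reverse=True):
    total, idx = heapq.heappop(heap)
    assignments[idx].append(count)
    heapq.heappush(heap, (total + count, idx))

for idx, counts in assignments.items():
    print(f"partition {idx}: {sum(counts):,} categories {counts}")
```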

ykud

And a simple question: Windows vs. Unix?
For example, I can get a very decent Sun server or a nice stand-alone Wintel one.
What should I prefer for Transformer processing? RISC?

I understand that without technical specs it's rather pointless, but anyway, what would you choose?

COGNOiSe administrator

My preference is Windows. Always Windows.

vivek_vdc

Can you all provide me with the server specs you use to build these large cubes? We are running Transformer on a Windows Server 2003 box with 4GB of RAM that reads data via IQDs from a database server with the same 4GB of RAM. Would you advise that we move the data from tables to flat files for faster processing? We have the read and write cache sizes in Configuration Manager set to 65K. How about you folks? Just wanted to get an idea of how others have their hardware and cache sizes set up for cube builds.