Large cubes

Started by etlmaster, 24 Oct 2006 12:55:25 PM


etlmaster

Does anyone out there - besides me - have experience with large cubes (approaching 2GB)?  If so, how did you structure the cube(s) to balance the wonderful combination of build time and cube performance?

Opher

I have five or six cubes that are larger than 2GB in total. Generally, I have left the trade-off at the default: take longer to build so that access time is reduced. I then schedule the builds to occur late Saturday night and just let it crunch away.

Good luck,
Opher
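
A weekend build like the one Opher describes can be driven by a small script fired from Windows Task Scheduler. Here is a minimal sketch in Python; the cogtr.exe executable name and its -n (build without UI) flag are assumptions, so verify the command line against your Transformer version's documentation:

```python
# Minimal sketch of a scheduled weekend build, assuming Transformer can be
# driven from the command line. "cogtr.exe" and the "-n" flag are
# assumptions -- check the docs for your Transformer version.
import subprocess
from pathlib import Path

MODEL_DIR = Path(r"C:\cognos\models")  # placeholder model directory

for model in sorted(MODEL_DIR.glob("*.mdl")):
    # Build each cube in sequence; the Saturday-night timing comes from
    # scheduling this script in Windows Task Scheduler, not from the code.
    result = subprocess.run(["cogtr.exe", "-n", str(model)])
    print(f"{model.name}: exit code {result.returncode}")
```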


Darek

There is really not much you can do to balance. Various attempts have been made at time-partitioned cubes by my business partner, Andrew Rachmiel, whose article about building 64 time periods in parallel you may find on the SupportLink magazine website.

At a large Midwest bottler of refreshing drinks, we helped design and implement a process that builds 100+ cubes nightly, ranging in size from 250MB to over 4GB. It takes us 6 hours to execute 24 builds in parallel, a mix of direct DB builds and custom flat-file extracts, to ease the load on an Oracle DB.
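
A parallel nightly run like that can be capped at 24 concurrent builds with a simple worker pool. A sketch, with placeholder model names and the same assumed cogtr.exe invocation as above, not the actual customer setup:

```python
# Sketch of capping concurrent cube builds at 24 workers, as in the
# nightly process described above. Model names and the cogtr.exe
# command line are placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed

MODELS = [f"model_{i:03d}.mdl" for i in range(100)]  # placeholder list

def build(model):
    proc = subprocess.run(["cogtr.exe", "-n", model])
    return model, proc.returncode

with ThreadPoolExecutor(max_workers=24) as pool:
    futures = [pool.submit(build, m) for m in MODELS]
    for future in as_completed(futures):
        model, rc = future.result()
        status = "ok" if rc == 0 else f"failed ({rc})"
        print(f"{model}: {status}")
```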

At another customer, we've built cubes that are 10GB+ with decent performance, which was only achievable by leveraging manual cube partitioning.

My advice to you is: throw some decent hardware at the problem, mainly fast CPUs and SAS drives behind a decent controller. And don't use blades, as they do not perform well. Use flat files, data marts, or materialized views where it makes sense.
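
The flat-file route can be as simple as dumping the fact query to a delimited file once per night, so Transformer reads the file instead of hitting Oracle. A sketch, assuming the cx_Oracle driver; credentials, DSN, and the query are placeholders:

```python
# Sketch of a flat-file extract to take load off the database: run the
# fact query once, stream it to CSV, and point Transformer's data source
# at the file. Connection details and the query are placeholders.
import csv
import cx_Oracle  # assumes the cx_Oracle driver is installed

conn = cx_Oracle.connect("user", "password", "orcl")  # placeholder DSN
cur = conn.cursor()
cur.execute("SELECT * FROM sales_fact")               # placeholder query

with open("sales_fact.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(col[0] for col in cur.description)  # column headers
    while True:
        rows = cur.fetchmany(50_000)  # stream in chunks to bound memory
        if not rows:
            break
        writer.writerows(rows)

conn.close()
```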

Skidly


We are getting a new Transformer box.
I suggested what Darek suggested.

Stand-alone box, dual-core CPU, SAS drives, decent controller, etc.

Now I am being asked to provide details as to why we can't go with:
a Blade server, dedicated drives (no SAN), dual-core.

Basically a beefed-up Blade instead of a cheaper Blade or a stand-alone server.

Any insight as to why we shouldn't go with a Blade would be extremely helpful.
I realize this could be too broad of a question.

Thanks,
Chad Hult

COGNOiSe administrator

Yes, Chad. It is a broad question, since not all blade servers are equal. I myself don't like them: in my opinion, every component that has to be stuffed into a blade, like disks, RAM, controllers, etc., will never be as fast as a dedicated equivalent.

The IT industry seems to go through waves: "let's virtualize everything!" didn't fly; "let's outsource everything!" didn't work either; now it's "let's blade everything" ... we have yet to see whether this technology has survival skills.

ComplexQry

I am not familiar with manual partitioning, but I have a cube that is nearly 2GB in size. I think our real issue, though, is the number of categories we have: we have nearly 30 dimensions. This cube basically serves as an "easy" way to get at the data in our datamart. We have tried to create a few PPX reports and they continue to bomb: nesting a dimension with 10k categories against another dimension with 1.5k categories, plus a couple more. Yes, I could create another dimension with that hierarchy, but there are countless possibilities for that... I need to make sure that I have done everything possible to optimize the cube.
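
For scale, the arithmetic on that nesting explains the bombing reports: the potential crosstab size is the product of the nested category counts. A quick check in Python, reading the post's "1.5" as 1.5k categories (an assumption):

```python
# Potential crosstab cells for the nesting described above: the product
# of the nested category counts, before any further nesting is added.
dim_a = 10_000  # categories in the first nested dimension
dim_b = 1_500   # categories in the second (assuming "1.5" means 1.5k)
print(f"{dim_a * dim_b:,} potential cells")  # 15,000,000
```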

So...

Where can I go to learn about Manual Partitioning?
What other things can be done to increase performance?

COGNOiSe administrator

Sounds like some of your users treat PP as a great source for Excel reporting ... To quote Donald Duck: "Oh, phooey!"

There is very little guidance available for manual partitioning. You have to look at which levels or categories are used the most, as well as at the current partitioning model, and make sure that your categories are evenly distributed across all partitions. Other options would be Consolidate with Sort, Enable Crosstab Caching, a larger Read Cache Size, or time-based partitioning.
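
That "evenly distributed" check can be prototyped by greedily assigning the biggest category counts to the emptiest partition and comparing the totals. A sketch with invented counts, not figures from a real model:

```python
# Sketch of balancing category counts across partitions: greedily assign
# the largest counts to the currently smallest partition. The counts are
# invented for illustration.
import heapq

category_counts = [9800, 7200, 4100, 3900, 2500, 1200, 800]
NUM_PARTITIONS = 4

heap = [(0, i) for i in range(NUM_PARTITIONS)]  # (total, partition index)
assignments = {i: [] for i in range(NUM_PARTITIONS)}

for count in sorted(category_counts, reverse=True):
    total, idx = heapq.heappop(heap)
    assignments[idx].append(count)
    heapq.heappush(heap, (total + count, idx))

for idx, counts in assignments.items():
    print(f"partition {idx}: {sum(counts):,} categories {counts}")
```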

ykud

And a simple question: Windows vs. Unix?
For example, I can get a very decent Sun server or a nice stand-alone Wintel one.
What should I prefer for Transformer processing? RISC?

I understand that without technical specs it's rather pointless, but anyway, what would you choose?

COGNOiSe administrator

My preference is Windows. Always Windows.

vivek_vdc

Can you all provide me with the server specs you use to build these large cubes? We are running Transformer on a Windows Server 2003 box with 4GB of RAM that reads data via IQDs from a database server with the same 4GB of RAM. Would you advise that we move the data from tables to flat files for faster processing? We have the read and write cache sizes in Configuration Manager set to 65K. How about you folks? Just wanted to get an idea of how others have their hardware and cache sizes set up for cube builds.