Multiple Cubes vs One Cube

Started by bloggerman, 20 Aug 2010 04:13:39 PM

bloggerman

I have had some Transformer experience, but haven't come across the creation of multiple cubes in a project. I have heard of splitting cubes and creating multiple cubes rather than having everything in one cube. Could someone help me understand the criteria, business and technical, for deciding to have multiple cubes rather than only one?

I would guess size, build time and ease of use (through the number of dimensions/measures) might be some. If so, are there some standard numbers that act as tipping points?

Also, how do we split cubes? Do we still retain only one model?


CognosPaul

The decision to work with a single or multiple cubes depends primarily on your requirements. I've got one cube on MSAS that has 15 dimensions, over 100 hierarchies and a couple dozen measures. It would have been just as easy, if not easier, to maintain everything in several cubes.

Generally a major project on powercubes will need to work with several cubes. There is a size limit to powercubes (2 gigs I think), and there are several deficiencies with regards to the actual structure of the cubes that can cause problems.

The advantage of having multiple cubes is that you're spreading the load. Reports will generally run faster and be more efficient.

The downside comes when you need to merge the cubes into a single report. You'll have to find ways of slicing all of the cubes with a common prompt. Merging data from multiple cubes can be difficult.

Whenever I work with multiple cubes I try to maintain a common model. This is not mandatory, as different cubes may have data that does not have enough in common to model it relationally.

At the start of your project it is essential to understand what kind of reports you are going to produce. Look for groups of reports with similar data. Build a cube for each group. Make a package that has all of the cubes included for those odd reports that need data from all of them.
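As a rough illustration of the "look for groups of reports with similar data" step, here is a minimal Python sketch. It is not part of Transformer or any Cognos tool; the function name, the report names and the dimension lists are all made up for the example. It simply groups reports that use exactly the same set of dimensions, so each group becomes a candidate cube.

```python
from collections import defaultdict

def group_reports_by_dimensions(reports):
    """Group report names by the exact set of dimensions each one uses.

    `reports` maps a report name to the list of dimensions it needs.
    Hypothetical helper for planning only; it does not talk to Cognos.
    """
    groups = defaultdict(list)
    for name, dims in reports.items():
        # frozenset ignores ordering, so [A, B] and [B, A] land together
        groups[frozenset(dims)].append(name)
    return dict(groups)

# Made-up inventory of planned reports
reports = {
    "Sales by Region": ["Region", "Product", "Time"],
    "Product Trend": ["Time", "Product", "Region"],
    "Customer Mix": ["Customer type", "Time"],
}

candidate_cubes = group_reports_by_dimensions(reports)
for dims, names in candidate_cubes.items():
    print(sorted(dims), "->", names)
```

In practice you would probably also merge groups whose dimension sets overlap heavily rather than insisting on an exact match, but that is a judgment call per project.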

cognostechie

Yes, size, build time and business requirements usually govern the decision to build multiple cubes, though I am not sure whether 2 GB is actually a limit of Powercubes or an operating system limit on the size of a single file. I know UNIX has a 2 GB limit for a single file and there is a BIGFILESIZE variable that overcomes that.

You can make multiple cubes from the same Transformer model, and when it builds the cubes it does so in one pass, not multiple passes. Right-click the cube inside the model and go to Properties, then Cube Groups. There you can select which level of a dimension you want to use for making multiple cubes.

Example: if you want to give the Salesreps their own cubes so that they see only their own data, you can make cubes based on Salesrep Code, which could be a level in some dimension.

peritas-chris

Are you talking about partitioning large cubes into smaller cubes?  Usually this is done based on time periods, and is handled natively by Transformer.  End users are not aware of the partitions.

Summary cubes can also be configured to drill through to more focused, detailed cubes.  This is a good practice for financial cubes that may serve both high level consolidated statement purposes AND detailed expense analysis down to transaction level detail.

Let me know if I can clear this up further for you, feel free to message me.

bloggerman

Hi

Thanks for all your responses. You guys are great.

No, I wasn't referring to partitioning of cubes, but to different physical mdc files.

Say, for example, my project has the following dimensions:
Region, Product, Time, Customer type, etc. (say 5-6 more).
Now if I build a cube with all dimensions, the cube size might be huge, impacting performance. If I have a set of reports which are based only on Region, Product and Time, then I could try to build an mdc with only these 3 dimensions and have other cubes on the side as well. Would that make sense? How would I do so? Is there a better way to handle it?

A cube group would give a cube for each category at a level in a dimension. In our example above, I could get a cube for US, but that would be different from what I am referring to above (including only a few dimensions in an mdc). Or am I missing something?

redmist

You could create multiple cubes (mdc's) from a single Transformer model.
In the model, add/create multiple Powercubes and in each of them include only those dimensions that are required. Exclude the rest of the dimensions by creating custom views.

The following is my understanding of the performance impact, and it might not be entirely accurate.
This way, I believe, the excluded dimensions will not be included in the cube, thus reducing its size. The database will be queried only once with this method, so your build time will not increase significantly.

bloggerman

Phew... I think I got it. Thanks guys.

Could you please give some examples of the deficiencies you mention in your comment, "there are several deficiencies with regards to the actual structure of the cubes that can cause problems"? This is from the first reply above by Paul.

Do we have one package that has all the cubes, or do cubes for unrelated reports go in different packages? I would think that if we have all cubes in one package then the reports would be slow. But if we have them in separate packages then we won't be able to report across cubes. Could someone please help?

CognosPaul

To be more specific with the deficiencies - I was referring to powercubes.

These are just off the top of my head.

To start, powercubes have four attributes per level: Long Name, Short Name, Description, and Category Code. Long Name and Category Code should be filled out to begin with, leaving you with two attributes to work with.

There is a maximum size of 2 gigs, which may be sufficient for small databases but prevents it from scaling up.

Powerplay has issues dealing with many-to-many relationships.


bloggerman

Do we have one package that has all the cubes, or do cubes for unrelated reports go in different packages? I would think that if we have all cubes in one package then the reports would be slow. But if we have them in separate packages then we won't be able to report across cubes. Could someone please help?

I have heard someone mention that they had a standard of one package one cube, which seems odd.

peritas-chris

Serving up multiple business related cubes via one Framework Manager deployed package should have no impact on report performance as far as I know.

MFGF

Quote from: bloggerman on 26 Aug 2010 11:14:08 AM
Do we have one package that has all cubes or cubes of unrelated reports are in different packages? I would think if we have all cubes in one package then the reports would be slow. But, then if we have in separate packages then we won't be able to report across cubes. Could someone please help.

I have heard someone mention that they had a standard of one package one cube, which seems odd.

The biggest issue with publishing multi-cube packages is that there is scope for non-technical authors to get things horribly wrong.  Cognos 8 does not support addressing multiple cubes within the same query, so authors in Query Studio and Analysis Studio would need to be careful not to bring in items from different cubes in the same report/analysis, resulting in nasty error messages.  Authors in Report Studio would need to be aware that they would need a separate query for each cube they wanted to include data from - then most likely link these queries via a master/detail relationship.  Obviously this would require more proficient authoring skills than many business users possess.
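To make the "one query, one cube" constraint concrete, here is a small Python sketch of the kind of check an author could do on paper before building a report. It is purely illustrative and not a Cognos API: the function names, the item names and the item-to-cube mapping are all hypothetical.

```python
def cubes_referenced(query_items, item_to_cube):
    """Return the set of cubes that the items in one query come from."""
    return {item_to_cube[item] for item in query_items}

def is_single_cube_query(query_items, item_to_cube):
    """True when every item in the query belongs to the same cube."""
    return len(cubes_referenced(query_items, item_to_cube)) <= 1

# Hypothetical package containing two cubes, "Sales" and "HR"
item_to_cube = {"Revenue": "Sales", "Region": "Sales", "Headcount": "HR"}

ok_query = ["Revenue", "Region"]        # all items from the Sales cube
mixed_query = ["Revenue", "Headcount"]  # mixes Sales and HR items

print(is_single_cube_query(ok_query, item_to_cube))
print(is_single_cube_query(mixed_query, item_to_cube))
```

A query like `mixed_query` is the case that would need to be split into one query per cube and then linked, for example through a master/detail relationship.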

Just my tuppence!

MF.
Meep!

bloggerman

Somehow I have come up with one more.

This question came up when I tried to connect this discussion with the one we had in another thread.

1) We could break a cube into a summary cube and a detail cube, but we could just as well split it into pieces, which is what we have been talking about here. How should I go about deciding which would suit me better? I would think both are aimed at improving performance. The Summary/Detail option would provide all the measures/dimensions through one cube. Is that the only reason for voting in favour of Summary/Detail?

2) I am at the start of the design phase in my project. I am at the point where I need to decide whether or not to split cubes. I have fact tables and dimensions (16) lined up. As everyone suggested above, I have logically grouped my reports based on what functionality they cater to and what tables they access. I can go ahead and split based on those. But then I was wondering whether I need to, if that single cube could give me acceptable performance and not breach 2 GB.

So, do you start by building a single cube and think of splitting only if it under-performs or breaches the limit, or can we guesstimate based on certain parameters?



bloggerman

Any thoughts on the last query? Thanks.

nmcdermaid

You at least need to build a cube; then issues will start to come to light. You need an actual business or technical reason to break the cube up. Otherwise don't bother. Generally performance is a non-issue if you don't have flat dimensions.

You can split cubes because they have different business purposes, i.e. there is not necessarily any point in having 'cars' and 'planets' in the same cube if there is no meaningful way to associate them.

You might have a financial cube which has some HR elements in it, and you might have an HR cube which has some financial elements in it. But you are unlikely to have a cube which has every HR dimension and measure and every financial dimension and measure. It might provide some insights, but mostly it will just confuse because various measures and dimensions won't mix.

You can have one model which has all financial and HR dimensions and measures in it. Then in the cube definition part of the model you can split this into two cubes, each with a subset of dims/measures. That is a technical possibility, but I've never found it very practical personally.

PaulM, I have to argue with some points.

Certainly lack of attributes is an issue in powercubes. But the 2G limit is easily overcome by spreading cubes over files (I believe once upon a time it was an OS limit). You overcome many-to-many relationships in Transformer simply by modelling them correctly.