Incremental cube update - Pros & Cons

Started by gicy, 20 Oct 2009 10:01:16 AM


gicy

Incremental cube update - Pros & Cons

Environment:
IBM Cognos 8.3

Data flow for the cube update process:
DWH (Data Warehouse) -> Data Mart -> Package -> Transformer (Create Cube and Package) -> Create report

As shown above, we are building a cube using a package that pulls data from the data mart.
The initial population of the fact table will be approximately 230,000 rows.

I have built a cube. The model (.mdl) size is around 14 MB and the cube (.mdc) size is around 21 MB.

The data mart fact table will be populated on a daily basis, which requires us to refresh the cube daily as well.

Consider this scenario:
Day 0: The data mart fact table has 230,000 rows. The cube is created by processing all of these records. Total cube build time is 2 hours.
Day 1: 15 additional records are inserted into the data mart fact table, so the current count is 230,015. We refresh the cube, and it processes all the records even though only 15 new ones have been added.

and so on...

With a non-incremental process, existing records are unnecessarily re-processed, which increases the cube build time.

The only way to bring over just the new data is to update/refresh the cube incrementally.

I have the following questions:

1. As per the Cognos Transformer documentation, for an incremental update to work, the data source should contain only the new data. This requires a lot of ETL work. Is there any way to handle this at the Cognos model level (Framework Manager model or Transformer model)?
2. What are the main disadvantages or difficulties of incremental cube updates?
3. Can anyone share best practices or their strategy for incremental cube updates?
4. Will I be able to use a batch file to update this cube with the incremental approach? (See the sketch below.)
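
(For question 4: a minimal sketch of the kind of nightly batch file that is often used for this, assuming the IBM Cognos 8 Transformer command-line executable cogtr.exe; the install path, model path, and the -n / -s options are assumptions to verify against the Transformer documentation for your version.)

    rem Hypothetical nightly cube build - confirm the cogtr.exe options
    rem (-n to generate the cubes, -s to run without the UI) in your docs
    "C:\Program Files\cognos\c8\bin\cogtr.exe" -n -s "D:\models\sales_model.mdl"
    rem Review the Transformer build log afterwards to confirm the build
    rem succeeded before users hit the refreshed cube

If I recall correctly, whether this run is incremental or a full build is controlled by the cube's Incremental Update setting in the Transformer model rather than by the batch file itself, so that property is worth checking too.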

Thank you in advance!!!

nirmal8186

Hi
I have done this at the Framework Manager level by putting a filter on the main query that feeds the cube, such as date = current_day. In my case I had a day_offset column in my date dimension, which holds 0 for the current day; if I want the previous day I just use -1. That is how I implemented the incremental cube.
Let me know.
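
For reference, a minimal sketch of the kind of Framework Manager filter nirmal8186 describes, applied to the query subject that feeds the Transformer data source (the namespace, dimension, and column names here are illustrative, not from his post):

    [Data Mart].[Date Dimension].[Day Offset] = 0

With 0 meaning the current day (and -1 the previous day), only the newly loaded rows reach Transformer, which is what the incremental update needs on the source side.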

cognostechie

That is correct, but considering the number of records (230,015), the cube build time for a full refresh should not be more than 5-15 minutes, depending on the number of dimensions and measures, unless the cube is designed poorly. I just built a cube last month that reads 2.4 million rows from one fact table and 2 million from another, and it refreshes in 2-3 minutes. It has 6 dimensions and 17 measures.

cognostechie

To add the pros and cons: the pro is that it saves cube build time, but the con is that the dimension values do not get updated, so every once in a while you will have to do a full rebuild. There is also a chance that users will spot the missing dimension values before you do.

Francis aka khayman

Just wondering:

1. I do a full cube refresh,
2. but I filter my fact table data source to take only new records (date = today).

The previous day's data is still in the cube and today's data will be added, correct?
Is this similar to an incremental load as well?