COGNOiSe.com - The IBM Cognos Community

Title: opinion: techniques for optimizing performance of large data sets
Post by: patakya on 04 May 2010 12:07:22 PM
hi,

Looking for opinions on this subject.

My issue:
Our data volume has been growing year over year, and report performance is only going to get worse.

QUESTION:
What techniques would people recommend for database design (and then FM design) to maximize performance in Cognos 8 BI against relational data sets?
Note: this is IN ADDITION to any indexing or SQL hints...


My current thoughts:
1. Create separate tables for historical and "current" reporting. This would keep performance on current data (say, the last 24 months) stable.
However, how would you handle dimension maintenance then? Would you split only the fact tables into current vs. history tables, or the dimensions as well? (A rough sketch follows after option 2.)

2. Create two databases - one with current data, one with "older" data. Then copy your FM model in its entirety and simply point the second copy at the "history" database.
This would simplify FM maintenance to some degree. However, how would you do time trending against current + history then?
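
To make option #1 concrete, here is a rough sketch in plain SQL (all table and column names hypothetical): split only the fact rows into current and history tables, keep the dimensions shared, and use a UNION ALL view as one common way to still trend across both stores.

-- split only the fact rows; dimension tables stay shared and untouched
CREATE TABLE sales_fact_current AS
SELECT * FROM sales_fact WHERE date_key >= 20080501;   -- e.g. last 24 months

CREATE TABLE sales_fact_history AS
SELECT * FROM sales_fact WHERE date_key < 20080501;

-- for the occasional long time-trend report, union the two stores
CREATE VIEW sales_fact_all AS
SELECT * FROM sales_fact_current
UNION ALL
SELECT * FROM sales_fact_history;

In FM you could then import sales_fact_current for everyday reporting and sales_fact_all only for a trending package, so most queries never touch the history table.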

Title: Re: opinion: techniques for optimizing performance of large data sets
Post by: scull on 04 May 2010 03:11:52 PM
Hi,

If you can afford to lose detail (granularity), you could create aggregate tables. For example, summing the daily facts up to month totals already gives you roughly 30x fewer rows.

Afterwards, this new table can be added alongside the current detail fact table in FM and joined to the other tables.
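
A minimal sketch of the idea, assuming a hypothetical daily sales_fact with date_key, product_key and store_key columns - the dimension keys are kept, so the aggregate still joins to the existing dimension tables:

-- roll the day grain up to month (roughly 30x fewer rows)
CREATE TABLE sales_fact_month AS
SELECT d.month_key,
       f.product_key,
       f.store_key,
       SUM(f.sales_amount) AS sales_amount,
       SUM(f.quantity)     AS quantity
FROM   sales_fact f
JOIN   date_dim d ON d.date_key = f.date_key
GROUP  BY d.month_key, f.product_key, f.store_key;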

//M.
numberstoinfo.blogspot.com (http://numberstoinfo.blogspot.com)
Title: Re: opinion: techniques for optimizing performance of large data sets
Post by: patakya on 05 May 2010 12:04:05 PM
hi,

thanks Scull. That aligns with my option #1 (create separate tables). Still, a follow-up for you (or anyone):

In a dimensional/star-schema-based DW, which aggregate tables are you creating? I assume you mean the "fact" tables. If so, do you leave the full detail in the dimension tables? (That is, instead of creating new dimension tables holding only a subset of dimensional data for your older data, you keep the full dimensions and aggregate only the facts?)
Title: Re: opinion: techniques for optimizing performance of large data sets
Post by: blom0344 on 05 May 2010 12:12:17 PM
If you can distinguish between 'new' and 'old' data, then partitioning may be the best approach. With 10 years of data, partitioning on year means a query for a given year only has to access 10% of the data. This is entirely an RDBMS feature, nothing to do with Cognos.
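
In Oracle syntax, for example (table and column names hypothetical), range partitioning on a YYYYMMDD date key looks like this:

CREATE TABLE sales_fact (
  date_key     NUMBER(8)    NOT NULL,   -- YYYYMMDD
  product_key  NUMBER       NOT NULL,
  sales_amount NUMBER(12,2)
)
PARTITION BY RANGE (date_key) (
  PARTITION p2008 VALUES LESS THAN (20090101),
  PARTITION p2009 VALUES LESS THAN (20100101),
  PARTITION p2010 VALUES LESS THAN (20110101)
);

A query filtered on date_key then scans only the matching partitions (partition pruning), completely transparently to Cognos.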
Title: Re: opinion: techniques for optimizing performance of large data sets
Post by: patakya on 05 May 2010 01:48:12 PM
hi

thanks blom0344. Yes, I was waiting for the "P" word. I agree it should be part of the solution. While it won't help with reporting 10-year time trends, it will help where you report on specific older time slices.

cheers
Title: Re: opinion: techniques for optimizing performance of large data sets
Post by: MFGF on 06 May 2010 03:17:03 AM
Quote from: patakya on 05 May 2010 12:04:05 PM
In a dimensional/star-schema-based DW, which aggregate tables are you creating? I assume you mean the "fact" tables. If so, do you leave the full detail in the dimension tables? (That is, instead of creating new dimension tables holding only a subset of dimensional data for your older data, you keep the full dimensions and aggregate only the facts?)

Normal practice would be to create aggregated fact tables (or materialized views/indexed views depending on the database, rather than physical aggregate tables).  One of the key deciding factors will be whether the database you are using has a query optimizer which is aggregate-aware, and will re-write queries to point them to your aggregate tables/materialized views where this would give better performance.  Typically the dimension tables are nowhere near as large as the fact tables, and do not usually require modification (so long as they already contain the necessary keys required to join to the aggregate values in the fact tables).
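
In Oracle, for instance, that could look roughly like this (names hypothetical) - the ENABLE QUERY REWRITE clause is what lets an aggregate-aware optimizer silently redirect eligible detail-level queries to the rollup:

CREATE MATERIALIZED VIEW sales_fact_month_mv
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND
ENABLE QUERY REWRITE
AS
SELECT d.month_key,
       f.product_key,
       SUM(f.sales_amount) AS sales_amount,
       COUNT(*)            AS row_count   -- helps the optimizer match rewrites
FROM   sales_fact f, date_dim d
WHERE  d.date_key = f.date_key
GROUP  BY d.month_key, f.product_key;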

Regards,

MF.
Title: Re: opinion: techniques for optimizing performance of large data sets
Post by: patakya on 07 May 2010 02:15:33 PM
Quote from: MFGF on 06 May 2010 03:17:03 AM
Normal practice would be to create aggregated fact tables (or materialized views/indexed views depending on the database, rather than physical aggregate tables).  One of the key deciding factors will be whether the database you are using has a query optimizer which is aggregate-aware, and will re-write queries to point them to your aggregate tables/materialized views where this would give better performance.  Typically the dimension tables are nowhere near as large as the fact tables, and do not usually require modification (so long as they already contain the necessary keys required to join to the aggregate values in the fact tables).

Regards,

MF.

hi
We're on Oracle 9i (soon to be 10g). Not sure how well the Oracle query optimizer will work with a Cognos front end. However, I did read up on Oracle's optimizer, and 9i has a feature they call "Query Rewrite with Materialized Views" which looks analogous. I will try that. thx.
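
For anyone following along, a quick sketch of the knobs involved (assuming a materialized view created with ENABLE QUERY REWRITE, and hypothetical table names):

-- query rewrite only fires when this is on (session or system level)
ALTER SESSION SET query_rewrite_enabled = TRUE;

-- a Cognos-generated query against the detail fact table...
SELECT d.month_key, SUM(f.sales_amount)
FROM   sales_fact f, date_dim d
WHERE  d.date_key = f.date_key
GROUP  BY d.month_key;
-- ...can then be answered from the month-level materialized view;
-- EXPLAIN PLAN shows the MV instead of SALES_FACT when the rewrite fired.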