
Poor performance on crosstab report

Started by sascha, 02 Nov 2010 09:37:05 AM


sascha

Hi all

I am working with a DMR model on BI 8.4 and just set up a new crosstab report. The underlying data source (SQL 2005) is extremely small; it only has a 200-row fact table, which will grow considerably once production data is loaded. The fact table as well as the dimension tables are all indexed and properly joined via identifiers in FM.

When running the report as a grouped list, it takes less than a second to show up. When I take the generated SQL and run it against the DB, it also takes less than a second. But when using the same query within a crosstab, generating the report takes more than 5 minutes.

I played around with aggregation options but cannot figure out why the crosstab takes so long to run. I have two other packages showing these long run times with crosstabs.

Does anyone have some ideas?

Thanks for your help,
Sascha

rockytopmark

Start in the model.

1). Check your levels in the Regular Dimensions.  If a level's business key is unique by itself, then set the Unique Level check-box.  This could help simplify the generated SQL.
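Something like this, purely as an illustration (hypothetical tables and columns, not the exact SQL Cognos will generate): when the level key is not flagged unique, the ancestor key gets carried through the select and group by to disambiguate members; with Unique Level set, the business key can stand alone.

-- Level key NOT flagged unique: the parent key travels along (illustrative only).
select "Dim"."parent_key", "Dim"."level_key", sum("Fact"."measure") AS "Total"
from "dim_table" "Dim", "fact_table" "Fact"
where "Fact"."level_key" = "Dim"."level_key"
group by "Dim"."parent_key", "Dim"."level_key"

-- Unique Level set: the business key alone identifies the member.
select "Dim"."level_key", sum("Fact"."measure") AS "Total"
from "dim_table" "Dim", "fact_table" "Fact"
where "Fact"."level_key" = "Dim"."level_key"
group by "Dim"."level_key"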

2). There is a Wide Member Tree setting on the hierarchy object, set to False by default, which you can change to True if you have many members in the dimension.  I have not been able to quantify the magic number ("many") of members at which changing this to True starts to pay off, but it is worth some experimenting on your end.

Then head to the report. 

3). Write a test report with both the fast-rendering list and the slow-rendering crosstab.  Using the Tools menu's Show Generated SQL option, examine the two queries produced.  Hopefully this will give some insight.  Paste the two SQLs here if you'd like us to give our $0.02.

M

sascha

Hi rockytopmark

All the regular dimensions already have the unique flag checked where applicable.
I played around with the Wide Member Tree setting, which was new to me, but it did not change the crosstab's performance at all.

The generated SQL for the crosstab looks quite straightforward:

select distinct "Maschine"."Neu_Bestand_Code" AS "Neu_Bestandkey", "Maschine"."Neu_Bestand" AS "Neu_Bestand0", "Maschine"."Maschine_ID" AS "Maschine0key", "Maschine"."Maschine" AS "Maschine1"
from (
select "dms_maschinen"."maschine" AS "Maschine", case "dms_maschinen"."neu_bestand" when 'n' then 'Neue Maschine' when 'b' then 'Bestandsmaschine' end  AS "Neu_Bestand", "dms_maschinen"."maschine_id" AS "Maschine_ID", "dms_maschinen"."neu_bestand" AS "Neu_Bestand_Code"
from "planning_factory_dev"."analyse"."dms_maschinen" "dms_maschinen") "Maschine"

select distinct "Werkzeug"."Neu_Bestand_Code" AS "Neu_Bestand1key", "Werkzeug"."Neu_Bestand" AS "Neu_Bestand2", "Werkzeug"."Werkzeug_Code" AS "Werkzeug0key", "Werkzeug"."Werkzeug_Code" AS "Werkzeug_Bezeichnung"
from (
select case "dms_werkzeuge"."neu_bestand" when 'n' then 'Neuwerkzeug' when 'b' then 'Bestandswerkzeug' end  AS "Neu_Bestand", "dms_werkzeuge"."pk_werkzeug" AS "Werkzeug_Code", "dms_werkzeuge"."neu_bestand" AS "Neu_Bestand_Code"
from "planning_factory_dev"."analyse"."dms_werkzeuge" "dms_werkzeuge") "Werkzeug"

select "Werkzeug"."neu_bestand" AS "Neu_Bestand1key", "Werkzeug"."pk_werkzeug" AS "Werkzeug0key", "Maschine"."neu_bestand" AS "Neu_Bestandkey", "Maschine"."maschine_id" AS "Maschine0key", "Produktionsgesellschaft"."idl_id" AS "Tochtergesellschaftkey", "Produktionsgesellschaft"."bezeichnung" AS "Bezeichnung", "Produktionswerk"."werk_id" AS "Produktionswerkkey", "Produktionswerk"."werk_name_de" AS "Werk0", sum("Kennzahlen_Profile_Extrusion"."gutproduktion_netto_m") AS "Gutproduktion_Netto__m_", min(case "Werkzeug"."neu_bestand" when 'n' then 'Neuwerkzeug' when 'b' then 'Bestandswerkzeug' end ) AS "rc", min(case "Maschine"."neu_bestand" when 'n' then 'Neue Maschine' when 'b' then 'Bestandsmaschine' end ) AS "rc11"
from "planning_factory_dev"."analyse"."dms_werkzeuge" "Werkzeug", "planning_factory_dev"."analyse"."dms_maschinen" "Maschine", "planning_factory_dev"."analyse"."dms_tg" "Produktionsgesellschaft", "planning_factory_dev"."analyse"."dms_werke" "Produktionswerk", "planning_factory_dev"."analyse"."dmf_prp1" "Kennzahlen_Profile_Extrusion"
where "Produktionswerk"."prod_jn" = 1 and "Werkzeug"."pk_werkzeug" = "Kennzahlen_Profile_Extrusion"."fk_werkzeug" and "Produktionswerk"."werk_id" = "Kennzahlen_Profile_Extrusion"."werk" and "Produktionsgesellschaft"."idl_id" = "Produktionswerk"."tg_id" and "Kennzahlen_Profile_Extrusion"."maschine_id" = "Maschine"."maschine_id"
group by "Werkzeug"."neu_bestand", "Werkzeug"."pk_werkzeug", "Maschine"."neu_bestand", "Maschine"."maschine_id", "Produktionsgesellschaft"."idl_id", "Produktionsgesellschaft"."bezeichnung", "Produktionswerk"."werk_id", "Produktionswerk"."werk_name_de"

rockytopmark

The generated SQL you posted contains 3 separate SQL statements (not desired), indicating there may be some unnecessary local processing being performed at runtime to stitch the data together.  Please post your list SQL...

I am guessing you have a fact joined to a fact in your model, or perhaps a required join that is missing.

sascha

I was also wondering about the three separate queries but don't have an explanation for it.

In the FM model there is no fact-to-fact relationship, as the whole model contains only one fact table. As far as I can see, I have also built relationships to all the dimension tables.

The list SQL looks like this:
select distinct "Werkzeug"."Neu_Bestand_Code" AS "Neu_Bestandkey", "Werkzeug"."Neu_Bestand" AS "Neu_Bestand0", "Werkzeug"."Werkzeug_Code" AS "Werkzeug0key", "Werkzeug"."Werkzeug_Code" AS "Werkzeug_Bezeichnung"
from (
select case "dms_werkzeuge"."neu_bestand" when 'n' then 'Neuwerkzeug' when 'b' then 'Bestandswerkzeug' end  AS "Neu_Bestand", "dms_werkzeuge"."pk_werkzeug" AS "Werkzeug_Code", "dms_werkzeuge"."neu_bestand" AS "Neu_Bestand_Code"
from "planning_factory_dev"."analyse"."dms_werkzeuge" "dms_werkzeuge") "Werkzeug"

select "Werkzeug"."neu_bestand" AS "Neu_Bestandkey", "Werkzeug"."pk_werkzeug" AS "Werkzeug0key", "Produktionsgesellschaft"."idl_id" AS "Tochtergesellschaftkey", "Produktionsgesellschaft"."bezeichnung" AS "Bezeichnung", "Produktionswerk"."werk_id" AS "Produktionswerkkey", "Produktionswerk"."werk_name_de" AS "Werk0", sum("Kennzahlen_Profile_Extrusion"."gutproduktion_netto_m") AS "Gutproduktion_Netto__m_", min(case "Werkzeug"."neu_bestand" when 'n' then 'Neuwerkzeug' when 'b' then 'Bestandswerkzeug' end ) AS "rc"
from "planning_factory_dev"."analyse"."dms_werkzeuge" "Werkzeug", "planning_factory_dev"."analyse"."dms_tg" "Produktionsgesellschaft", "planning_factory_dev"."analyse"."dms_werke" "Produktionswerk", "planning_factory_dev"."analyse"."dmf_prp1" "Kennzahlen_Profile_Extrusion"
where "Produktionswerk"."prod_jn" = 1 and "Werkzeug"."pk_werkzeug" = "Kennzahlen_Profile_Extrusion"."fk_werkzeug" and "Produktionswerk"."werk_id" = "Kennzahlen_Profile_Extrusion"."werk" and "Produktionsgesellschaft"."idl_id" = "Produktionswerk"."tg_id"
group by "Werkzeug"."neu_bestand", "Werkzeug"."pk_werkzeug", "Produktionsgesellschaft"."idl_id", "Produktionsgesellschaft"."bezeichnung", "Produktionswerk"."werk_id", "Produktionswerk"."werk_name_de"


I am also wondering about the crosstab a bit more... in the example above I have nested two hierarchies within the rows and another one in the columns. If I remove one of the row hierarchies, the SQL is still separated, but at least the report does run (even if it takes a few seconds). I really don't see any reason why two nested dims take so long.

Do you have any more ideas?

rockytopmark

The multiple selects bother me, frankly... Can you confirm your model's data source is set to allow Local Processing, as is the report?  If so, I would suggest changing these back to Database Only and seeing if your report functions as desired.  If not, you may get a good clue as to what the problem might be.  It looks to me like this report SHOULD work in database-only mode.  My rule of thumb... I only switch on the Local Processing option as a... no, THE last resort.

Also, I did notice that the crosstab SQL (bottom of the 3) has MIN()s on a couple of CASE statements.  Make sure to set those aggregations to None... it might help a little.
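To illustrate with the expression from your crosstab SQL (assuming the MIN() really is coming from the item's aggregate property):

-- Aggregation left at its default: the label is wrapped in MIN().
min(case "Werkzeug"."neu_bestand" when 'n' then 'Neuwerkzeug' when 'b' then 'Bestandswerkzeug' end) AS "rc"

-- Aggregate function set to None: a plain expression, which then has to
-- appear in the group by instead.
case "Werkzeug"."neu_bestand" when 'n' then 'Neuwerkzeug' when 'b' then 'Bestandswerkzeug' end AS "rc"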

CognosPaul

I noticed that one of the subqueries contains a CASE. Did you put that case in the physical layer or the logical layer? (Or data layer and business layer... I can never remember the correct terminology.)

If you have a calculation in the physical layer, it'll force Cognos to treat that as a view, thus giving you the select t1.whatever from (select whatever, whatever2, whatever3 from table) t1 pattern.

Also, any joins in the logical layer (or business layer) may create the behaviour you're seeing.
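As a sketch of what I mean (hypothetical names; whether Cognos flattens it exactly like this depends on the model):

-- Calculation defined in the physical (data) layer: Cognos wraps the base
-- table in a derived table, like the "Maschine" subquery in your posted SQL.
select t1."product", t1."status_label"
from (
    select "product",
           case "status" when 'n' then 'New' when 'b' then 'Existing' end AS "status_label"
    from "some_table"
) t1

-- Same calculation defined a layer up: no derived table is needed.
select "product",
       case "status" when 'n' then 'New' when 'b' then 'Existing' end AS "status_label"
from "some_table"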

sascha

I got it!  :)

The physical data source was set to local processing. When I set it to Database Only, I always got an error when running any report.

In the end I found the problem in the 'Member Sort' options of the regular dimension. Both 'Metadata' and 'Data' were checked. After unchecking 'Data', query processing on 'Database Only' did work. With this setting, the Report Studio generated SQL contained only a single statement and the processing time of the crosstab was fast again, but I wasn't able to browse the members in the tree view (error).

Now I have set the 'Member Sort' to 'Metadata' and the query processing to 'Limited Local'. This lets you browse the member tree in Report Studio while still having one fast query for the data processing.

I guess that the 'Data' sort order can only be sent directly to the underlying database when dealing with a real OLAP source. In DMR this has to be done on the report server, which results in the long run times.
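Purely to illustrate my guess (I didn't capture the single-statement SQL, so this is not verified): a sort the database can handle itself would simply ride along as an ORDER BY in the one pushed-down statement, whereas the data-driven member sort was apparently resolved by the DMR engine on the report server instead.

-- Illustrative only: a pushable sort on the level key.
select "Maschine"."maschine_id", sum("Fakt"."gutproduktion_netto_m") AS "Netto_m"
from "planning_factory_dev"."analyse"."dms_maschinen" "Maschine",
     "planning_factory_dev"."analyse"."dmf_prp1" "Fakt"
where "Fakt"."maschine_id" = "Maschine"."maschine_id"
group by "Maschine"."maschine_id"
order by "Maschine"."maschine_id"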

Thanks for your help!
Sascha

blom0344

Hi Sascha,

Great that you solved the problem. I am very interested in your experiences regarding DMR, since we are developing multiple complex models to replace PowerCubes. What is the ultimate size of your fact tables? We have seen some dramatic performance increases (a 40-fold decrease in logical reads; a 20-fold decrease in CPU time) by adding additional indices on fact tables. We also had to reorganize dimensional objects to allow for multi-fact analysis. These are matters we never encountered with PowerCubes.
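As an illustration of the kind of index I mean (borrowing the fact table from your posted SQL; the right columns and syntax of course depend on your queries and DBMS):

-- Composite index on the dimension foreign keys used in the joins;
-- INCLUDE (SQL Server 2005 syntax) lets the measure be read from the
-- index without touching the base table.
create index ix_dmf_prp1_dims
    on analyse.dmf_prp1 (fk_werkzeug, werk, maschine_id)
    include (gutproduktion_netto_m);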

We have performed stress tests with a DMR model against a single fact table containing just under a million records. Performance was noticeably slower than with the same data from a PowerCube, but still acceptable for analysis.

sascha

Hi blom0344,

I am sorry that I cannot answer your question about the ultimate size, as I don't have much experience with huge fact tables.

For sure, indexing will increase performance. I assume that a million records should still perform relatively well, depending on how the model is organized. The key thing I got out of this problem is that when using DMR on relational tables, the query needs to be pushed down to the DB; performance should then be about the same as (or a little bit worse than) running directly on the DB.

You might also check for functionality on the DB side that can temporarily store/cache query results for better performance, such as DB2 Cubing Services.
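For example, on DB2 a materialized query table can pre-aggregate the fact data (illustrative only; names are borrowed from my model, and the grouping would depend on your reports):

-- A summary table the optimizer can reroute matching queries to.
create table analyse.mqt_prod_summary as (
    select werk, maschine_id, sum(gutproduktion_netto_m) as netto_m
    from analyse.dmf_prp1
    group by werk, maschine_id
)
data initially deferred refresh deferred;

-- Populate / refresh it when the fact data changes:
refresh table analyse.mqt_prod_summary;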

But the question about fact table sizes is still interesting. Are there any other users who want to share their performance experiences?

Cheers,
Sascha

blom0344

But surely you must have some idea about the size of the ultimate production database? Running tests / developing against a 200-row fact table is not, IMHO, a very good approach. It is vital to perform some sort of performance test before you finish your development. Otherwise you may end up with an otherwise perfectly developed model that does not perform well enough for production.

sascha

Don't worry. The model is set up on the publish output of a Cognos Planning cube, where the size is limited to the number of cells in the planning model. ;) The overall size will definitely fit the requirements...