If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Dimension Nulls and Non Aggregating facts

Started by Lee Drake, 18 May 2009 11:21:50 AM

Previous topic - Next topic

Lee Drake

I am relatively inexperienced in Framework Modeler.  I am trying to create a model and have encountered something different.  The database is Oracle and is a star schema.

I created a report to test my model.  In the report, I am selecting data from 2 dimension tables and 2 fact tables.  For illustration purposes I will call the dimensions Customer and Product and the facts Sales and Expenses.  To most of my tests it worked fine.  Under one scenario, I get more rows than I expect.  Product has fields that can contain nulls; example Group Code.  Expenses has a field that is not aggregatable; Line Number.  If I put Group Code and Line Number in the same report, I get additional rows that are mostly nulls.  If I put Group Code in a report, it works fine.  If I put Line Number in a report, it works fine.  Only when they are together in a report is there any problem.

I do not think I can ask to have all the null columns filled in with placeholder values.  I also have several fact tables with non-aggregateable columns (sales index, unit price on day of sale, ROI), so creating new dimension tables would be difficult. (Our schedule is very tight.)

Am I missing something obvious?

blom0344

You have 2 facts, so by nature 2 starschema's. The whole balgame is whether the model is rightly build to support multifact queries.
My first tip would be to check the SQL generated. My 2 cents on seeing a full outer join where you get more rows than expected. Actually it is either that or a cross-join.

Second tip is to make sure that these are conformed dimensions (i.e. fact1-dim-fact2) and that Cognos produces the right stitch

Lee Drake

Correct, I have 2 fact tables, and have them in individual star schemas.  I believe I have everything correctly designed, but I get extra rows in certain instances.  So I must be missing something. 

I pulled the SQL out of the test report and copied it into TOAD.  As I expected, it selects 2 sets of data and then selects the required fields from the results.  I ran each of the parts seperately and got the correct number of rows for each part.  However the whole SQL brings back extra rows.  I beleive that it involves having a non-aggegate fact from one query and null values from the dimension that is common to both queries.

blom0344


Lee Drake

I simplifed the report as much as possible and changed all the names to protect the innocent (me).  Any SQL errors are a result of my clean up.  D2 and D3 both return 2 rows, which is correct.  The total query returns 3 row, which is incorrect.

select (coalesce("D2"."Material_Number", "D3"."Material_Number")) "Material_Number", "D3"."Expenditure_ID" "Expenditure_ID", "D2"."QUANTITY" "QUANTITY", (coalesce("D2"."PRODUCT_ATTRIBUTE", "D3"."PRODUCT_ATTRIBUTE")) "PRODUCT_ATTRIBUTE"
from
(
select distinct "D_PRODUCT_DIM_V"."MATERIAL_C" "Material_Number", "EXPENSE_V"."EXPENDITURE_ID" "Expenditure_ID", "D_PRODUCT_DIM_V"."PRODUCT_ATTRIBUTE" "PRODUCT_ATTRIBUTE"
from "D_PRODUCT_DIM_V" "D_PRODUCT_DIM_V","EXPENSE_V" "EXPENSE_V","EXPENSE_V_DIM" "EXPENSE_V_DIM","D_TIME_DIM_V" "D_TIME_DIM_V","D_CUSTOMER_L5_DIM_V" "D_CUSTOMER_L5_DIM_V"
where "EXPENSE_V_DIM"."ACTIVITY_ID"='T9001902' and "D_TIME_DIM_V"."YEAR_N"=2009 and "D_PRODUCT_DIM_V"."PRODUCT_LEVEL_ID"="EXPENSE_V"."PRODUCT_LEVEL_ID" and "D_CUSTOMER_L5_DIM_V"."CUSTOMER_ID"="EXPENSE_V"."L5_CUSTOMER_ID" and "EXPENSE_V_DIM"."EVENT_ID"="EXPENSE_V"."EVENT_ID" and "D_TIME_DIM_V"."TIME_ID"="EXPENSE_V_DIM"."FISCAL_YEAR_C" )
"D3"
FULL OUTER JOIN
(
select "D_PRODUCT_DIM_V"."MATERIAL_C" "Material_Number", "D_PRODUCT_DIM_V"."PRODUCT_ATTRIBUTE" "PRODUCT_ATTRIBUTE", sum("SALES"."QUANTITY") "QUANTITY"
from "D_PRODUCT_DIM_V" "D_PRODUCT_DIM_V","SALES" "SALES","EXPENSE_V_DIM" "EXPENSE_V_DIM","D_TIME_DIM_V" "D_TIME_DIM_V","D_CUSTOMER_L5_DIM_V" "D_CUSTOMER_L5_DIM_V"
where "EXPENSE_V_DIM"."ACTIVITY_ID"='T9001902' and "EXPENSE_V_DIM"."EVENT_STATUS_C" in ('CO', 'CL', 'PL') and "D_TIME_DIM_V"."YEAR_N"=2009 and "D_PRODUCT_DIM_V"."PRODUCT_LEVEL_ID"="SALES"."PRODUCT_LEVEL_ID" and "D_CUSTOMER_L5_DIM_V"."CUSTOMER_ID"="SALES"."L5_CUSTOMER_ID" and "EXPENSE_V_DIM"."EVENT_ID"="SALES"."EVENT_ID" and "D_TIME_DIM_V"."TIME_ID"="EXPENSE_V_DIM"."FISCAL_YEAR_C"
group by "D_PRODUCT_DIM_V"."PRODUCT_ATTRIBUTE", "D_PRODUCT_DIM_V"."MATERIAL_C")
"D2"
on "D3"."Material_Number"="D2"."Material_Number" and "D3"."PRODUCT_ATTRIBUTE"="D2"."PRODUCT_ATTRIBUTE"

Lee Drake

This query works fine if I use a measure other than Expenditure_ID or if PRODUCT_ATTRIBUTE is not null.

kasgang

Hello

Did you try out with using determinants ?


Lee Drake

I am trying to use deterimants.  I am not sure I am doing it correctly.  In the import layer, I have all the query subjects set up to use the primary key fields as a unique determinate.