If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Determinants on Fact Tables

Started by wyconian, 14 Nov 2010 02:48:53 AM

Previous topic - Next topic

wyconian

Hey everyone.

Hope someone out there in Cognoise land can help me with this. 

I'm having a running 'heated discussion' with one of my technical assurance guys about adding determinants to facts and bridges. 

At the moment we only have single layer hierarchies in the dim tables and all the fact tables join at the lowest grain.  So, I don't really see any need to add determinants to the dimension tables but have given up on the 'discussion' as I don't think it will do any harm.

My TA guy is insisting that we add determinants to the fact tables as well.  He only wants unique determinants (not grouped by) but it doesn't make any sense to me add determinants to a fact and I think it would be detrimental to the SQL performance.

He is now asking for documentation from Cognos specifying that determinants shouldn't be added to facts.  I've checked the Cognos website and found the determinant stuff, it talks about using determinants to specify aggregation against multi-level hierarchies and query subjects that join different grains but doesn't actually specify usage on dimensions, facts or bridges.  He's a bit of an obstinate git and is probably not going to accept this.

So (finally) the question is am I right in this?

1) determinants shouldn't be added to dimensions unless there is a need to join facts at different grains
2) determinants should not be added to facts, unless you've got fact to fact joins and need to roll one fact against different levels in the other fact (obviously fact to fact joins are generally a bad idea and should be avoided)
3) determinants on bridge tables don't make any sense as the bridge doesn't join to a fact table

Does anyone know of any 'official' documentation that supports this?  Does anyone have an argument for adding determinants to dimensions as a default even when they are just flat hierarchies and the facts always join at the lowest grain?

I've searched the forums, although there is a lot of good stuff about determinants I can't find anything specifically related to different table types.

As always any help appreciated.

Wyconian

blom0344

For sure I am no determinant expert, but in my experience there is no use for determinants on facts if the grain is the same for all facts. To backup this opinion I took one of my multifact DMR models and checked on generated SQL in the following cases:

(1) Determinant (unique) on dimensional object only
(2) Determinants (unique) on all objects involved (basically using set of FK's for the fact)
(3) No determinants whatsoever

In my case the SQL generated is the same for all 3 scenario's

Cognos detects 2 facts converge on the same shared dimension through cardinalities set (not through determinants), splits into 2 sets for aggregation and recombines through coalesce/full outer join over the query items involved (those with no aggregate set)

Adding more shared dimensions does not change the way SQL is generated.
Fact -to- fact is altogether a different matter, without determinants I would expect to get improper results in any case. As for official documentation that says 'the word' ? No idea.. I hope other developers may join this discussion , so we may perhaps generate the ultimate 'determinant' discussion.

Just a last word from the official Guide by Dan Volitich:

Quote
The main purpose of determinants in a query subject is to specify the granularity at which other related query subject might be joined

wyconian

Thanks Blom

I've quoted Dan and the Cognos Training manual and documentation from the cognos website and details from a cognos training partner. 

My TA came back with 'but it doesn't say not to use them on facts'.  Guess I'll just keep picking away at this one.

By the way you don't know of any English speaking cognos developer jobs going in Netherlands or Belgium do you?  I'm starting to feel like a change from London for a while.

blom0344

#3
I really doubt whether you will find an explicite answer to your question. There are just too many permutations possible in terms of models and type of models..

I think your TA should reconsider. One ADDs determinants to a model (for a certain reason), they are not there from the start. In the latter case he would have a point. Why remove them when no explicit rule exists for removal in certain cases.. 

Show your TA the SQL generated, cause by and large the framework is mostly about generating SQL. If no other SQL is generated, then adding is a sure violation of Occam's razor  ;)

Want to move to the Netherlands? Perhaps my company could use fresh Cognos input   ;D

Lynn

My understanding and experience with this is in line with both of you guys. I wonder why he feels so strongly about keeping them on the facts? It seems you've provided information on where and when they are needed, but has he provided any explanation or reasons why he feels they ought to be there?

They may be imported by default on the data source query subjects, but so is the data type of query items and that does not always result in correct usage property settings. That is why we review defaults and adjust things like a numeric column that is assumed to be a fact when it really ought to be treated as an attribute. A silly example, I realize, but just trying to make a point about default behavior.

Ultimately Blom is wise (as usual) about evaluating the generated SQL with and without them. If there is no difference for various scenarios then checking if there is a material difference in performance one way or the other would be something to look at. Determinants would all be evaluated, as the snippet from the FM guide below states, but that may or may not translate to anything worth writing home about. Simplicity is indeed desirable.

QuoteDeterminants are processed in the order in which they are specified in the model. You can change
the order of the determinants. If a query subject has more than one determinant, the first one that
covers all the requested items is used. Determinants are evaluated in the context of each required
join as well as the context of requested items.

Data source query subjects are imported with determinants defined for them. These default
determinants are generated based on keys and indexes in the data source.

blom0344

Thanks Lynn for adding the quotation. I overlooked the automatic generation of determinants when importing data source objects. This is indeed true for primary keys (and indexes ?) , but this still leaves us with true facts that have only sets of FK's. As stated earlier it is still very much matters how your model looks like and where need to model against..

MFGF

#6
Very valuable insight and advice Lynn and Ties!  The only (probably useless) piece I can add is that determinants are used during runtime query generation to drive the grouping level(s).  Given that your measures are not grouped but are aggregated in line with the grouping of the query, you should in theory not need determinants on measures or their FKs.

Regards,

MF.
Meep!

cognostechie

#7
Quote from: wyconian on 14 Nov 2010 02:48:53 AM
Does anyone know of any 'official' documentation that supports this?  Does anyone have an argument for adding determinants to dimensions as a default even when they are just flat hierarchies and the facts always join at the lowest grain?
Wyconian

I attended the FM class from Cognos few years ago so I still have the book that they give out in the class. As far as I remember and also the book says that determinants should be added if required. It is clearly not a mandatory requirement to add determinants either to a Fact or to a Dimension Query Subject.

blom0344

Not to milk this topic, but I searched for the fasttrack manuals and specifically the part about determinants.

I quote:

Quote
Determinants are a feature of Cognos 8 and designed to provide control over granularity. The settings specified in the determinants dialog give Cognos 8 enough information to correctly aggregate facts in queries with multiple levels of granularity

Explicitly!!  =  multiple levels of granularity


A little further down the line there is a very mysterious one:

Quote
To leverage indexes with DMR (especially when filtering on captions) having your captions associated with the correct key in a determinant will improve query generation

A kewpie doll for anyone shedding light on this one

great documentation really if you are into riddles

MFGF

Quote from: blom0344 on 18 Nov 2010 03:52:55 AMA little further down the line there is a very mysterious one:

A kewpie doll for anyone shedding light on this one

great documentation really if you are into riddles

Hi Ties,

I think this is alluding to a situation where you have a dimension which provides alternative unique identifiers for a particular level.  For example, if you have a Time Dimension with Year, Quarter, Month and Day levels, you may have two alternatives for defining the BusinessKey attribute on (say) the Month level - one alternative would be to use the single MonthKey column (with values such as 201011), and the other alternative would be to use the Year (2010) and Month (11) columns combined together as the BusinessKey attribute.  If MonthKey was defined as an index in the underlying table and Month was not defined in an index, obviously the most appropriate grouping (to utilise the index) would be based on MonthKey.  If your determinant was then (correctly) based on MonthKey, defining the DMR BusinessKey as Monthkey would associate the level with the appropriate indexed database column, whereas defining the DMR BusinessKey as a combination of Year and Month would not.

That's my interpretation of what it's saying, anyway, but I didn't write the documentation, and it's entirely possible I'm entirely wrong about its meaning! :)

MF.
Meep!

Lynn

I say give the muppet a kewpie doll
;D

MFGF

Quote from: Lynn on 18 Nov 2010 07:01:45 AM
I say give the muppet a kewpie doll
;D

Thanks!  I always wanted a doll with a Mohican - my life's ambition is fulfilled! :)
Meep!

Lynn

You must already have a troll doll then. If not, your ambitions are too low.

http://en.wikipedia.org/wiki/Troll_doll

CognosPaul

#13
I'll admit that data modelling is not my strong point (this will be obvious in a moment), but for my own edification.

An example scenario.

SalesOrderHeader
----------------
SalesOrderID  (PK) (index)
DateKey        (FK)
Subtotal        (Fact)
Tax              (Fact)
Shipping        (Fact)

SalesOrderDetail
----------------
SalesOrderID  (FK)
DetailID         (PK)
ProductID      (FK)
Quantity        (Fact)
Cost             (Fact)

Wouldn't a determinant be set on SalesOrderHeader.SalesOrderID?
The list would (should?) look like
OrderID | Product | Tax | Quantity
--------+--------+-----+--------
0000001| P1        |  55  |   5
0000001| P2        |  55  |   9
-----------------------------
Total                  |  55  | 14

as opposed to:
OrderID | Product | Tax | Quantity
--------+--------+-----+--------
0000001| P1        |  55  |   5
0000001| P2        |  55  |   9
-----------------------------
Total                  | 110  | 14

If you were making a DMR on this I would assume you'd also set SalesOrderHeader.SalesOrderID as the businessKey (as opposed to SalesOrderDetail.SalesOrderID) to force it to obey the index.


Lynn

I hack around at things till I get what I want so I never know if my first approach is a good one or not until I try...

That said, I'd consider that scenario as two fact tables and I wouldn't join them to each another but only to the dimensions they conform to (I'd need date key denormalized on the detail table to do it).

Although tax and shipping are not at the product level of detail, I should see just what you describe if I brought them into a query that also included product. This makes it about reporting with non-conformed dimensions rather than a difference in granularity on a common dimension. The granularity is identical for these two facts within the conformed dimension (date).

So I still wouldn't have determinants on the fact tables. I don't think I'd even need determinants on the dimensions -- in this case the only conformed dimension is date and the granularity is the same for both fact tables. I'm not really thinking of sales order as a dimension unless it is a degenerate dimension.

If I join the two tables together then I'm mixing grain on a single fact table because some facts pertain to product (like quantity and cost) while other facts do not pertain to product (tax and shipping). I think Mr. Kimball says this is a no-no.

blom0344

Paul,

I hacked into an existing model and build 2 identical sets. First set has determinants on both imported tables, the second set has none.

In my case I have real estate properties in place of the order header and rentable units as sales details..

(1) generated output is identical
(2) query SQL is identical

Cognos detects a fact at the 1 side of the cardinality and decides to first perform the aggregate on the n side and then join with the non-aggregated set from the 1 side:

Thus, no aggregate is used for the fact on the 1 side. In other words , it treats the fact from the 1 side as a attribute, but in the report is correctly shows it's total.

example: (native SQL)

select T1.C0 AS C0, T1.C1 AS C1, T1.C2 AS C2, T0.C1 AS C3
from
(
select Prop.Name AS C0, sum(RU.NFA) AS C1
from Datasource.Property Prop, Datasource.RU RU
where Prop.Id = RU.Property and Prop.Id in (2228225, 8912897)
group by Prop.Name) T0,
(
select Prop.Name AS C0, Prop.Id AS C1, Prop.Price AS C2
from Datasource.Property Prop, Datasource.RU RU
where Prop.Id = RU.Property and Prop.Id in (2228225, 8912897)) T1
where T1.C0 = T0.C0

blom0344

Quote from: blom0344 on 18 Nov 2010 02:54:39 PM
Paul,

I hacked into an existing model and build 2 identical sets. First set has determinants on both imported tables, the second set has none.

In my case I have real estate properties in place of the order header and rentable units as sales details..

(1) generated output is identical
(2) query SQL is identical

Cognos detects a fact at the 1 side of the cardinality and decides to first perform the aggregate on the n side and then join with the non-aggregated set from the 1 side:

Thus, no aggregate is used for the fact on the 1 side. In other words , it treats the fact from the 1 side as a attribute, but in the report is correctly shows it's total.

example: (native SQL)

select T1.C0 AS C0, T1.C1 AS C1, T1.C2 AS C2, T0.C1 AS C3
from
(
select Prop.Name AS C0, sum(RU.NFA) AS C1
from Datasource.Property Prop, Datasource.RU RU
where Prop.Id = RU.Property and Prop.Id in (2228225, 8912897)
group by Prop.Name) T0,
(
select Prop.Name AS C0, Prop.Id AS C1, Prop.Price AS C2
from Datasource.Property Prop, Datasource.RU RU
where Prop.Id = RU.Property and Prop.Id in (2228225, 8912897)) T1
where T1.C0 = T0.C0



Now, and this is VERY interesting  ;D  :

I remove joins/determinants from the datalayer and add the joins/determinants in the transformation layer (between the model subjects).

After republishing the model WITH the determinants gives the same - expected -  results, but the one without them gives the typical inflated ones.

So , basically it also matters where they are assigned..

cognostechie

Any joins in the Model Query subjects will get a higher priority over the joins Database Query Subjects. Determinants in the Model Query subjects would probably get forced though I am not sure at this point. If you execute the Model Query subject as a 'View', it would ignore the joins of the Database Layer. FM treats Query Subjects as Dimension and Facts by the cardinality regardless of whether they are Dimensions or Facts in our eyes.

Ex: Customer <--> Order Header <--> Order Line

You make a report using Customer and Order Header, Cognos will treat Customer as the DIM and Order Header as the FACT. Make a report with Order Header and Order Line and Order Header will be treated as a DIM and Order Line will be treated as a FACT because the join would 1 on the Order Header side and n on the Line side. 

Determinants are used only to resolve multiple levels of granuality.

Ex: Date Dimension has year, Month and Day.  Transaction table could be joined to the date Dimension by either Day or the Month. So there should be determinant that will include Year, Month and Day and Day key to be unique and probably group by Months.

In case of Sales Order Header and Sales Order Line, I would not use any determinants. If the report shows  Tax to be 110, just group that column in the report with the Order ID or Order Number, even the totals after grouping would show correct instead of duplicating.

blom0344

Not with you all the way    :-\

A metadata model is basically a way to generate 'smart' SQL. In the case of SALES ORDER <--> SALES DETAIL ( or in my case with objects <--> units) having facts on both sides induces Cognos to handle 2 sets first , then join  (check the example I gave) returning correct results.

With joins applied in the import layer this happens with both determinants set and not set.

BUT..

Move joins and determinants to the modelling layer and determinants seem to make the difference. In my case , without determinants incorrect (bloated) results are returned

The reason why I ramble on about this, is that most our models work with joins in the modelling layer, cause we use custom SQL objects in the datalayer.

Initially, I had no reason to choose between generating additional database views or defining cognos SQL objects. However, now pretty much convinced we need to move logic into the database to produce leaner Cognos models that generate (minimized SQL!!) more compact , easier to understand SQL..

MFGF

Quote from: PaulM on 18 Nov 2010 08:59:14 AM
I'll admit that data modelling is not my strong point (this will be obvious in a moment), but for my own edification.

An example scenario.

SalesOrderHeader
----------------
SalesOrderID  (PK) (index)
DateKey        (FK)
Subtotal        (Fact)
Tax              (Fact)
Shipping        (Fact)

SalesOrderDetail
----------------
SalesOrderID  (FK)
DetailID         (PK)
ProductID      (FK)
Quantity        (Fact)
Cost             (Fact)

Wouldn't a determinant be set on SalesOrderHeader.SalesOrderID?
The list would (should?) look like
OrderID | Product | Tax | Quantity
--------+--------+-----+--------
0000001| P1        |  55  |   5
0000001| P2        |  55  |   9
-----------------------------
Total                  |  55  | 14

as opposed to:
OrderID | Product | Tax | Quantity
--------+--------+-----+--------
0000001| P1        |  55  |   5
0000001| P2        |  55  |   9
-----------------------------
Total                  | 110  | 14

If you were making a DMR on this I would assume you'd also set SalesOrderHeader.SalesOrderID as the businessKey (as opposed to SalesOrderDetail.SalesOrderID) to force it to obey the index.



I'm with Lynn on this one - I would treat SalesOrderHeader and SalesOrderDetail as two separate fact query subjects, each linking to a conformed dimension with two levels - SalesOrder and Detail (with determinants defined for each level of the conformed dimension).

Regards,

MF.
Meep!

blom0344

Quote from: MFGF on 19 Nov 2010 03:38:59 AM
I'm with Lynn on this one - I would treat SalesOrderHeader and SalesOrderDetail as two separate fact query subjects, each linking to a conformed dimension with two levels - SalesOrder and Detail (with determinants defined for each level of the conformed dimension).

Regards,

MF.

Effectively moving all non-measures to the conformed dimension. This will give you the query split and restitching (full outer join/coalesce) usually described. I always assumed this was the only way, till I found out about a year ago that Cognos seems to handle it very well how Paul describes it.

Never knew what ot make of it. Couldn't label it as 'works as designed', cause I never found any reference. Till now..

cognostechie

Quote from: blom0344 on 19 Nov 2010 01:48:45 AM
The reason why I ramble on about this, is that most our models work with joins in the modelling layer, cause we use custom SQL objects in the datalayer.

I don't know why you have it like this..

The 1st strategy is to make joins in the Database layer and NOT use custom SQL. The SQL in the database layer should say 'Select * from ......'. This is according to the best practise by Cognos. Joins in the Model Layer should be made only to override joins from the database layer (if required). Basically, you create joins in the Model Query subjects only if the relationship defined in the database layer does NOT apply to a particular report. Hence the feature of MQS overriding the relationship defined in DQS. Same logic applies for determinants. You do that in MQS only when you wish to override the one on DQS for a particular report.

Determinants work with the way you set the 'keys'. For a Time Dimension where you have the Year, Month and Day, the Month determinant should have Year and Month as the key, not just the Month. In your case, you set the determinants in relation with 2 query subjects. Determinants are not forced if there is no need for it. Anything you do in MQS is a way to override DQS, that might be the reason it is working differently in model layer for you.

This is an ideal situation for determinants in the databse layer and let me know if this doesn't work. It works for me:

Time Dimension :  Columns - Year Key, Year Name, Month Key, Month Name, Day Key, Day Name
Sales Fact : Columns - Month, Sales
Orders Fact : Columns - Day, Quantity, Revenue

Set the determinnats in the Time dimension as follows:

Name     Key               Attributes        Unique/Non Unique             Group By
--------------------------------------------------------------------------

Year       Year Key       None                 No                                  Yes

Month     Year Key       Month Name       No                                   Yes
             Month Key

Day        Year Key       Day Name           Yes                                 No
             Month Key     Month Name
             Day Key        Year Name

When you make a report involving all 3 query subjects, it will give correct results without bloating
the sales figures.

Do NOT set the determinants in the databse layer and you will see the Sales figures inflated !

blom0344

#22
Quote from: cognostechie on 19 Nov 2010 11:47:38 AM
I don't know why you have it like this..


Reason 1: Inheritance

Some models (early versions) where there from the start (= my start) , with some of them consisting of just one level. (Actually not that strange if you've spend 10 years working with BO Universes)

Reason 2: Our quirky database structures

They are so far removed from traditional (best-practice) reporting models that even adding an ETL-layer will simply not net starschema models. so ,we need to improvise

I appreciate your example, but It is the standard multi-grain solution. we were focussing in the discussion on the same grain scenario's..

cognostechie

You are right. Sometimes I read everything in a hurry and miss the point.

Ok, I used the same scenario:

Order Header -

OrderID (PK)
Freight

Order Detail:

Order ID (FK)
Product ID
Quantity

I set the determinant on the Database Query subject and it works the way it worked for you but only on the List. The Crosstab still shows inflated total. I attached an image showing the List and Crosstab.

cognostechie

Now this is the same scenario blom had..

I removed the determinant from Database Layer where I jave joins and added the determinant in the Model Query subject where I do NOT have any joins and it didn't work. I got the inflated Freight !!

I create the determinants in the Database Layer and it works !! Makes sense since determinants are supposed to work with the joins. So looks like it will work wherever the joins are regardless of whether it is MQS or DQS.

It's working for List and Graphs but still does not work for Crosstabs. Did anybody try it with a Crosstab?