If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Modelling Question

Started by rpvs, 21 Jun 2011 03:39:13 PM

Previous topic - Next topic

rpvs

Our data is divided by a database name and data for both dimensions and facts. For example

Account Dimension table has column1 as database name and column2 as account code

Sub Account Dimension table has column1 as database name and column2 as Sub account code

Invoice Fact table has column1 as database name and column2 as  Revenue

Plan fact table has column1 as database name and column2 as  Plan amount

Currently have modelled both facts to both dimensions linking database name and key .

Example : Invoice fact is joined by database name and acctkey to account table.

For a plan vs invoice report listing  account , subaccount, Revenue and plan amount 
i am having to filter both dimensions and facts by database name ( 4 filters for database name) to get the correct results.

What would be the correct way to model so that i can pick one database name and filter it once instead of having to filter it  for every dimension and every fact???

MFGF

The short answer is you shouldn't need to.  If your relationships are defined correctly, a single filter should be all you need.

Just as a quick sanity check, make sure that you have only one relationship defined between each fact and each dimension, based on the two columns, and not two separate relationships.  If you have the latter, only one of the relationships will be used, which may explain the behaviour you are describing.

Regards,

MF.
Meep!

rpvs

You are right i need not do that. I had other things with the data for plan fact which had multiple records for the same account code with a different plan code. Once i deleted the unnecessary plan data , i was able to get the results with just one filter.

Thanks for your response.

cognostechie

Keep in mind that with the join strategy you have, you will have Loop joins in the model (termed as Transitive Trap by Cognos). I know most people don't care for that becuase the reports are 'working' for the time being until they start seeing different results at different times for the same report.

MFGF

Quote from: cognostechie on 23 Jun 2011 12:08:16 PM
Keep in mind that with the join strategy you have, you will have Loop joins in the model (termed as Transitive Trap by Cognos). I know most people don't care for that becuase the reports are 'working' for the time being until they start seeing different results at different times for the same report.

This sounds to me like two star schemas using conformed dimensions, which I think should be perfectly ok. Any attempt to utilise both fact tables will result in a query split (stitch query) which will effectively break the loop anyway.  Good call though - loops can usually give issues, patricularly blind spots, so they should always be investigated and resolved as necessary.

Cheers!

MF.
Meep!

cognostechie

Quote from: MFGF on 24 Jun 2011 05:11:51 AM
This sounds to me like two star schemas using conformed dimensions, which I think should be perfectly ok. Any attempt to utilise both fact tables will result in a query split (stitch query) which will effectively break the loop anyway.  Good call though - loops can usually give issues, patricularly blind spots, so they should always be investigated and resolved as necessary.

Cheers!

MF.

It is not two Star schemas using a confirmed dimension. That only happens when there are multiple facts joined to each other thru a single dimension. The moment you have more than one dimension between multiple facts, it will result in a Loop join.

The OPs original post says that he has both the Facts joined to both the dimensions (though it is interesting that Sub Account is not a child of Accounts and is joined to the Facts directly). So it will be like this -

                  |------------ INVOICE ---------------|                 
                  |                                                 |
             ACCOUNT                                 SUB ACCOUNT
                  |                                                 |
                  |-----------  PLAN  -------------------|

See the circle ?

Lynn

Hmmmmm.....

If both dimensions are conformed to both facts, then it is multi-fact situation around conformed dimensions. Yes, it is a loop, but the cardinality identifies the facts (those with 1..n) and will handle it accordingly. This means a stitch query or else full outer join if the database supports it.

You can have more than one dimension conformed to multiple facts as far as I know. I would agree that account and sub-account sound more like a hierarchy of a single dimension rather than two different dimensions.


             1..n |------------ INVOICE ---------------| 1..n
                  |                                    |
                  |                                    |
             1..1 |                                    | 1..1
                  |                                    |
             ACCOUNT                                 SUB ACCOUNT
                  |                                    |
             1..1 |                                    | 1..1
                  |                                    |
                  |                                    |
             1..n |-----------  PLAN  -----------------| 1..n

RobsWalker68

#7
Hi,

I agree yes that technically it's a loop join, but a loop join that Cognos explicitly knows how to handle so not a modelling issue.

Note from Framework Manager Guide:

"Loop joins in the model are typically a source of unpredictable behavior. This does not include star schema loop joins.

Note: When cardinality clearly identifies facts and dimensions, IBM Cognos 8 can automatically resolve loop joins that are caused by star schema data when you have multiple fact tables joined to a common set of dimension tables."

Rgds

Rob


cognostechie

Cardinality only identifies which will be treated as a Fact and which will be treated as a Dimension. It has nothing to do with handling loop joins.

Rob - You are almost there. It is the unpredictable  behaviour in generating the SQL due to which loop joins are supposed to be avoided. Reason - When cognos generates the SQL, sometimes it would say

where invoice.fieldname = account.fieldname and plan.fieldname = account.fieldname

at other time, the same report would generate an sql that would say

where invoice.fieldname = subaccount.fieldname and plan.fieldname = subaccount.fieldname

Would the data set be correct - probebly yes.
Is it considered a good practice to take the risk - NO

That's the point !

and by the way, this situation is NOT a Star Schema so that does not apply here.

RobsWalker68

#9
Hi,

Ok just to clarify the point so we are talking about the same thing. With regards to the joins and cardinality detailed by Lynn in her last post.  Would you consider that to be a loop join?

I'm also a bit confused why this isn't a star schema so perhaps I've misunderstood and you could clarify.  From the previous posts we appear to have two facts (Invoice, Plan) joined by two conformed dimensions (Account, Sub Account).  That is clearly a star schema so what am i missing?  And if it is as I've detailed above it cannot possibly be a loop join.

Confused !

Rgds

Rob

cognostechie

Yes, I would definately consider that to be a Loop join.

The preferred way would be to create one query subject containing Accounts and Sub Accounts and
then join that to both the Facts. That would resemble a Star Schema. If that is not possible, I would
delete the join between Sub Account and Plan Fact, create a shortcut of Sub Account and join Plan Fact
to that Shortcut. That's exactly what 'alias Shortcut' is meant for in FM.

So it would be like this:

Accounts  ---------  Invoice -----------  Sub Account   
    |
    |
Plan ------------ Sub Account

cognostechie

#11
To answer your question  about why it is not a Star Schema -

In a Star Schema, the attributes of a dimension are de-normalised so there should
have been one table containing Accounts and Sub-Accounts. That way there would have
been only one join between both the Facts which would have eliminated the Loop join
completely ! 

You are probably thinking that a Schema is considered a Star as long as joins are between Facts
and Dimensions. Well, the attributes also have to be collapsed to form a Dimension. A Dimension is
a collection of all attributes pertaining to that area. In this case there is no Dimension because Accounts
and Sub Accounts are seperate ! It is currently in a normalised form. By denormalising, we can create an Accounts Dimension and then join that to both the Facts.

A join is a Loop join when the multiple Facts go in a circle (coming back to itself).

RobsWalker68

Hi,

Well I will give you that is certainly an interesting and unique approach to handle it.  Why are you doing it that way though when if you set the correct cardinality Cognos will handle the loop join itself as it recognises it as a star schema?

Cognos will create a result set out of

1.Account, Invoice, Sub Account
2.Account, Plan, Sub Account

and then stich the two results together hence no loop join.

rgds

Rob

RobsWalker68

Hi,

Hopefully our posts won't cross this time.

The fact that Account and Sub Accounts are modelled seperately in two dimensions is just semantics.  We may agree that perhaps they should be in one dimension but as long as the fact tables have seperate surrogate keys for Account and Sub Account then its really not here nor there. 

Rgds

Rob

cognostechie

You are looking at it in a specific way. Modelling is not done keeping in mind what the report will contain.
You are presuming that the report will contain fields from all 4 query subjects.

What will happen if a user makes a report containing only 2 fields?

Account Nos, Sub Account Nos.

and I am not sure that Cognos will consider and handle it like a Star schema with these joins. Have you
tested it that way?

Modelling has to be done keeping in mind that it will work for all scenarios.

cognostechie

Forgot to add that there also has to be determinants set on dimensions when they are joined to multiple facts
otherwise the totals/sub-totals will get screwed up like we discussed in other posts. In this case, the determinants need to be set in both Accounts and Sub Accounts which will screw things up further.

In one of the posts for determinants, I have posted screen shots showing how accurately Cognos rolls up data
at all levels of granularity when determinants are set properly vs the wrong rollup when determinants were not set properly.

RobsWalker68

Quote from: cognostechie on 27 Jun 2011 05:49:10 PM
What will happen if a user makes a report containing only 2 fields?
Account Nos, Sub Account Nos.

It will create a cross join of the Account Dimension and Sub Account Dimension as neither fact table is involved, however that would be true of any star schema design and it would be the same as me writing SQL

Select AccNo, SubAccNo
From Account Dim, SubAccountDim

However if you are modelling facts even factless facts you are going to be using a measure from the fact table.

Modelling doesn't have to match all scenario's it has to match all realistic scenario's that would make sense with regards to the process you are modelling. Otherwise you would have to model scenario's like the user wants to combine the product dimension with the calendar dimension.

Rgds

Rob


RobsWalker68

Yes I agree determinants are important if you joining facts which have different levels of granularity


Rgds


Rob

cognostechie

#18
Quote from: RobsWalker68 on 27 Jun 2011 06:01:06 PM

Modelling doesn't have to match all scenario's it has to match all realistic scenario's that would make sense with regards to the process you are modelling. Otherwise you would have to model scenario's like the user wants to combine the product dimension with the calendar dimension.

Rgds

Rob

In that case, I have nothing more to say ! Your approach is the approach taken by common modellers who keep changing the model for every report as they go on !

and you obviously depend only the user guides so you are not aware of the implications of blind spots, loop joins etc. They say that very clearly in Cognos that Loop join sare a complete NO NO ! It is called a Transitive Trap and it is recommended to resolve that. There is a seperate chapter called 'Modelling for Predictable Results' in the class conducted by Cognos.

There is a general way to do something and there is a right way to do it ! I prefer the right way !

We have different ways of working ..

RobsWalker68

Hi,

What would lead you to draw that conclusion as really the approach I have outlined is  just the Kimball approach and with regards to framework design question the approach adopted by IBM. 

I have tended to find that following these approaches leads to pretty stable designs and to be honest I have never had to change my report designs as I go on.

Anyway interesting chat

Kind Regards

Rob 

RobsWalker68

Quote from: RobsWalker68 on 27 Jun 2011 06:18:44 PM
There is a seperate chapter called 'Modelling for Predictable Results' in the class conducted by Cognos. There should never be more than one dimension linked to multiple facts !

That would make a great Business Intelligence system !!

Lol, now I know that someone is trying to wind me up  ;D

Rob

cognostechie

#21
I would invite the moderators to comment on this. How many FM experts agree that Cognos does not has a chapter on 'Modelling for Predictable Results' ?

Also comment on what is a Loop join in Framework terminology? People here think that if all 4 query subjects are not involved in a single SQL then it's not a Loop join !!! That's NOT what a Loop join means in FM terminology. It is still a Loop join in FM no matter what kind of SQL it generates. The bottomline is NOT to have more than ONE way to reach from Point A to Point B.

If IBM adopts that approach (as Rob mentions), I would love to get some evidence of that ! I have documentary evidence to prove that IBM does NOT support that approach. It is mentioned very clearly in that chapter that I mentioned.

RobsWalker68

#22
I will happily concede the point about the course chapter as I over extended the quote, although it was 1am so hopefully you will forgive me for that one.

I was intending to call out the quote: "There should never be more than one dimension linked to multiple facts !" 

That is situation that doesn't apply where you have multiple fact tables linked through conformed dimensions.

As I'm a user guide kind of guy I've reproduced the full IBM quote for clarity.  This is found in the Framework Manager Guide -> Guidelines for Modelling Meta Data -> Building a Relational Model -> Loop Joins

"Loop joins in the model are typically a source of unpredictable behavior. This does not include star schema loop joins. Note: When cardinality clearly identifies facts and dimensions, IBM Cognos 8 can automatically resolve loop joins that are caused by star schema data when you have multiple fact tables joined to a common set of dimension tables."

Now there may be a point about merging the Account and Sub Account dimension into a single entity, however the originator of the post gave no information on what the pupose of the Account/Sub Account is.  Although the names make it sound like they should be merged we cannot infer that from the information given.  For all we know Account may relate to Customers and sub account to a Chart of Accounts - who knows, not us for sure!

I have no intention to mislead people and I respect your opinion cognostechie although I do believe that in this instance you have slightly misunderstood how Cognos handles reporting between multi facts. 

At the end of the day if you want to see how Cognos handles multi fact joins that look on paper as Loop joins run it up in Cognos using the sample pacakages.

Create a report with a couple of conformed dimensions of your choice say Product & Retailer and a couple of facts say Sales & Sales Forecast and look at the generated SQL.  I am confident that you will see a stiched query and not a loop join.  You need to make sure you have fact detection enabled in your model of course

Kind Regards

Rob     












MFGF

Quote from: cognostechie on 27 Jun 2011 11:03:21 PM
I would invite the moderators and the experts from BSP to comment on this. How many FM experts agree that Cognos does not has a chapter on 'Modelling for Predictable Results' ?

BSP --- Anyone from your team attended classes at Cognos ?? Come on in, comment on this..

Also comment on what is a Loop join in Framework terminology? People here think that if all 4 query subjects are not involved in a single SQL then it's not a Loop join !!! That's NOT what a Loop join means in FM terminology. It is still a Loop join in FM no matter what kind of SQL it generates. The bottomline is NOT to have more than ONE way to reach from Point A to Point B.

and if IBM adopts that approach (as Rob mentions), I would love to get some evidence of that ! I have documentary evidence to prove that IBM does NOT support that approach. It is mentioned very clearly in that chapter that I mentioned.

There is no need to mislead people on this forum, hence I am mentioning all this as misleading other people might result in wrong results on their side.


MFGF, PaulM, Blom, Reportnet Addict, BSP experts -- Come on in !!!!! Share your suggestions !!

Well, you asked so here's my opinion (for what it's worth! :) )

Firstly, I wouldn't class myself as an expert.  I do have some experience though, and I did spend over a decade teaching Cognos classes, so I know the official documented approaches quite well (and they have changed over the years too - as you might notice if you read some of my early posts from back when this forum first started! :) ). I don't think Rob was saying that IBM/Cognos do not have an approach and documented guidelines to modelling for predictable results - simply that their guidelines do not imply that multiple facts should never link to more than one dimension.

Being purely pedantic, a transitive relationship trap (loop) is a situation where the relationships in the model result in there being more than one possible way of linking two query subjects. In a foundation layer when dealing with structures typically derived from a normalized transactional database, these can lead to unpredictable query results and blind spots in the model, and should be identified and resolved. I don't think anyone could argue against that point at all.

However, when dealing with star schemas things are rather different. In a typical star schema-modelled data warehouse there will be a single set of conformed dimensions, each one denormalized to second normal form, and each one there in order to answer descriptive questions such as "who"? "what"? "where"? "when"? "why"? etc. There will also be a series of different fact tables for different subject areas, quite possibly held at different levels of aggregation, but all linking to the conformed dimensions or, on occasion, to unconformed dimensions if those dimensions are unique to a particular area.  However, this situation where multiple facts link to conformed dimensions gives rise to transitive relationships when viewed in the diagram. These are not an issue here, though, because the query engine generates queries based on the assumption that multiple facts can link to the same common dimensions, so they do not lead to unpredictable query generation and incorrect results. Yes they are loops, but no they do not need to be resolved.  Reporting from a single fact and multiple dimensions will not bring a loop into play. Reporting from multiple facts and one or more dimensions will result in a query split (stitch query).

I have often seen models where there is a foundation layer built on top of transactional tables, and an abstract layer above that where the data has been remodelled to represent fact and dimension tables. Latest guidelines suggest that no relationships are used in the abstract layer to allow better query minimization, but many models I have encountered do have relationships in the abstract layer, including the loops we are discussing between multiple facts and multiple conformed dimensions, and they produce accurate, predictable results.

My take on this is that loops are generally bad and need resolving, unless they are loops inherent in a Kimbal-esque star schema model, in which case they are perfectly fine.

Cheers!

MF.
Meep!

Lynn

I am an expert in everything and 100% correct 100% of the time  :o Anyone who believes this should contact me so I can tell you all about the Brooklyn bridge that is for sale ........

Here is a document that describes a multi-fact situation with TWO conformed dimensions (time and product):

http://publib.boulder.ibm.com/infocenter/c8fpm/v8r4m0/index.jsp?topic=/com.ibm.swg.im.cognos.ug_best.8.4.0.doc/ug_best_id1339multi-factmulti-grainquery.html

If it really were limited to just a single dimension then multi-fact querying would be virtually useless. You'd almost always need the time dimension and then you'd have no ability to assess product, or sales person, or employee, or ..... anything else. It would makes no sense to do it since very few business needs would involve multiple facts and only one dimension.

As further documentation, my course book has a section called "Identify SQL Generation in Complex Queries". Workshop 2 in this section involves an example where sales fact and returned items fact are used in a multi-fact modeling situation with both order method and time as conformed dimensions and return reason as an additional non-conformed dimension.