COGNOiSe.com - The IBM Cognos Community

IBM Cognos 8 Platform => COGNOS 8 => Framework Manager => Topic started by: rpvs on 21 Jun 2011 03:39:13 PM

Title: Modelling Question
Post by: rpvs on 21 Jun 2011 03:39:13 PM
Our data is divided by a database name and data for both dimensions and facts. For example

Account Dimension table has column1 as database name and column2 as account code

Sub Account Dimension table has column1 as database name and column2 as Sub account code

Invoice Fact table has column1 as database name and column2 as  Revenue

Plan fact table has column1 as database name and column2 as  Plan amount

Currently have modelled both facts to both dimensions linking database name and key .

Example : Invoice fact is joined by database name and acctkey to account table.

For a plan vs invoice report listing  account , subaccount, Revenue and plan amount 
i am having to filter both dimensions and facts by database name ( 4 filters for database name) to get the correct results.

What would be the correct way to model so that i can pick one database name and filter it once instead of having to filter it  for every dimension and every fact???
Title: Re: Modelling Question
Post by: MFGF on 22 Jun 2011 04:24:07 AM
The short answer is you shouldn't need to.  If your relationships are defined correctly, a single filter should be all you need.

Just as a quick sanity check, make sure that you have only one relationship defined between each fact and each dimension, based on the two columns, and not two separate relationships.  If you have the latter, only one of the relationships will be used, which may explain the behaviour you are describing.

Regards,

MF.
Title: Re: Modelling Question
Post by: rpvs on 23 Jun 2011 08:14:01 AM
You are right i need not do that. I had other things with the data for plan fact which had multiple records for the same account code with a different plan code. Once i deleted the unnecessary plan data , i was able to get the results with just one filter.

Thanks for your response.
Title: Re: Modelling Question
Post by: cognostechie on 23 Jun 2011 12:08:16 PM
Keep in mind that with the join strategy you have, you will have Loop joins in the model (termed as Transitive Trap by Cognos). I know most people don't care for that becuase the reports are 'working' for the time being until they start seeing different results at different times for the same report.
Title: Re: Modelling Question
Post by: MFGF on 24 Jun 2011 05:11:51 AM
Quote from: cognostechie on 23 Jun 2011 12:08:16 PM
Keep in mind that with the join strategy you have, you will have Loop joins in the model (termed as Transitive Trap by Cognos). I know most people don't care for that becuase the reports are 'working' for the time being until they start seeing different results at different times for the same report.

This sounds to me like two star schemas using conformed dimensions, which I think should be perfectly ok. Any attempt to utilise both fact tables will result in a query split (stitch query) which will effectively break the loop anyway.  Good call though - loops can usually give issues, patricularly blind spots, so they should always be investigated and resolved as necessary.

Cheers!

MF.
Title: Re: Modelling Question
Post by: cognostechie on 24 Jun 2011 04:40:19 PM
Quote from: MFGF on 24 Jun 2011 05:11:51 AM
This sounds to me like two star schemas using conformed dimensions, which I think should be perfectly ok. Any attempt to utilise both fact tables will result in a query split (stitch query) which will effectively break the loop anyway.  Good call though - loops can usually give issues, patricularly blind spots, so they should always be investigated and resolved as necessary.

Cheers!

MF.

It is not two Star schemas using a confirmed dimension. That only happens when there are multiple facts joined to each other thru a single dimension. The moment you have more than one dimension between multiple facts, it will result in a Loop join.

The OPs original post says that he has both the Facts joined to both the dimensions (though it is interesting that Sub Account is not a child of Accounts and is joined to the Facts directly). So it will be like this -

                  |------------ INVOICE ---------------|                 
                  |                                                 |
             ACCOUNT                                 SUB ACCOUNT
                  |                                                 |
                  |-----------  PLAN  -------------------|

See the circle ?
Title: Re: Modelling Question
Post by: Lynn on 25 Jun 2011 06:08:19 PM
Hmmmmm.....

If both dimensions are conformed to both facts, then it is multi-fact situation around conformed dimensions. Yes, it is a loop, but the cardinality identifies the facts (those with 1..n) and will handle it accordingly. This means a stitch query or else full outer join if the database supports it.

You can have more than one dimension conformed to multiple facts as far as I know. I would agree that account and sub-account sound more like a hierarchy of a single dimension rather than two different dimensions.


             1..n |------------ INVOICE ---------------| 1..n
                  |                                    |
                  |                                    |
             1..1 |                                    | 1..1
                  |                                    |
             ACCOUNT                                 SUB ACCOUNT
                  |                                    |
             1..1 |                                    | 1..1
                  |                                    |
                  |                                    |
             1..n |-----------  PLAN  -----------------| 1..n
Title: Re: Modelling Question
Post by: RobsWalker68 on 27 Jun 2011 03:26:08 AM
Hi,

I agree yes that technically it's a loop join, but a loop join that Cognos explicitly knows how to handle so not a modelling issue.

Note from Framework Manager Guide:

"Loop joins in the model are typically a source of unpredictable behavior. This does not include star schema loop joins.

Note: When cardinality clearly identifies facts and dimensions, IBM Cognos 8 can automatically resolve loop joins that are caused by star schema data when you have multiple fact tables joined to a common set of dimension tables."

Rgds

Rob

Title: Re: Modelling Question
Post by: cognostechie on 27 Jun 2011 04:00:51 PM
Cardinality only identifies which will be treated as a Fact and which will be treated as a Dimension. It has nothing to do with handling loop joins.

Rob - You are almost there. It is the unpredictable  behaviour in generating the SQL due to which loop joins are supposed to be avoided. Reason - When cognos generates the SQL, sometimes it would say

where invoice.fieldname = account.fieldname and plan.fieldname = account.fieldname

at other time, the same report would generate an sql that would say

where invoice.fieldname = subaccount.fieldname and plan.fieldname = subaccount.fieldname

Would the data set be correct - probebly yes.
Is it considered a good practice to take the risk - NO

That's the point !

and by the way, this situation is NOT a Star Schema so that does not apply here.
Title: Re: Modelling Question
Post by: RobsWalker68 on 27 Jun 2011 04:54:15 PM
Hi,

Ok just to clarify the point so we are talking about the same thing. With regards to the joins and cardinality detailed by Lynn in her last post.  Would you consider that to be a loop join?

I'm also a bit confused why this isn't a star schema so perhaps I've misunderstood and you could clarify.  From the previous posts we appear to have two facts (Invoice, Plan) joined by two conformed dimensions (Account, Sub Account).  That is clearly a star schema so what am i missing?  And if it is as I've detailed above it cannot possibly be a loop join.

Confused !

Rgds

Rob
Title: Re: Modelling Question
Post by: cognostechie on 27 Jun 2011 05:25:04 PM
Yes, I would definately consider that to be a Loop join.

The preferred way would be to create one query subject containing Accounts and Sub Accounts and
then join that to both the Facts. That would resemble a Star Schema. If that is not possible, I would
delete the join between Sub Account and Plan Fact, create a shortcut of Sub Account and join Plan Fact
to that Shortcut. That's exactly what 'alias Shortcut' is meant for in FM.

So it would be like this:

Accounts  ---------  Invoice -----------  Sub Account   
    |
    |
Plan ------------ Sub Account
Title: Re: Modelling Question
Post by: cognostechie on 27 Jun 2011 05:30:21 PM
To answer your question  about why it is not a Star Schema -

In a Star Schema, the attributes of a dimension are de-normalised so there should
have been one table containing Accounts and Sub-Accounts. That way there would have
been only one join between both the Facts which would have eliminated the Loop join
completely ! 

You are probably thinking that a Schema is considered a Star as long as joins are between Facts
and Dimensions. Well, the attributes also have to be collapsed to form a Dimension. A Dimension is
a collection of all attributes pertaining to that area. In this case there is no Dimension because Accounts
and Sub Accounts are seperate ! It is currently in a normalised form. By denormalising, we can create an Accounts Dimension and then join that to both the Facts.

A join is a Loop join when the multiple Facts go in a circle (coming back to itself).
Title: Re: Modelling Question
Post by: RobsWalker68 on 27 Jun 2011 05:38:41 PM
Hi,

Well I will give you that is certainly an interesting and unique approach to handle it.  Why are you doing it that way though when if you set the correct cardinality Cognos will handle the loop join itself as it recognises it as a star schema?

Cognos will create a result set out of

1.Account, Invoice, Sub Account
2.Account, Plan, Sub Account

and then stich the two results together hence no loop join.

rgds

Rob
Title: Re: Modelling Question
Post by: RobsWalker68 on 27 Jun 2011 05:43:43 PM
Hi,

Hopefully our posts won't cross this time.

The fact that Account and Sub Accounts are modelled seperately in two dimensions is just semantics.  We may agree that perhaps they should be in one dimension but as long as the fact tables have seperate surrogate keys for Account and Sub Account then its really not here nor there. 

Rgds

Rob
Title: Re: Modelling Question
Post by: cognostechie on 27 Jun 2011 05:49:10 PM
You are looking at it in a specific way. Modelling is not done keeping in mind what the report will contain.
You are presuming that the report will contain fields from all 4 query subjects.

What will happen if a user makes a report containing only 2 fields?

Account Nos, Sub Account Nos.

and I am not sure that Cognos will consider and handle it like a Star schema with these joins. Have you
tested it that way?

Modelling has to be done keeping in mind that it will work for all scenarios.
Title: Re: Modelling Question
Post by: cognostechie on 27 Jun 2011 05:54:57 PM
Forgot to add that there also has to be determinants set on dimensions when they are joined to multiple facts
otherwise the totals/sub-totals will get screwed up like we discussed in other posts. In this case, the determinants need to be set in both Accounts and Sub Accounts which will screw things up further.

In one of the posts for determinants, I have posted screen shots showing how accurately Cognos rolls up data
at all levels of granularity when determinants are set properly vs the wrong rollup when determinants were not set properly.
Title: Re: Modelling Question
Post by: RobsWalker68 on 27 Jun 2011 06:01:06 PM
Quote from: cognostechie on 27 Jun 2011 05:49:10 PM
What will happen if a user makes a report containing only 2 fields?
Account Nos, Sub Account Nos.

It will create a cross join of the Account Dimension and Sub Account Dimension as neither fact table is involved, however that would be true of any star schema design and it would be the same as me writing SQL

Select AccNo, SubAccNo
From Account Dim, SubAccountDim

However if you are modelling facts even factless facts you are going to be using a measure from the fact table.

Modelling doesn't have to match all scenario's it has to match all realistic scenario's that would make sense with regards to the process you are modelling. Otherwise you would have to model scenario's like the user wants to combine the product dimension with the calendar dimension.

Rgds

Rob

Title: Re: Modelling Question
Post by: RobsWalker68 on 27 Jun 2011 06:03:52 PM
Yes I agree determinants are important if you joining facts which have different levels of granularity


Rgds


Rob
Title: Re: Modelling Question
Post by: cognostechie on 27 Jun 2011 06:07:04 PM
Quote from: RobsWalker68 on 27 Jun 2011 06:01:06 PM

Modelling doesn't have to match all scenario's it has to match all realistic scenario's that would make sense with regards to the process you are modelling. Otherwise you would have to model scenario's like the user wants to combine the product dimension with the calendar dimension.

Rgds

Rob

In that case, I have nothing more to say ! Your approach is the approach taken by common modellers who keep changing the model for every report as they go on !

and you obviously depend only the user guides so you are not aware of the implications of blind spots, loop joins etc. They say that very clearly in Cognos that Loop join sare a complete NO NO ! It is called a Transitive Trap and it is recommended to resolve that. There is a seperate chapter called 'Modelling for Predictable Results' in the class conducted by Cognos.

There is a general way to do something and there is a right way to do it ! I prefer the right way !

We have different ways of working ..
Title: Re: Modelling Question
Post by: RobsWalker68 on 27 Jun 2011 06:18:44 PM
Hi,

What would lead you to draw that conclusion as really the approach I have outlined is  just the Kimball approach and with regards to framework design question the approach adopted by IBM. 

I have tended to find that following these approaches leads to pretty stable designs and to be honest I have never had to change my report designs as I go on.

Anyway interesting chat

Kind Regards

Rob 
Title: Re: Modelling Question
Post by: RobsWalker68 on 27 Jun 2011 06:28:20 PM
Quote from: RobsWalker68 on 27 Jun 2011 06:18:44 PM
There is a seperate chapter called 'Modelling for Predictable Results' in the class conducted by Cognos. There should never be more than one dimension linked to multiple facts !

That would make a great Business Intelligence system !!

Lol, now I know that someone is trying to wind me up  ;D

Rob
Title: Re: Modelling Question
Post by: cognostechie on 27 Jun 2011 11:03:21 PM
I would invite the moderators to comment on this. How many FM experts agree that Cognos does not has a chapter on 'Modelling for Predictable Results' ?

Also comment on what is a Loop join in Framework terminology? People here think that if all 4 query subjects are not involved in a single SQL then it's not a Loop join !!! That's NOT what a Loop join means in FM terminology. It is still a Loop join in FM no matter what kind of SQL it generates. The bottomline is NOT to have more than ONE way to reach from Point A to Point B.

If IBM adopts that approach (as Rob mentions), I would love to get some evidence of that ! I have documentary evidence to prove that IBM does NOT support that approach. It is mentioned very clearly in that chapter that I mentioned.
Title: Re: Modelling Question
Post by: RobsWalker68 on 28 Jun 2011 03:43:30 AM
I will happily concede the point about the course chapter as I over extended the quote, although it was 1am so hopefully you will forgive me for that one.

I was intending to call out the quote: "There should never be more than one dimension linked to multiple facts !" 

That is situation that doesn't apply where you have multiple fact tables linked through conformed dimensions.

As I'm a user guide kind of guy I've reproduced the full IBM quote for clarity.  This is found in the Framework Manager Guide -> Guidelines for Modelling Meta Data -> Building a Relational Model -> Loop Joins

"Loop joins in the model are typically a source of unpredictable behavior. This does not include star schema loop joins. Note: When cardinality clearly identifies facts and dimensions, IBM Cognos 8 can automatically resolve loop joins that are caused by star schema data when you have multiple fact tables joined to a common set of dimension tables."

Now there may be a point about merging the Account and Sub Account dimension into a single entity, however the originator of the post gave no information on what the pupose of the Account/Sub Account is.  Although the names make it sound like they should be merged we cannot infer that from the information given.  For all we know Account may relate to Customers and sub account to a Chart of Accounts - who knows, not us for sure!

I have no intention to mislead people and I respect your opinion cognostechie although I do believe that in this instance you have slightly misunderstood how Cognos handles reporting between multi facts. 

At the end of the day if you want to see how Cognos handles multi fact joins that look on paper as Loop joins run it up in Cognos using the sample pacakages.

Create a report with a couple of conformed dimensions of your choice say Product & Retailer and a couple of facts say Sales & Sales Forecast and look at the generated SQL.  I am confident that you will see a stiched query and not a loop join.  You need to make sure you have fact detection enabled in your model of course

Kind Regards

Rob     











Title: Re: Modelling Question
Post by: MFGF on 28 Jun 2011 04:43:25 AM
Quote from: cognostechie on 27 Jun 2011 11:03:21 PM
I would invite the moderators and the experts from BSP to comment on this. How many FM experts agree that Cognos does not has a chapter on 'Modelling for Predictable Results' ?

BSP --- Anyone from your team attended classes at Cognos ?? Come on in, comment on this..

Also comment on what is a Loop join in Framework terminology? People here think that if all 4 query subjects are not involved in a single SQL then it's not a Loop join !!! That's NOT what a Loop join means in FM terminology. It is still a Loop join in FM no matter what kind of SQL it generates. The bottomline is NOT to have more than ONE way to reach from Point A to Point B.

and if IBM adopts that approach (as Rob mentions), I would love to get some evidence of that ! I have documentary evidence to prove that IBM does NOT support that approach. It is mentioned very clearly in that chapter that I mentioned.

There is no need to mislead people on this forum, hence I am mentioning all this as misleading other people might result in wrong results on their side.


MFGF, PaulM, Blom, Reportnet Addict, BSP experts -- Come on in !!!!! Share your suggestions !!

Well, you asked so here's my opinion (for what it's worth! :) )

Firstly, I wouldn't class myself as an expert.  I do have some experience though, and I did spend over a decade teaching Cognos classes, so I know the official documented approaches quite well (and they have changed over the years too - as you might notice if you read some of my early posts from back when this forum first started! :) ). I don't think Rob was saying that IBM/Cognos do not have an approach and documented guidelines to modelling for predictable results - simply that their guidelines do not imply that multiple facts should never link to more than one dimension.

Being purely pedantic, a transitive relationship trap (loop) is a situation where the relationships in the model result in there being more than one possible way of linking two query subjects. In a foundation layer when dealing with structures typically derived from a normalized transactional database, these can lead to unpredictable query results and blind spots in the model, and should be identified and resolved. I don't think anyone could argue against that point at all.

However, when dealing with star schemas things are rather different. In a typical star schema-modelled data warehouse there will be a single set of conformed dimensions, each one denormalized to second normal form, and each one there in order to answer descriptive questions such as "who"? "what"? "where"? "when"? "why"? etc. There will also be a series of different fact tables for different subject areas, quite possibly held at different levels of aggregation, but all linking to the conformed dimensions or, on occasion, to unconformed dimensions if those dimensions are unique to a particular area.  However, this situation where multiple facts link to conformed dimensions gives rise to transitive relationships when viewed in the diagram. These are not an issue here, though, because the query engine generates queries based on the assumption that multiple facts can link to the same common dimensions, so they do not lead to unpredictable query generation and incorrect results. Yes they are loops, but no they do not need to be resolved.  Reporting from a single fact and multiple dimensions will not bring a loop into play. Reporting from multiple facts and one or more dimensions will result in a query split (stitch query).

I have often seen models where there is a foundation layer built on top of transactional tables, and an abstract layer above that where the data has been remodelled to represent fact and dimension tables. Latest guidelines suggest that no relationships are used in the abstract layer to allow better query minimization, but many models I have encountered do have relationships in the abstract layer, including the loops we are discussing between multiple facts and multiple conformed dimensions, and they produce accurate, predictable results.

My take on this is that loops are generally bad and need resolving, unless they are loops inherent in a Kimbal-esque star schema model, in which case they are perfectly fine.

Cheers!

MF.
Title: Re: Modelling Question
Post by: Lynn on 28 Jun 2011 08:10:02 AM
I am an expert in everything and 100% correct 100% of the time  :o Anyone who believes this should contact me so I can tell you all about the Brooklyn bridge that is for sale ........

Here is a document that describes a multi-fact situation with TWO conformed dimensions (time and product):

http://publib.boulder.ibm.com/infocenter/c8fpm/v8r4m0/index.jsp?topic=/com.ibm.swg.im.cognos.ug_best.8.4.0.doc/ug_best_id1339multi-factmulti-grainquery.html

If it really were limited to just a single dimension then multi-fact querying would be virtually useless. You'd almost always need the time dimension and then you'd have no ability to assess product, or sales person, or employee, or ..... anything else. It would makes no sense to do it since very few business needs would involve multiple facts and only one dimension.

As further documentation, my course book has a section called "Identify SQL Generation in Complex Queries". Workshop 2 in this section involves an example where sales fact and returned items fact are used in a multi-fact modeling situation with both order method and time as conformed dimensions and return reason as an additional non-conformed dimension.

Title: Re: Modelling Question
Post by: cognostechie on 28 Jun 2011 11:02:13 AM
So if you had 25 dimensions and 3 facts, you would have multiple conformed dimensions (all 25 of them) or would you look into each one of them and then resolve it as you 'think' it might pose a problem?

I would rather have only one conformed dimension and use the concept of 'alias shortcut'. Yes, most busineses do have the need of more than one dimension, hence the concept of 'alias shortcut'. Not sure if everybody knows about it ! That might be leading to the belief that all the dimensions have to be linked to all the facts !

I would rather have a model that has a consistant modelling technique instead of having a few dimensions as common and others as seperate for the facts.

I understand the stitched-query generation but that is limited to the scenario when you have all the dimensions and facts on the same schema in the DB ! In the case of having dimensions in seperate schemas and Fact on only one schema, Cognos did produce wrong results and it was only resolved using alias shortcuts and having only one conformed dimension !

I would still use alias shortcuts instead of having multiple conformed dimensions. That way, I would also have a true Star Schema design in the Business Layer !
Title: Re: Modelling Question
Post by: MFGF on 28 Jun 2011 01:15:24 PM
Quote from: Lynn on 28 Jun 2011 08:10:02 AMI am an expert in everything and 100% correct 100% of the time  :o Anyone who believes this should contact me so I can tell you all about the Brooklyn bridge that is for sale ........

Ummmmmmm...... Yes. Absolutely. Ahem. Moving swiftly on... ;) :D

Quote from: cognostechie on 28 Jun 2011 11:02:13 AM
So if you had 25 dimensions and 3 facts, you would have multiple conformed dimensions (all 25 of them) or would you look into each one of them and then resolve it as you 'think' it might pose a problem?

I would rather have only one conformed dimension and use the concept of 'alias shortcut'. Yes, most busineses do have the need of more than one dimension, hence the concept of 'alias shortcut'. Not sure if everybody knows about it ! That might be leading to the belief that all the dimensions have to be linked to all the facts !

I would rather have a model that has a consistant modelling technique instead of having a few dimensions as common and others as seperate for the facts.

I understand the stitched-query generation but that is limited to the scenario when you have all the dimensions and facts on the same schema in the DB ! In the case of having dimensions in seperate schemas and Fact on only one schema, Cognos did produce wrong results and it was only resolved using alias shortcuts and having only one conformed dimension !

I would still use alias shortcuts instead of having multiple conformed dimensions. That way, I would also have a true Star Schema design in the Business Layer !

Creating alias shortcuts to a single, amassed dimension is certainly not in the best practice guidelines provided by IBM (or even going back in time to Cognos and the days of ReportNet).  It's quite a radical modelling approach, and not one I'd necessarily suggest as a normal way of modelling in Framework Manager.

I'm sure it could be made to work, but the query engine hasn't been specifically optimised for it, and it may cause big challenges to other modellers later on that have been taught the best practice approach if they have to pick up your work then figure out what has been done in this model.

A collection of 25 conformed dimensions isn't unheard of, although it's unlikely a single fact table would link to all of these. The number you end up with is driven by the business requirements for reporting and the way in which the business categorises things.

Again, these are just my own thoughs, and not intended as a guide to how things should or must be done. Just the humble opinion of an old muppet :-)

MF.
Title: Re: Modelling Question
Post by: Lynn on 28 Jun 2011 01:55:43 PM
I'm confused on a few points.

Quote from: cognostechie on 28 Jun 2011 11:02:13 AM
I understand the stitched-query generation but that is limited to the scenario when you have all the dimensions and facts on the same schema in the DB

Same schema? As in Oracle schema? Whenever I've had multiple Oracle schemas to contend with I'd have synonyms set up such that I'm working with a single data source definition and no need to query with schema name prefixing, so perhaps I haven't faced what you are describing.

Otherwise, my understanding is that you'll get a stitch query whenever you have a multi-fact query with conformed dimensions and it will be processed locally on the Cognos server, unless the database supports full outer joins, in which case it could be generated as a full outer join SQL statement and executed on the database.

Quote from: cognostechie on 28 Jun 2011 11:02:13 AM
I would still use alias shortcuts instead of having multiple conformed dimensions. That way, I would also have a true Star Schema design in the Business Layer !

I use alias shortcuts often, generally for role playing dimensions. In the case of a multi-fact modeling scenario, let's say you'd have time as the one dimension conformed to three different fact tables. Then let's say you have product as an alias shortcut, therefore three different product dimensions (one per fact table). Is that what you are describing?

If so, how do you present a product model query subject to your users? It seems instead of having a single model query subject to house the product attributes you'd need to have three of them, each one identical except referencing a different data source query subject (e.g., Product Sales, Product Returns, Product Forecast). Or else you'd need all the various product attributes baked into each of the fact model query subjects. Either way, any modeling change would need to be done three times in three places. More importantly, as a business user creating a report that ought to include product name, which one do I use and when? If I did have 25 conformed dimensions, imagine how quickly the number of attributes would multiply.

Maybe I'm misunderstanding the alternative you suggest....been known to happen! clearly I fibbed on the 100% correct comment  :-[

Anyway, this has been a very interesting discussion and I enjoy hearing everyone's thoughts! I am always interested to find out different ways to approach problems!
Title: Re: Modelling Question
Post by: cognostechie on 28 Jun 2011 02:15:11 PM
Both of you are speaking about a straight-simple model.

Best practices and normal way of modelling is a term you are probably using depending on how you define these things yourself rather than the practices suggested by IBM. I didn't find any documentation from Cognos/IBM saying that alias shortcuts are not a good approach! Can you share that document MFGF?

Creating 3 different shortcuts won't need to make the change in 3 query subjects. You just have to make the change in the original Product dimension and it will be reflected in all 3 shortcuts. They will just have different join strategy.

The Presentation Layer will look like this:

Sales Fact
    Product Dim
    Customer Dim
   

Returns Fact
    Product Dim
    Customer Dim
   

Forecast Fact
    Product Dim
    Customer Dim
   

Common Dimension
    Time Dim   


In this case, you have to first make the Product Dimension by merging all attributes of Product which might
be in 3/4 different tables.
Again, I am talking about modelling from highly normalised schemas like an ERP or CRM. Modelling from Star Schemas is a no-brainer as most of the stuff is already taken care of by the DW guys.   
Title: Re: Modelling Question
Post by: RobsWalker68 on 28 Jun 2011 02:16:18 PM
Hi Cognostechie,

I will admit I have been schooled in the Kimball method and that is how I approach all my data warehouse designs, although I am happy to accept there will be other ways of handling the modelling of an enterprise wide data warehouse. However the Kimball methodology, as adopted and implemented in BI query engines and database optimisers does give consistent results and consistent access paths.

If you adopt the Kimball triage approach to designing your data mart using the BI Bus to create an integrated enterprise wide data warehouse then you will have a fact table with measures and a series of surrogate keys that will link to your conformed dimensions.  The conformed dimension will enable you to tie the individual datamarts together so you can report across the enterprise and not create stovepipe (isolated) datamarts.

Below is a link of what to avoid in creating a datawarehouse

http://www.kimballgroup.com/html/articles_search/articles2001/0110IE.html?TrkID=IE200110

The biggest mistake being fail to conform your dimensions -

Mistake 1: Fail to conform dimensions across separate fact tables. The single most important design technique in the dimensional modeling arsenal is conforming your dimensions. If two or more fact tables have the same dimension then you must be a fanatic about making these dimensions identical or carefully chosen subsets of each other. When you conform your dimensions across fact tables, you will be able to drill across separate data sources because the constraints and row headers will mean the same thing and will match at the data level. Conformed dimensions are the secret sauce needed for building distributed data warehouses, adding unexpected new data sources to an existing warehouse, and making multiple incompatible technologies function together harmoniously.

I have used alias shortcuts or "role playing dimensions" as called in the Kimball methodology. However, they are used for a very specific modelling scenario where you wish to create a copy of an existing dimension but have it interpreted by the quert engine as a different dimension.  For instance your fact record has two date fields "shipped from date" and "shipped to date".  You can't link both fields to the same date dimension and therefore you use a role playing or alias shortcut to create two copies of your calendar that are intepreted seperately by the query engine.  It is not an approach you would implement generally to handle loop joins.

Like I say I may be old school Kimball and if your approach works for you then carry on I'm just concerned you may be creating a lot of extra modelling work that is not entirely necessary.

Kind Regards

Rob









Title: Re: Modelling Question
Post by: Lynn on 28 Jun 2011 03:13:32 PM
Quote from: cognostechie on 28 Jun 2011 02:15:11 PM
Both of you are speaking about a straight-simple model.

I WISH!!! Currently I'm modeling for an operational source and have multi-fact queries working properly. It is most certainly not limited to a simple star schema where the DW guys have already done the heavy lifting.

In my case I have relationships in the physical layer and need only a single presentation layer query subject for product that works with all facts it conforms to. Not sure if your shortcuts are for the base database tables or the Product Dim in your business layer. But I would find three different representations of the same thing a point of confusion for many users.

What is very interesting to me is how a multi-fact query might be generated in your scenario. If a query in your model used sales amount, forecast amount, Month from the Time dimension and Product Name from, say, the Sales Product Dim, then wouldn't the Month be the only one to get a coalesce?

The Product Name (no matter which of the two you picked) would be viewed as non-conformed I think. Therefore it wouldn't get a coalesce and would only be queried for the fact it relates to (sales in this example).

I'm about ready to head home for the day and my brain is addled by now, but it seems to me you'd get SQL generated that will give you one query from sales with month and product, and then another query from forecast with just month. They get full outer joined and the sum of all product forecasts for the month gets joined to the sales by product for the month. This would be wrong. I just pulled up my trusty training guide and reviewed the section "Identify non-conformed dimensions in Generated SQL".

If someone got confused and picked Product Name from Return Fact Product Dim instead of either of the other two where you've selected facts then heaven only knows what SQL would turn up!

A single fact query in your environment would work perfectly correct as far as I can see, but I'm not quite seeing how multi-fact would work for you.

Of course I don't know all the nuance you probably have going on, but remain interested by the topic :D
Title: Re: Modelling Question
Post by: cognostechie on 28 Jun 2011 03:55:48 PM
Yes, you are correct ! The SQL generated is crappy, the data in the report is wrong and I have been fired from the project !!  ;D

Anyway, I guess the OPs post was really a very simple model and hence all this is not required here ! Rob, you were probably correct and I went too far to analyse things.
Title: Re: Modelling Question
Post by: jive on 28 Jun 2011 03:59:40 PM
Hi,

Just my experience,
In the projet I work now we have 96  dimensions on those dimensions 68 are conform business wise dimension all the alias (35) are done thru materialized view in the oracle database and the other , 28, are restrict to certain fact table. Most of the 28 are needed more for description-information on operational behavior then real business analysis.

With that model, for the moment, I can resolve all the requirement in small effective and reporting package. 

I can produced one package with everything , all fact and conform dimension related. OK may be it's long to do crosstab with 10 level on each side.. but at least I can do it ;-) and validate my result with an SQL query directly on the database.

With those 68 conforms dimension I can scroll around 10 to 20 fact table by theme , it's around 110-120 measure and explain close to 95% of the business.

I have define at this moment 40 packages including or removing dimension and fact based on the business requirement. To be honnest, the important thing: that's work an it's work really well ,fast and reliable, even SAS , can connect directly to the datawarehouse source table and get the same result as Cognos for deeper statistical analysis.

Sorry to don't know more about loop chain query and stichy query ... But about conform dimension I used then a lot  :P

Regards Jacques   ;D
Title: Re: Modelling Question
Post by: RobsWalker68 on 28 Jun 2011 04:16:52 PM
Thanks CT for the reply and it has been interesting discussing things with you.  Back to watching series 6 of the West Wing - Told you I was old school  ;D

There are many ways to skin the same cat

All the best

Rob
Title: Re: Modelling Question
Post by: Lynn on 28 Jun 2011 04:42:23 PM
Quote from: cognostechie on 28 Jun 2011 03:55:48 PM
Yes, you are correct ! The SQL generated is crappy, the data in the report is wrong and I have been fired from the project !!  ;D

Anyway, I guess the OPs post was really a very simple model and hence all this is not required here ! Rob, you were probably correct and I went too far to analyse things.

I don't mean to be annoying or irritating....it just comes naturally....I don't even have to try  ;)

I only wanted to understand and I hope you didn't take offense! I learn so much from all the very smart people here and constantly challenge my own assumptions and understanding when I hear about different approaches.

There are certainly many ways to skin the cat as Rob says.

That is an awful expression if you think about it and I'm not even all that crazy about cats!  :o
Title: Re: Modelling Question
Post by: cognostechie on 28 Jun 2011 04:52:19 PM
Quote from: RobsWalker68 on 28 Jun 2011 04:16:52 PM
Thanks CT for the reply and it has been interesting discussing things with you.  Back to watching series 6 of the West Wing - Told you I was old school  ;D

There are many ways to skin the same cat

All the best

Rob

Yup ! It was definately interesting ! If you ever have to model using an ERP as the backend, just try my method. It will make the model more organised even if it's additional work and most importantly, it will eliminate those stitched queries with full outer joins so you don't have to take a nap till the report comes up. Querying ERP schemas that have so much of normalization and indexing not done for reporting purpose makes full outer joins very slow. 
Title: Re: Modelling Question
Post by: cognostechie on 28 Jun 2011 04:53:36 PM
Quote from: Lynn on 28 Jun 2011 04:42:23 PM
I don't mean to be annoying or irritating....it just comes naturally....I don't even have to try  ;)

I only wanted to understand and I hope you didn't take offense! I learn so much from all the very smart people here and constantly challenge my own assumptions and understanding when I hear about different approaches.

There are certainly many ways to skin the cat as Rob says.

That is an awful expression if you think about it and I'm not even all that crazy about cats!  :o

No worries ! I intentionally started it as I was getting bored  ;)

Just kidding..
Title: Re: Modelling Question
Post by: RobsWalker68 on 28 Jun 2011 05:16:52 PM
Quote from: cognostechie on 28 Jun 2011 04:52:19 PM
Yup ! It was definately interesting ! If you ever have to model using an ERP as the backend, just try my method. It will make the model more organised even if it's additional work and most importantly, it will eliminate those stitched queries with full outer joins so you don't have to take a nap till the report comes up. Querying ERP schemas that have so much of normalization and indexing not done for reporting purpose makes full outer joins very slow.

Your going to jinx me now and my next contract is going to be working on some damn awlful SAP ERP system  ;)
Title: Re: Modelling Question
Post by: MFGF on 29 Jun 2011 05:02:11 AM
This has been a very interesting discussion, and I have really enjoyed reading the different views and seeing things from other perspectives.  It's one of the great things about forums like this that folks can kick ideas about without fear of ridicule and air views based on real experience rather than just what the manual or (ex) class instructor says.  Reading threads like this one reminds me that there are some very, very smart people on Cognoise, and I feel extremely privileged to be a part of this great community!

Lynn, CT, Rob, Jive - you have my complete admiration!

MF.
Title: Re: Modelling Question
Post by: PRIT AMRIT on 29 Jun 2011 10:08:31 PM
I am very much agree with MFGF.

In short, this forum has been a knowledge treasury for me and I am very thankful to all of you sharing your ideas and experience without any hesitation.

THANK YOU ALL.

Prit