If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Invoice Amt and Invoice Date in which table?

Started by Steve, 08 Sep 2017 01:32:55 PM

Previous topic - Next topic

Steve

I have a scenario where I have two tables. Invoice_Header ad Invoice_Line. The Invoice Date and Invoice Amt (total amt of all lines) is in Invoice_Header table. My question is - should we have the Invoice_Date in Invoice_Lines as the join in Framework should be between Dimension and Facts so the Date dimension should be joined to Invoice_Line?

What about Invoice_Amt? Should it be in the Header or just remove it from the Header? The Line amount is in the Invoice_Line so the total of all the lines would be the Invoice Amt.

Thanks

bdbits

You should not store the amount on Invoice Header if that is a dimension table.

Typically, measures (numbers like Invoice Amt) belong in fact tables, along with foreign keys to relevant dims. So a typical design would have
* Invoice Line - fact
* Invoice Header - dim - foreign key in Invoice Line fact
* Date - dim - foreign key in Invoice Line fact
* Customer or whatever dims you need as foreign keys in Invoice Line fact

If you want/need rolled up invoice amounts at the header level perhaps for performance, make a separate fact. You could call it Invoice Summary or something similar. You would put in your aggregates like Invoice Amount, maybe a count of lines, etc. Then link in the appropriate dims for invoice date, header info, customer, etc.

Steve

#2
Thanks for the answer. It does make sense. Now let me ask you this. Where would you put the Invoice Date? That's at the header level and if I put it in D_Invoice then it will be joined to an alias of a Date dimension which will also be a dimension so that will make it a Dim to Dim join.

Also, where would I put Line Description? There is a field in Invoice_Line called 'Description'.


Invisi

In the invoice and invoice line scenario, all dimensions originally belonging to the 'header' part, are modelled into your detail level. So the Invoice Date dimension is connected to your invoice line fact. I can recommend 'Star Schema' by Christopher Adamson. I think it's a great book about dimensional modelling and this very scenario is described there in the section about degenerate dimensions.
Few can be done on Cognos | RTFM for those who ask basic questions...

bdbits

Yep, what Invisi said. You pull dimensional links like dates down into to the line level fact.

Your line description may fall into a degenerate dimension as he said. But some tools are ultra-strict on dimensional modeling and will not work with them in a fact table, so sometimes you have to pull these attributes off into a junk dimension. So it ends up as a "depends" situation.

If you are somewhat new to all this and like books, Ralph Kimball and his associates wrote a bunch of books on various warehousing things, including the classic "The Data Warehouse Toolkit". I don't always agree with what he does, but it is a fine place to start. https://www.amazon.com/Ralph-Kimball/e/B000AP5ULY/ref=dp_byline_cont_pop_book_1
Also, the Kimball Group all retired but there is still a lot of good material at http://www.kimballgroup.com/category/articles-design-tips/.

Steve

#5
Quote from: bdbits on 11 Sep 2017 12:35:12 PM
You should not store the amount on Invoice Header if that is a dimension table.

Typically, measures (numbers like Invoice Amt) belong in fact tables, along with foreign keys to relevant dims. So a typical design would have
* Invoice Line - fact
* Invoice Header - dim - foreign key in Invoice Line fact
* Date - dim - foreign key in Invoice Line fact
* Customer or whatever dims you need as foreign keys in Invoice Line fact

If you want/need rolled up invoice amounts at the header level perhaps for performance, make a separate fact. You could call it Invoice Summary or something similar. You would put in your aggregates like Invoice Amount, maybe a count of lines, etc. Then link in the appropriate dims for invoice date, header info, customer, etc.

Ok, I will give you my scenario to explain what is causing the confusion. The conformed dimension modelling says that if you join two facts to a common dimension then two queries will be generated and then stitched together with a full outer join. So this means that if a filter is applied to query 1 then query 2 may still fetch all data because the filter did not get applied to query 2. Let's take an example:

Data Items:

Shipping Amt - at the Invoice Header level
Sale Amt    - at the Invoice Line level
Invoice date - at the Invoice Header level
Invoice Number - at the Invoice Header level
Invoice ID - at the Header Level but also present in Invoice Line

Now  we should create the following tables:

Invoice Dim - Invoice ID
Invoice Fact at Line level - Invoice ID, Invoice Date, Invoice Number, Sale Amt
Invoice Fact  at summary level - Invoice ID, Shipping Amt

Date Dim - joined to Invoice Fact at line level. so the joins will be like this:

                   
            ------------------------------                     ----------------                   -----------------------
            | Invoice Fact (Summary) |                    | Invoice Dim |                  | Invoice Fact (Line) |
            | --------------------------- |                    |-------------- |                  ---------------------   |
            |          Invoice ID            | -------------- |   Invoice ID |------------- |    Invoice ID           |
            |          Shipping Amt       |  Invoice ID    |                   | Invoice ID |   Invoice Number    |
            |                                     |                    |                   |                 |   Sale Amt              |
            ------------------------------                     ----------------                  |   Invoice Date Key  |
                                                                                                               ------------------------
                                                                                                                           |
                                                                                                                           |   Invoice Date Key = Date Key
                                                                                                                 ------------------
                                                                                                                 |    Date Dim    |
                                                                                                                 ------------------
                                                                                                                 |   Date Key      |
                                                                                                                 |   Date            |
                                                                                                                  -----------------

Now with this join , the Invoice Date presented to the user in the Business Layer will be the Date column coming from the Date Dim. So if the user makes a report to see the Sale Amt and the Shipping Amt for the last 7 days then he is going to apply a filter to the Invoice Date.

All joins are inner and there will be 2 queries generated:

Query 1 -  Invoice Dim, Invoice Fact (Line) and Date Dim
Query 2 - Invoice Dim and Invoice Fact (Summary)

The filter gets applied to Query 1 so it will show only invoices for the last 7 days and the Sale Amt of those Invoices. So far so good
However, Query 2 does not get filtered so it fetches all Invoices

Query 3 which is the outer query containing data from Query 1 and Query 2 with a full outer join shows Shipping Amt for all Invoices, not just last 7 days.

So what to do in this scenario?

bdbits

Rather than my ramblings, I'll direct you to a best practices document dealing with stitch queries. It's a bit older but I think everything is still applicable.
https://www.ibm.com/developerworks/data/library/cognos/reporting/advanced_report_design/page605.html

Steve

Thanks, although most people know it anyway. That document is about two facts but I was talking about having measures in header and Fact.

I did find the answer to it though !

bdbits

Measures should always be in fact tables.

Glad you got it working.

Invisi

Consider your issue is not about conformed dimensions and stitching, but about fact grain. If you make one fact with appropriate dimensions, your issue is gone. Conformed dimensions or stitching come into play when you want to compare real different stars. Think of invoicing and ordering. Or invoicing and sales.
Few can be done on Cognos | RTFM for those who ask basic questions...