If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Joins in Database or Logical layer?

Started by Invisi, 03 Feb 2017 03:04:57 AM

Previous topic - Next topic

Invisi

I am rather confused about in which layer to put joins. I used to do it in the Logical layer. Then one client did it in the Database layer. When I investigated it then, I found an article from one of the more well known US companies that offers Cognos services, that they indeed say to put the joins (and so all the duplicates of the Date dimension and other multiple dimensions) in the database layer. Now I am starting over with a framework at a new client, so I check again. And I find another article that says again to put the joins in the Logical layer. I am now restarting with joins (and duplicate dimensions) in the Logical layer.

Who uses which variants and why?
Few can be done on Cognos | RTFM for those who ask basic questions...

MFGF

Quote from: Invisi on 03 Feb 2017 03:04:57 AM
I am rather confused about in which layer to put joins. I used to do it in the Logical layer. Then one client did it in the Database layer. When I investigated it then, I found an article from one of the more well known US companies that offers Cognos services, that they indeed say to put the joins (and so all the duplicates of the Date dimension and other multiple dimensions) in the database layer. Now I am starting over with a framework at a new client, so I check again. And I find another article that says again to put the joins in the Logical layer. I am now restarting with joins (and duplicate dimensions) in the Logical layer.

Who uses which variants and why?

Hi,

Best practice is to add the joins in the database layer, between the data source query subjects, if you can. If you add a join to a model query subject in the logical layer, and that model query subject spans multiple underlying data source query subjects, you reduce query minimisation within your reports - an item brought in from that query subject will always result in a query to all of the underlying query subjects feeding into it.

Just my humble opinion :)

MF.
Meep!

Michael75

@ Invisi
Excellent question!

I've wondered about this, having noticed that expert sites or bloggers seem to be pretty well split between the two approaches  ???  Personally, I've used both, at different sites, and haven't found strongly in favour of either.

Two factors that may influence your choice... or maybe factoids rather than factors, as I'm not 100% sure of them:

1. If you use determinants, relationships must be set in the same layer as that in which determinants are set.

2. If you set relationships in the database layer, they can be overridden by others that are set in the logical / business layer. In other words, it's the latter which are used in the SQL generation. Which means that if you are looking at the database layer in FM, you may (wrongly) conclude that the relationships you see are those which are going to be used.

I welcome all corrections to the above assertions  8)

Invisi

MFGF, are you talking about snowflaking? I don't snowflake as a rule. I also have no bridge tables between fact tables and regular dimensions in any way.
Few can be done on Cognos | RTFM for those who ask basic questions...

MFGF

Quote from: Invisi on 03 Feb 2017 08:19:49 AM
MFGF, are you talking about snowflaking? I don't snowflake as a rule. I also have no bridge tables between fact tables and regular dimensions in any way.

Not specifically, but snowflake dimensions would fit the paradigm. You want items from all the flake tables in a single dimension query subject, so your database layer has (eg) three data source query subjects - one for each flake table - and your logical layer has one model query subject containing items from all three data source query subjects. Obviously the three would need joins between them in the database layer (else you'd get cross-join errors when trying to bring in items to your one logical layer query subject). However, if you join the model query subject in the logical layer to something else (eg a fact query subject in the logical layer) you then force Cognos to issue a query spanning all three tables whenever you reference an item from the dimension in the logical layer. If you don't add joins in the logical layer, the query is minimised as much as possible, so if only one of the three tables needs to be read, this is what happens. If your model is over a pure star schema and your logical layer query subjects correspond exactly with your database layer query subjects, it's not really going to make any difference where you add the joins, apart from not adhering to the purity of the best practice guidelines if you put them in the logical layer :)

MF.
Meep!

Invisi

But I see conflicting best practise guidelines!!!!!  >:( :-\

So basically the quality of the data warehouse is a factor in where to put the joins?
Few can be done on Cognos | RTFM for those who ask basic questions...

MFGF

Quote from: Invisi on 09 Feb 2017 03:26:22 AM
But I see conflicting best practise guidelines!!!!!  >:( :-\

So basically the quality of the data warehouse is a factor in where to put the joins?

Nope. No conflict. Put the joins in the database layer. Works for all situations :)

MF.
Meep!

rockytopmark

This is one of those topics that can spawn good debate.  While there are a few different strategies available, I don't think 1 is necessarily the best for 100% of the crowd, nor are the others "wrong" per se... but certain guidelines should be more the rule:


  • Relationships in 1 and only 1 layer
  • Determinants in 1 and only 1 layer, same as relationships

Back in 2008, an IBM Global Services consultant shared a Best Practices (INTERNAL EYES ONLY) document that described an approach I adopted after reading it.

Database Layer

  • Datasource Query Subjects
  • Select * From ... only

Modeling Layer

  • Model Query Subjects built from Objects in Database Layer
  • Business Term renaming**
  • Role Playing Dimensions
  • Relationships
  • Determinants

Dimensional Layer  - (Optional... only if doing DMR)

  • Regular Dimensions built from Modeling Layer
  • Measure Dimensions built from Modeling Layer

Business (Presentation) Layer  - (Optional)

  • Logical groupings of objects
  • Reference Shortcuts only


** I have done renaming in Database Layer as well... I don't recall whether I wound up regretting that decision or not.  Where you have many Role playing dimensions such as Date, it might make the naming in Modeling layer easier to have done the main renaming in Database Layer first.

Just my $0.02

Invisi

See, in this best practise, I join in the logical layer.
Few can be done on Cognos | RTFM for those who ask basic questions...

MFGF

Quote from: Invisi on 09 Feb 2017 08:24:44 AM
See, in this best practise, I join in the logical layer.

My earnest advice is not to. This can lead to reduced query minimization (and therefore less efficient queries).

Just my tuppence :)

MF.
Meep!

Lynn

Quote from: MFGF on 09 Feb 2017 08:29:59 AM
My earnest advice is not to. This can lead to reduced query minimization (and therefore less efficient queries).

Just my tuppence :)

MF.

I also use the database layer for relationships. If you do joins in the business layer and your model query subjects combine elements from multiple data source query subjects you will not get minimized SQL.

Invisi

Lynn, not Business Layer, but Logical Layer.
Few can be done on Cognos | RTFM for those who ask basic questions...

Invisi

MFGF, which query minimization can I expect, provided I don't snowflake? And you say it's good to create 4-7 (Framework) copies of the Date dimension in the Database Layer? That sounds like something for Logical Layer to me.
Few can be done on Cognos | RTFM for those who ask basic questions...

MFGF

Quote from: Invisi on 13 Feb 2017 08:09:03 AM
MFGF, which query minimization can I expect, provided I don't snowflake? And you say it's good to create 4-7 (Framework) copies of the Date dimension in the Database Layer? That sounds like something for Logical Layer to me.

If you are building a model over star schema tables, then it's a moot point, as you will have a one-to-one correlation between the data source query subjects in your database layer and the model query subjects in your logical layer. In this specific situation you can "get away with" putting the relationships in the logical layer without there being a difference in the queries generated vs if you put the relationships in the database layer. However, this is a very specific (and hugely desirable) scenario. You asked initially about "best practice". In almost every other modelling situation, you will have model query subjects in your logical layer that reference more than one data source query subject in the database layer. In this situation, it can make a big difference - if you put the relationships in the database layer, you end up with much better minimized (and therefore efficient) queries than if you put the relationships in the logical layer. Here's a recap:

- If you create a model over pure star schemas, you can put the relationships in either layer
- If you create a model over anything other than pure star schemas you should put the relationships in the database layer.

Since putting the relationships in the database layer works in both scenarios, best practice is always to do this, then you don't have to worry about it. :)

For "copies" of the date dimension, best practice is to create shortcuts to the date dimension data source query subject in the database layer, and join these to the desired fact query subjects based on the relevant key. Then in the logical layer, create new model query subjects over the shortcuts, and rename the items as desired in this layer. In modelling this way, there is only one data source query subject for the date dimension (and therefore only one place to update if it changes), so you reduce the modelling overhead of having to maintain multiple versions.

Cheers!

MF.
Meep!

Invisi

So what's the use of the logical layer then?
Few can be done on Cognos | RTFM for those who ask basic questions...

MFGF

Quote from: Invisi on 17 Aug 2017 04:33:52 AM
So what's the use of the logical layer then?

It's used to add calculations and filters, to re-arrange structures to make things more intuitive, to change names to be more friendly etc.

MF.
Meep!

Invisi

I don't see the point of 2 layers for that (Logical and Business) when relations are already in the database layer.

What is left to the business layer in this case?
Few can be done on Cognos | RTFM for those who ask basic questions...

MFGF

Quote from: Invisi on 17 Aug 2017 06:39:38 AM
I don't see the point of 2 layers for that (Logical and Business) when relations are already in the database layer.

What is left to the business layer in this case?

I don't follow? I'd have a database layer, and a logical/business layer. The database layer has the relationships. The logical/business layer has the calcs, filters, friendly names, friendly structures.

MF.
Meep!

satyajit.igate@gmail.com

MFGF explained nicely and very clearly. But you came to the initial level of thread again by putting the same question. Logical layer is for creating model query subjects using multiple query subjects from physical/DB layer.  :)

Michael75

I think the confusion arises from a problem of terminology. Logical & Business are two terms meaning the same thing - the second FM layer - just as Database and Foundation can both be used to describe the first FM layer.

Invisi

Quote from: MFGF on 17 Aug 2017 07:41:22 AM
I don't follow? I'd have a database layer, and a logical/business layer. The database layer has the relationships. The logical/business layer has the calcs, filters, friendly names, friendly structures.

MF.

If I read you here, you use only 2 layers; database and logical/business. Right?
Few can be done on Cognos | RTFM for those who ask basic questions...

MFGF

Quote from: Invisi on 21 Aug 2017 04:29:50 AM
If I read you here, you use only 2 layers; database and logical/business. Right?

Yes. You can add a third layer comprising star schema grouping shortcuts if you wish - this might make selections for packages easier. You wouldn't do any true modelling in this layer, though.

It's also common to have a fourth layer if you are creating DMR regular dimensions and measure dimensions :)

MF.
Meep!