
FM Question

Started by ghostraider, 03 Nov 2008 06:18:45 PM


ghostraider

Hi all, I have three tables: Vendor, Serial, and Vendor Account. Vendor is left outer joined to both Serial and Vendor Account. In Impromptu, when I query these three tables for a list report, I get the SQL below.

select
  T1."ISSUE_ID" "c1",
  T1."VENDOR_ID" "c2",
  T2."VENDOR_TYPE" "c3",
  T2."VENDOR_CODE" "c4",
  T3."VENDOR_ID" "c5"
from ("DB"."SERIAL_CLAIM" T1
  LEFT OUTER JOIN "DB"."VENDOR" T2 on T1."VENDOR_ID" = T2."VENDOR_ID")
  LEFT OUTER JOIN "DB"."VENDOR_ACCOUNT" T3 on T1."VENDOR_ID" = T3."VENDOR_ID"

However, in Cognos I get a stitched query with two individual select statements. How can I get Framework Manager to generate SQL similar to Impromptu's? Is star schema modelling the only option, or is there another way? I recently saw a third-party package that generates SQL just like Impromptu, but I am not able to figure out how its modelling was done and I don't have access to the model. Please let me know how to approach this issue.

Thanks much for your time.

blom0344

You can 'trick' FM / Cognos 8 into NOT generating a stitch query by changing the cardinality of the joins (even though you are aware that you are indicating a 'wrong' cardinality). I'm not saying this is proper modelling (it is not).
The alternative is to define a custom SQL object that mimics the Impromptu SQL (a sketch follows below).

Once again, these are not preferred tactics, just 'possible' remedies.
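For instance, a minimal sketch of what that SQL query subject could contain, simply reusing the statement from the first post (the "c1"..."c5" aliases are replaced with readable ones; adjust to your own data source):

-- hand-written SQL query subject that forces the single-pass outer joins
select
  T1."ISSUE_ID" as ISSUE_ID,
  T1."VENDOR_ID" as VENDOR_ID,
  T2."VENDOR_TYPE" as VENDOR_TYPE,
  T2."VENDOR_CODE" as VENDOR_CODE,
  T3."VENDOR_ID" as ACCOUNT_VENDOR_ID
from "DB"."SERIAL_CLAIM" T1
  LEFT OUTER JOIN "DB"."VENDOR" T2 on T1."VENDOR_ID" = T2."VENDOR_ID"
  LEFT OUTER JOIN "DB"."VENDOR_ACCOUNT" T3 on T1."VENDOR_ID" = T3."VENDOR_ID"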

wyconian

Sounds like you've got an issue with the model, but since you don't have access to it there is another way, though it's a bit of a pain in the neck.

Assuming you're using Report Studio, create three separate queries that bring back the data items you want from the three tables (along with something you can join them on, e.g. vendor_id).

You can then join the queries together any way you want on the Queries page of your report (you can only join two queries at a time, but it can be done).

The list on the report page should point to the final query where you have everything joined together.

Like I said, it's not an elegant solution, and you should really get whoever designed the model to resolve the issue.

ghostraider

Thanks blom0344 and wyconian, appreciate your responses. wyconian, if I am going to use queries to join at the report level, won't it take a long time for the joins to be performed? And since the queries have to bring in the data individually and join it on the Cognos server, won't my temp space fill up if the report is huge? We are currently doing these kinds of joins, and my temp space fills up, using about 10 GB, and then empties in no time once the report has run.

blom0344, if I have to trick FM, I think it has to be an inner join 1..n that I can change to 1..1 so that there are no stitched queries in the report when more than one 1..n table is used. But in the case of 0..1 or 0..n, I am not sure how to trick FM or avoid the full outer joins in the Cognos SQL.

blom0344

Quote from: ghostraider on 04 Nov 2008 11:05:50 PM

blom0344, if I have to trick FM, I think it has to be an inner join 1..n that I can change to 1..1 so that there are no stitched queries in the report when more than one 1..n table is used. But in the case of 0..1 or 0..n, I am not sure how to trick FM or avoid the full outer joins in the Cognos SQL.

When designing joins in FM you also specify the cardinality. Cognos assumes that the n side of a 1:n join is the fact, and if two facts converge on a dimension (n:1:n), Cognos will try to resolve the situation by splitting the query into two separate ones and rejoining them on the server (a stitch query).
This involves 1. a full outer join and 2. a coalesce.
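Schematically, the stitched SQL looks roughly like this (just the shape, reusing table names from the first post; the statement Cognos actually generates will differ):

-- two separate fact queries rejoined on the server: FULL OUTER JOIN plus coalesce
select
  coalesce(D1."VENDOR_ID", D2."VENDOR_ID") "VENDOR_ID",
  D1."ISSUE_ID"
from
  (select T1."VENDOR_ID", T1."ISSUE_ID" from "DB"."SERIAL_CLAIM" T1) D1
  FULL OUTER JOIN
  (select T3."VENDOR_ID" from "DB"."VENDOR_ACCOUNT" T3) D2
  on D1."VENDOR_ID" = D2."VENDOR_ID"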
However, if you are in a hurry and need this on a temporary basis, you may just as well write the SQL involved and place it in a SQL object.

Vikram_Mitta

Hi,

I am facing a similar kind of problem, where the queries are getting separated and generating a stitched query; luckily, I have access to the FM model.
Can anyone suggest a possible solution for the stitched query?

Thanks,
Viky

wyconian

Yeah, you may well run out of temp space, but you could try adding a filter to the queries before you join them. This is a 'dirty fix', as is adding a SQL statement to the report, but it may provide a quick and nasty solution.

Obviously the best place to fix this issue is in the FM model or the ETL.

It sounds to me like you have a number of issues with the model; e.g. I would prefer not to see any outer joins. I would suggest the ETL be updated so that every dimension has a default static member for the fact to join to, instead of needing an outer join.

I would also suggest you look at the model and the cardinality. Joining facts together is a bad idea (even dimensions that FM treats as facts, i.e. where all the incoming cardinalities are n). You may need to look at the data in the dimensions so that you can legitimately change the cardinality and have the dimensions act as dimensions; it may be possible to do this by combining some dimensions or adding a bridging table. It's quite a bit of work; it can be done in FM but would be better in the ETL.

Good luck

ghostraider

Thanks much Wyconian and Blom0344 for your inputs.

I have about 250 tables from the relational database. They are right now modelled based on the ER diagram, with no star schema method. I have an import layer containing all the tables from the database as they are. Joins are created in this layer based on the ER diagram; however, there are a lot of loops. For example, the Component table is joined with Component Pattern, Component Pattern is joined with Claims, and Component is also joined directly with Claims, thus forming the loop A=B=C=A. I can create a shortcut for Claims and break the loop, but I have more tables that are joined to Claims. For example, I have Component Routing, which is also joined to the Component and Claims tables. So should I create another shortcut for the Claims table to make this join? I have several loops like this, and I think the model will become quite huge if I continue to add shortcuts.

Assuming these are the only tables in the model, how do I create a star schema model? Please let me know. Thank you!!

wyconian

Hi

Ideally the star schema should be based around a fact table, i.e. a table with something you want to aggregate, e.g. profit, sales, etc. Then you would join the dimension tables to the fact. At the moment I'm working on a warehouse that doesn't have any facts, so I've had to create a factless fact table to join the dimensions to (it is basically a table full of keys; see the sketch below). The thing to bear in mind here is that C8 will only treat a table as a fact if all the joins into it are 1:n.
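For illustration, a minimal sketch of such a factless fact table (all names here are hypothetical; it holds nothing but the dimension keys):

-- hypothetical factless fact: one row per key combination, no measures
create table FACT_VENDOR_EVENT (
  VENDOR_KEY integer not null,  -- surrogate key of the vendor dimension
  CLAIM_KEY  integer not null,  -- surrogate key of the claim dimension
  DATE_KEY   integer not null   -- surrogate key of the date dimension
);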

Personally I don't like using shortcuts; I would rather use alias tables to break the loops. I'd suggest you add an intermediate layer. That would give you a source layer, which is just direct copies of the tables in your warehouse without any joins; an intermediate or enterprise layer, where you can have as many aliases of tables as you need to break the loops, along with derivations, joins, etc.; and then a nice user-friendly presentation layer, which the users will report from.

You're right that your model may become very big in order to break the loops, but that's quite normal and shouldn't be a problem, though of course the more tables you have, the more complicated maintenance will be.

250 tables is a lot to have in one model. Can they be broken down into query subject areas? If they can, you can have a number of smaller packages organised in folders, which will make it easier for your users to report against.

blom0344

Quote from: ghostraider on 12 Nov 2008 09:58:17 PM
Thanks much Wyconian and Blom0344 for your inputs.

I have about 250 tables from the relational database. They are right now modelled based on the ER diagram, with no star schema method. I have an import layer containing all the tables from the database as they are. Joins are created in this layer based on the ER diagram; however, there are a lot of loops. For example, the Component table is joined with Component Pattern, Component Pattern is joined with Claims, and Component is also joined directly with Claims, thus forming the loop A=B=C=A. I can create a shortcut for Claims and break the loop, but I have more tables that are joined to Claims. For example, I have Component Routing, which is also joined to the Component and Claims tables. So should I create another shortcut for the Claims table to make this join? I have several loops like this, and I think the model will become quite huge if I continue to add shortcuts.

Assuming these are the only tables in the model, how do I create a star schema model? Please let me know. Thank you!!

ghost,

I think the collective experience with Cognos 8 modelling is that it is less permissive with regard to truly relational models (as opposed to, say, older tools like Cognos 7 / BusObj). You are really expected to create models that involve star schemas (though in some cases snowflakes will work if you abstain from multi-fact queries).
We work with less-than-ideal models as well, and I have witnessed issues that never occurred with older tools, like adding a single table to a model and having totally different SQL generated for an existing report that had no involvement at all with the addition.
The key lies in isolating which relational tables may be regarded as storing facts / transactions and building your submodels around them. For a relational model this may mean that the same table could play a different role between models, hence a proliferation of shortcuts or aliases.
My experience is that aliases keep the overall model a bit more transparent than shortcuts, so I second wyconian there.

ghostraider

Wyconian and Blom,

I just checked my Impromptu catalog, which was working right before we moved to Cognos 8.2. I checked the joins in Impromptu, and when I analyzed them Impromptu told me there are numerous loops. The general idea with so many loops is that Impromptu doesn't know which path to take; however, our catalog with 250 tables seems to work in spite of all these loops. I am not sure how.

Wyconian, I have asked my users if they would like packages based on their business area, but they think they might use any table out of the 250 and hence need only one package.

Blom, I was talking to my friend who works on Business Objects, and he tells me there is a concept called 'context' which can be used to resolve loops instead of using aliases or shortcuts. Do you think there is something like that in Impromptu as well? I have checked the Impromptu documentation and didn't find any hints.

blom0344

Quote from: ghostraider on 14 Nov 2008 10:34:44 AM

Blom, I was talking to my friend who works on Business Objects, and he tells me there is a concept called 'context' which can be used to resolve loops instead of using aliases or shortcuts. Do you think there is something like that in Impromptu as well? I have checked the Impromptu documentation and didn't find any hints.

Business Objects uses the context mechanism to cope with multi-fact queries. It can resolve the many-to-one-to-many issue, but let me give you a nice link (also for your friend) to an eloquent treatment of the subject:

http://www.dagira.com/2008/03/10/what-is-a-context-anyway/#more-70

The multi-fact issue is extensively covered in Cognos 8, but I have my suspicions about whether Impromptu was that sophisticated. I have always imagined it as inferior to the BO client in that respect. My experience with Impromptu is too limited, though.

[being able to cope with multi-facts is one of the properties that distinguishes a premium BI tool from less sophisticated tools]

wyconian

Hi

I would still suggest that trying to build a model based on 250 tables is going to be difficult, to say the least. Is there any way of getting a better list of requirements from your users? For example, could the requirements be divided up by department?

A quick word on Impromptu. As its catalogs were based on a relational database, Impromptu would only try to create data paths based on the tables you used in the query, so it didn't matter too much that the catalog had loops. You could also add weightings to the tables, which would suggest the best path.

As far as I’m aware you can’t do that in FM and have to come up with a way to resolve the loops.

ghostraider

Blom and Wyconian, thanks much for your time.

I have convinced my users that the current model is not efficient enough and has various join problems. I have proposed breaking the model down into smaller ones based on the functionality of the system and publishing packages accordingly. The total number of tables in the Impromptu catalog is 300, it works very well, and my job is to make a working model in Cognos 8.2. So how do I model these 300 tables from the OLTP system? Even if I break the model down based on system functionality, I would have about 150 tables in one of the models. How should I go about this? Should I build a data mart using star schema methodology so that I have all the business functionality in one package? The con here is that my users want to report on live data; there is a high chance they won't agree to it.

Below are the first couple of steps I would take if I have to model based on the OLTP system.

1) Import the required tables from the data source into the Physical Layer namespace.
2) Create joins based on the ER diagram or as per user requirements.
3) Create a new namespace and call it Logical Layer.
4) Can I merge query subjects having the same business idea in this layer (e.g. Vendor, Vendor Address, Vendor Account)? Can these three tables be merged, and once they are merged, should I create virtual star schemas here? If I do create virtual star schemas, what about surrogate keys for the dimensions? Should I just join dimensions and facts on natural keys? And should I create a time dimension in my database? How should I go about this? Please let me know.

Thanks a lot!!




wyconian

Hi

Sounds like you've got a plan; basing the packages on business functionality is definitely the way I would approach things (at least to begin with).

The big question here is are you using data manager or framework manager?

If you use Data Manager you can create a number of star schemas (possibly one for each business area), each based on a fact table joined to a number of dimensions. You can merge tables as you need (e.g. vendor, vendor address, vendor account, etc. could be merged into a vendor dimension; see the sketch below), so you will end up with a much smaller number of dimension tables than there are in the source (Kimball would suggest 10-15 dimensions for each fact, but you can have as many as you need). You can also build hierarchies, which will give your users the ability to drill up/down. You can have conformed dimensions, which are dimension tables that can be used by a number of fact tables, so if you need to you can report across business areas.
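As a rough sketch of the merge (VENDOR_TYPE and VENDOR_CODE come from the posts above; VENDOR_ADDRESS, ADDRESS_LINE and ACCOUNT_NO are hypothetical names; in Data Manager this would be a dimension build rather than a hand-written view):

-- hypothetical merged vendor dimension built from three source tables
create view DIM_VENDOR as
select
  v."VENDOR_ID",
  v."VENDOR_TYPE",
  v."VENDOR_CODE",
  a."ADDRESS_LINE",  -- hypothetical column from the vendor address table
  ac."ACCOUNT_NO"    -- hypothetical column from the vendor account table
from "DB"."VENDOR" v
  LEFT OUTER JOIN "DB"."VENDOR_ADDRESS" a on v."VENDOR_ID" = a."VENDOR_ID"
  LEFT OUTER JOIN "DB"."VENDOR_ACCOUNT" ac on v."VENDOR_ID" = ac."VENDOR_ID"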

DM will generate surrogate keys for you, and depending on what kind of build you have, you can append new data to the tables rather than rebuilding them every time.

The downside to this approach is that the data won't be live, but you can schedule the job streams to run every night, so the data will be correct as of the end of the previous business day. If your users insist on having live data, then this probably won't work.

If you only have access to Framework Manager, you can do pretty much the same kind of thing, but as you're only generating metadata rather than populating database tables, you may take a performance hit if you try to create dimension tables from relational DB tables.

My suggestion would be to create a number of conformed data marts in Data Manager. It will probably take longer to develop, but there should be definite performance benefits, and it will also ensure that the reports are predictable (i.e. you won't get two users running slightly different queries and getting similar but different results).

Seeing as I'm an ETL developer, I could be biased about this.

Sounds like you've got a lot of work ahead of you though.

ghostraider

Wyconian, we do not have Data Manager in our environment, so I will have to do everything using Framework Manager only. And yes, I do believe there is going to be a performance impact, and also a lot of work for me to get this model right. Many thanks for your quality input.


blom0344

If the requirement is that you NEED to be able to report on live data, then you would need to implement a 'near-real-time' DWH solution. This is quite a bit harder to implement, as you can have synchronization issues between loading new facts and their related attributes. This depends on the way the source system (application) processes the data and the way users feed in the transactions.
The quick and cheapo solution is to use Cognos modelling to create virtual data marts and accept that performance may be less than ideal. Again, this depends on the amount of data and the complexity of the model.