Hi there
I would be interested in some hints / best practice for the following use case
We use Cognos for our project planning. People can change their data as needed, an look at the result every morning, after the ETL job
Up to now it was enough, so I used TRUNCATE to load data.
Now management would like to be able to have history, snapshot once a month. Easiest way would be to use APPEND. But that would generate a huge amount of data, and 90% waste.
I ended up with several solution, but not sure what is best or state of the art
1) create a 2nd table with the monthly data. Negative: create two FM package, duplicate each report
2) create a 2nd database with the monthly table so i can use the multiple connection feature.
3) mark the data as DoNotDelete once a month, adn inplement a SQL job to delete all other Data
4)...
Thank you for whatever hint you can provide
erwin
Is your data source transactional, a data warehouse, a cube?
What do they want to do with the historical data? Will there be different reports (with trending perhaps), or they want to run the same reports with an option to select a certain time period?
Hi,
thank you for your answer..or rather questions :)
You're right. I try to be short and concise...but at the end there is missing information
I've a DRM, build using Datamanager. Part of the data put together is for budget planning.
Project and Portfolio Manager have to deliver their needs per month for next BY till a given date. At this date the data will be frozen.
Only in case of budget adjustment will a new snapshot be made
Need with prio top down
- freeze the data at a given time and report on this data set - same reports than for daily data [plan per month]
also same report, but for a selectable snapshot
- diff between daily and last snapshot
- ( diff between 2 snapshots )
- (( trending ))
hope it's a bit clearer
If it is an option, I think the best approach is to rethink the data warehouse a bit. I would look at adding a new attribute, call it "Reporting Period". This would be the month to which the facts apply, or put another way the month the numbers become "frozen". Then in your reports, you add a new parameter to select which of these months are the numbers to be included. This will also allow you to compare periods efficiently and however you like within a single report.
Does this make sense? Is this a viable option?
It is an option since i do also all Datamanagrement stuff
This would request and Update/Insert .. what would overwrite every day till the date is in the new period, where it would insert
Sound good, but does not work in my case because deletion of plan data are not marked as deleted in the source, but just gone.
A solution would be to run a SQL Job to delete the current period and use always append. End up similar to my variant 3)
Is that what you thought about?
erwin
It does sound pretty similar. Just to clarify what I was saying, I saw the process as something like this.
* For the sake of discussion, assume you want to report updated numbers each day throughout a period.
* For current numbers, "reporting period" would be null to indicate that it is "current" data.
* When the period is over and you want to save current numbers for historical purposes, remove all data where "reporting period" is null, populate the numbers to be saved and set "reporting period". Since it is no longer null, this is now "historical".
* You can keep the historical data indefinitely, if you want, for trend analysis.
* If you want to get rid of data for a particular reporting period, it is a simple SQL delete based on the "reporting period".
Thank you bdbits
workflow is fine. need to find the best way to run the single steps.
last question: is there another way than a SQL Job in Datamanager to <remove all data where "reporting period" is null>
I am not very fluent in Datamanager. As far as I know that is the way to do it.
Quote from: erwink on 11 May 2016 01:16:16 AM
Thank you bdbits
workflow is fine. need to find the best way to run the single steps.
last question: is there another way than a SQL Job in Datamanager to <remove all data where "reporting period" is null>
There are lots of tools you could use for this - Data Manager is just one. I'm guessing Data Manager is your ETL tool? Is there a reason you want to use something different?
MF.
Hi MF
Yes Datamanager is my ETL tool.
I was asking for another way in the tool than with a SQL element in a Job. E.g integrated in the Fact Build
But I'm just setting it up and looks quite easy..so probably no need for another solution
Thank you
Quote from: erwink on 12 May 2016 09:13:33 AM
Hi MF
Yes Datamanager is my ETL tool.
I was asking for another way in the tool than with a SQL element in a Job. E.g integrated in the Fact Build
But I'm just setting it up and looks quite easy..so probably no need for another solution
Thank you
Have you looked at using UPDATE/INSERT behaviour as the refresh type in your fact build? You'd need to make sure you have the appropriate key definitions set to target specific rows to update, but that isn't hard.
Cheers!
MF.
Hi MF
Yes, sure I looked for UPDATE/INSERT . It was actually my first setup...till I found out that deleted entries in the source DB will not be recognized as such and will stay forever. So I changed to TRUNCATE what worked fine till now
thx erwin
If your database supports a MERGE statement you can use that to manage this type of insert/update/delete logic.
This article explains it in the realm of DB2 but I know other vendors have this implemented also. It is an ANSI standard.
https://www.ibm.com/developerworks/community/blogs/SQLTips4DB2LUW/entry/merge?lang=en