Modeling best practices approach - Your opinion requested

Started by owensct, 14 Jun 2012 09:31:33 PM


owensct

Hi All,

Hoping those of you who have done modeling projects in the past can chime in and answer a question regarding the best approach to modeling systems.

We hired a consultant to assist us with modeling a number of our systems in Cognos 10. When the project started I requested that he model each individual system and then tie them together. Some of my reasons for wanting him to create the model in this manner were:

1. The ability to have models for individual systems up and functional for my user community sooner.

2. Ease of troubleshooting an individual model for a given system as opposed to trying to troubleshoot a massive enterprise model and figure out which system the problem is coming from.

3. Ease of maintenance. We explained that we would be maintaining the model internally once he was finished and wanted it to be easy to maintain.

The consultant said that this was not the best way to create models for our various systems, and that building them separately and tying them together would cause us to take a massive performance hit.

Several months into the project, my second-in-command on our Cognos project and I finally got approval to take the IBM Cognos Administration and IBM Cognos Metadata Modeling classes.

While at the classes, without showing a preference either way, we asked about the different modeling approaches. The instructors in both classes stated that creating a separate model for each system and tying them together was the best-practice approach, and that a single massive enterprise model for all systems was not considered a best practice. The reasons they gave for this were:

1. It is easier to build separate models for each system and verify that they work correctly and provide accurate data before linking the models.

2. Separate models are easier to troubleshoot, and it is easier to isolate problems when you are dealing with a model for a single system as opposed to a single model that links to multiple systems.

3. It allows a company to follow the recommended approach for implementing Business Intelligence in an organization: 1) get something into the hands of your end users ASAP, so they can see an immediate benefit and ROI for the money spent on Business Intelligence; this gives them a product that is usable now, not at some future date, and keeps the customer invested in the project going forward; 2) it decreases the probability of a project that drags on through multiple years and budget cycles with no tangible, visible, customer-usable results; 3) it allows a company to bring its data/systems online in smaller, manageable pieces, both for the technical staff and for end users.

4. A system based on multiple models is simpler to maintain and expand.

5. There is minimal (if any) impact on performance or functionality in using a multi-model approach and tying the models together versus a single enterprise-model approach.

The instructors went into greater detail and discussed the matter at length. The end result was that we came home feeling that the approach we originally requested, individual models for each system, was the best approach. We also felt that there was no reason for the instructors to steer us down the primrose path, as it were, as they had no financial incentive for recommending this approach.

We haven't had a chance to meet with the consultant to ask him to justify his recommended approach in light of the information given to us by the instructors (both of whom have years and years of experience with Cognos as consultants, instructors, and employees of IT shops using Cognos). In a brief telephone conversation our CIO had with the consultant, he essentially said that he knows best and that a massive multi-system enterprise model is the preferred and accepted approach.

So my CIO is hearing one thing from the consultant and something different from the IBM instructors, my second-in-command, and me. So I decided to see what the Cognos community had to say on the matter.

As always your input either way is much appreciated.

Cheers,

Gordon



blom0344

You keep referring to 'systems'. I think the big issue is whether these are functionally different and have their own data storage, or are just parts of the same application/DWH sharing the same storage.

Keeping things within one model has its advantages and seems to work pretty well when it is all relational. If you're talking DMR, then in my experience performance becomes a problem with Framework Manager itself (it gets sluggish).

You also mention easier maintenance with submodels. I very much doubt this. There is no reason why one big model can't be transparent if you make a careful design. It will certainly be easier to cope with metadata changes.

One model can sprout limitless packages for functional areas. It shouldn't matter if you start with one model and use a good design. I find your logic very flawed on this point.

MFGF

I spent 10 years as an instructor working for Cognos, and I would never issue a blanket statement extolling one large model over multiple smaller ones, or vice versa. It really depends upon the requirements, the data, and how it is stored. There are advantages to having one FM project: you end up modelling the metadata once, and this lends itself to the idea of conformity (e.g. you have one Time dimension which is the same for all packages and is modelled only once). There are also advantages to having multiple smaller projects where the data is completely disparate, with no overlap with other parts of the business.

In the past, I have tended to steer towards the most easily maintainable approach, and often this has been to model everything in one project, but it's by no means a hard and fast rule.

Just my tuppence :)

MF.
Meep!

owensct

Thanks for the feedback, it's helpful.

Regarding systems: we are a hospital, and we utilize a variety of different systems from different vendors. For example, we have a system for patient and insurance billing, which also tracks census (how many beds are occupied versus empty), demographics, and other bits and pieces. We have a financial system for the day-to-day hospital operations outside of patients (GL, A/P, HR, Payroll, Income Reporting, etc.). Then we have a system that tracks what are called FIM scores, which show us how well our hospital is doing in meeting national standards in comparison to other hospitals, in terms of how successful treatment is for a patient when they arrive versus when they leave. We have another system that is used for scheduling nursing and therapy staff and does something called acuity, which allows us to determine the number of staff hours each patient needs. For example, a critical patient may need two full-time nurses in a shift, whereas a patient with a broken leg may only need half an FTE in a shift. This in turn ties into the costs associated with staffing: do we have available staff to meet the patient load and acuity, or do we need to bring nurses in on call who may go into OT during that shift, or do we have to bring in agency nurses (which is more expensive) to meet patient needs? We are also in the middle of implementing an Electronic Medical Records system that will tie all clinical records together and reduce, and eventually eliminate, paper records for patients. This will allow us to have the patient charts, x-ray results, drug records, etc. all tied into a single interface.

The majority of these applications run in a client-server environment with MS SQL for the backend (separate servers). The exceptions are our financial system, which runs on an IBM iSeries utilizing DB2, and our patient billing and Electronic Medical Records systems, which run on a database system called Caché.

Our reason for utilizing Cognos is to have the ABILITY to tie the data in these systems together to obtain meaningful BI. That doesn't mean that all users will need all data from all systems.

A simple example: we have a pretty good idea of what it costs to run a specific department in the hospital, say cardiology. Each department annually submits its anticipated budget for the upcoming year, which is entered into our financial system for budgeting purposes. We can then estimate what our census needs to be on a month-to-month basis in order to meet operating costs for each month and for the year as a whole.

So picture a management dashboard that shows our current census (real time) from our patient billing system, pulls the budget data from our financial system, and compares the two to show us whether we are on target, under target, or over target. It would also have the ability to drill down so that we can see census for each unit of the hospital versus budget. A further level of function is to interface it with our staffing system to see what staffing levels/costs are for each unit and how they compare to requested budgets, which lets us see whether the staffing costs for a particular unit are abnormally high and why: is it due to high census or high-acuity patients, or are supervisors not doing due diligence and bringing in OT or agency nurses when other staff are available?
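Purely to make that comparison concrete, here is a minimal sketch in Python/pandas of the census-versus-budget calculation such a dashboard would surface (the table and column names are hypothetical, not taken from any of the systems above):

```python
# Hypothetical census-vs-budget variance: the core calculation behind the dashboard.
import pandas as pd

# Real-time census by unit (would come from the patient billing system).
census = pd.DataFrame({"unit": ["cardiology", "therapy", "icu"],
                       "current_census": [14, 9, 6]})

# Budgeted census by unit (would come from the financial system).
budget = pd.DataFrame({"unit": ["cardiology", "therapy", "icu"],
                       "budgeted_census": [12, 10, 6]})

report = census.merge(budget, on="unit")
report["variance"] = report["current_census"] - report["budgeted_census"]
report["status"] = report["variance"].map(
    lambda v: "over target" if v > 0 else ("under target" if v < 0 else "on target"))
print(report)
```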

In other instances departments would be interested in pulling data from a single backend system, such as our staffing system or our patient billing system. Say we are considering opening a new clinic on one of the other islands: we can use data in the patient billing system to show us which island, and where on that island, the majority of our patients come from; this data would come from only one system. Or we may want to compare hospital renovations, actual versus budgeted, with the ability to drill down to the various phases of a renovation to see actual versus budgeted for a specific phase.

So to make an even longer story short, all of the various systems we use will need to be tied into Cognos, as some of our needs will involve data from two or more systems, whereas some staff will never pull data from more than one system. The example systems I listed above are only a few of the 8 different systems we want to use with Cognos, so this is obviously a multi-year project, and having the ability for the hospital to begin using aspects of Cognos sooner rather than later is important to the continuation of the project and to assuring management that Cognos is, and will continue to be, money well spent.

Hopefully this provides some additional clarification.

Gordon   

blom0344

I would really, really advise you to spend some time on whether this will work out in the end. I don't see any reference to a data warehouse solution in your situation. You might end up with a solution with multiple versions of the truth: reports that seem to contradict one another, or gaps you cannot account for. I seriously doubt you can pull this off purely on the strength of Cognos' metadata modelling. It is good and flexible, but it also has its limits. We have similar issues with metadata models that are too complex, with all the possible drawbacks (performance, maintenance, etc.).
In my country most hospitals implement DWH solutions at some point to cope with the information demands. Some develop their own, others purchase a standard solution.

Lynn

I agree there ought to be some serious discussions going on about data warehousing. You will likely find master data management issues cropping up when attempting to report across multiple systems. Perhaps multiple systems deal with patient information where the same patient is referenced by different identifiers. Neither a single-model nor a multiple-model approach will resolve that.

Also, at last year's IOD conference there was an excellent presentation by a fellow at Boeing about a single model approach. I think in your current multi-system environment you'd want to assess conformed dimensions as the biggest factor in making a decision about which way to go. Has segmenting and linking come up at all? Perhaps there is some middle ground there.

wyconian

Hi, thought I would throw in my opinion too :)

One of the issues you're going to come up against is that developers can rarely agree on the best approach. The problem is there is almost always more than one way of achieving something, and a lot of the time it comes down to the developer's personal preferences.

There are a couple of things you should bear in mind:
1) As Blom said, FM is pretty powerful, but it is only a metadata modelling tool. One recommendation would be to push as much as possible back into a data warehouse. This gives you greater ability to 'merge/clean' data from different sources so you can get to a consensus on the one version of the truth (e.g. if patients are identified differently in different systems). In a warehouse you can create some kind of mapping routine to deal with these kinds of situations (see the sketch after this list). If you did that in FM, the query to do it would be passed back to the database(s) every time you run a report. So it sounds like a warehouse is almost certainly a must for you.

2) As far as models go, my preferred approach (and it is pretty much a personal preference) is to have a single enterprise model which has segments for individual systems. That's a kind of compromise: you get the benefits of having a single FM model (e.g. one model that holds all your data structures) and the benefits of having separate 'sub-models', such as being able to troubleshoot smaller projects, deploy the model to your users faster, and not having to take down the whole model when you need to change just a small part of it.

3) If you have multiple developers working on different dev streams, separate models let each developer work on their own model without impacting the others; the separate models can then become segments in the 'master' model.
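As a minimal sketch of the mapping routine mentioned in 1) (Python/pandas; the crosswalk table and all IDs here are made up for illustration, and this is not a Cognos feature):

```python
# A hypothetical "crosswalk" table, built and maintained in the warehouse ETL,
# that maps each source system's patient identifier onto one master patient ID.
import pandas as pd

crosswalk = pd.DataFrame({
    "source_system":     ["billing", "billing", "emr",    "emr"],
    "source_patient_id": ["B-1001",  "B-1002",  "MRN-77", "MRN-78"],
    "master_patient_id": [1, 2, 1, 2],  # B-1001 and MRN-77 are the same person
})

# Extracts from two source systems that identify the same patients differently.
billing = pd.DataFrame({"source_patient_id": ["B-1001", "B-1002"],
                        "charges": [1200.0, 450.0]})
emr = pd.DataFrame({"source_patient_id": ["MRN-77", "MRN-78"],
                    "visits": [3, 1]})

def to_master(df, system):
    """Swap a source-specific patient ID for the master ID via the crosswalk."""
    cw = crosswalk[crosswalk["source_system"] == system]
    return df.merge(cw, on="source_patient_id")[
        ["master_patient_id"] + [c for c in df.columns if c != "source_patient_id"]]

# Once both extracts share master_patient_id they join cleanly in the warehouse,
# and the FM model never has to stitch identifiers together at report time.
merged = to_master(billing, "billing").merge(to_master(emr, "emr"),
                                             on="master_patient_id")
print(merged)
```

The point of doing this once in the ETL is that every report downstream gets the resolved IDs for free, instead of re-running the matching logic on every query.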

Having a single master model also lets you reuse query subjects such as the date/time dimension.  Something I normally do is have a Generic Tables folder which holds calendars and other tables/DQS which can be used by more than one segment of the model.

The only performance issue I've come up against is when opening the master model. If it is massively huge (as it can quickly become) it can take a long time to open or to verify. I've also had issues with a master model getting corrupted (in C8.4), I think because it was so big, but it did have nearly 150K items.

There shouldn't be issues in terms of query performance as the FM model is just metadata, though you may have issues if you are trying to do anything overly complicated in the model (like mapping items from different systems together).  I'd suggest anything overly complicated should be pushed back into the ETL.

You should also think about how you are going to report against data from different systems. From what you've said (and I have to say this is one of the most comprehensive posts I've seen :)), it sounds like you would benefit from different data marts for each source system/application. The question is then how do you report across different data marts? It's very possible, but you will potentially need to look at resolving issues with multiple fact queries (look at determinants, etc.).

As I said, there isn't always a single best approach, and it comes down to developer preference. There is definitely not a blanket solution.

Maybe you should start by deciding which warehousing methodology you think is most suitable for your users and development team. Broadly speaking there are two approaches: Kimball and Inmon. I tend to favour Kimball (because it's what I was taught). Generally speaking that means a data mart per source application/subject area, with some form of conformed dimension(s) joining them together. Inmon takes a slightly different approach (sorry, can't remember off the top of my head what it is, lol).

Final thoughts (as I seem to be going off on one): seriously consider a warehouse as a first stage; if you get that right and take care of the majority of the modelling in the warehouse, the FM model should be relatively simple. I would suggest developing one system at a time (or one system per dev stream) rather than taking a run at the whole thing - smaller and simpler is nice :-)

Good luck, sounds like an interesting project

cognostechie

I have done FM modelling to read data from source systems (without having a Data Warehouse), so I thought I would put in my thoughts.

Though it is technically possible to make FM work against source systems directly, it is a case of short-sightedness which results in massive repercussions in the future. The system becomes so high-maintenance that the money saved by not building a Data Warehouse is far less than what is spent on creating and maintaining the reports every year. Performance takes a big hit for sure, because when FM joins data across multiple databases no single database does the processing; it requires local processing in a lot of cases. The result: one report is slow, and that causes another one to be slow, because processes are not available until the first report finishes executing. Since it is also not possible to resolve all joins in the source systems, the developer ends up creating joins and unions in the reports, making them very high-maintenance, not only in the time taken to create them but also in making a small change in the future.
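To illustrate the local-processing point (a rough sketch of the general problem, not of how Cognos actually executes queries; sqlite3 just stands in for two separate source databases):

```python
# A cross-database join done "locally": neither database can see the other's
# tables, so both result sets are pulled over the network and matched in the
# BI tier's memory -- fine for a few rows, painful at scale.
import sqlite3

billing = sqlite3.connect(":memory:")  # stand-in for the billing database
emr = sqlite3.connect(":memory:")      # stand-in for the EMR database

billing.execute("CREATE TABLE charges (patient_id TEXT, amount REAL)")
billing.executemany("INSERT INTO charges VALUES (?, ?)",
                    [("P1", 1200.0), ("P2", 450.0)])
emr.execute("CREATE TABLE visits (patient_id TEXT, visit_count INT)")
emr.executemany("INSERT INTO visits VALUES (?, ?)", [("P1", 3), ("P2", 1)])

# Step 1: fetch BOTH tables across the wire, because no single database
# is able to perform the join.
charges = dict(billing.execute("SELECT patient_id, amount FROM charges"))
visits = dict(emr.execute("SELECT patient_id, visit_count FROM visits"))

# Step 2: join in local memory -- the work that would be pushed down to one
# database if the data lived (or was warehoused) together.
for pid in sorted(charges.keys() & visits.keys()):
    print(pid, charges[pid], visits[pid])
```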

Some companies have made cubes which can collect data from various systems, but at some point you also need the detailed data from a relational model.

I would recommend creating a Data Warehouse, or creating Data Marts rolling up to a DW, and then making one FM model instead of separate models. FM allows different namespaces and folders to organize it.

owensct

All,

Thanks for the replies, this is good information. Sorry for the late reply, I've been running like crazy lately.

The use of a data warehouse has occurred to us, and we will most likely use that approach for any data that is relatively static and for historical data. But some of the data we will be accessing has to be real-time, census and staffing for example, so that decisions can be made in real time that will have an immediate effect on budget. For example: census is low, so do we send some of the nursing and therapy staff home? And why is unit X using agency nurses when the census is low? Or census is high: do we bring in on-call nurses, ask staff if they want to work a double shift, bring in agency nurses, etc.? All of this can have a significant impact on budget if you're a small hospital like we are. For a really big organization like Stanford or Johns Hopkins it might not have that much of an impact, but for us it is discernible.

So for some systems a data warehouse is not viable, as one of the things we want to use Cognos for is to be able to make decisions on current situations, not just look back and say, "ahh, next time this situation arises we should try doing it this way". The medical industry is becoming very competitive, and new regulations are sucking the lifeblood out of small hospitals. Here in Hawaii we also have a finite market, as we are one of the most isolated island chains in the world, so people don't just decide to hop on a plane to come here, whereas someone who doesn't want to go to hospital XYZ in Fresno, California for a specific treatment can get their doctor to send them to LA or San Francisco. Our hope in using a product like Cognos is that it will allow us to run lean and mean in real time, as opposed to looking at history and saying, "OK, to run lean and mean we need to do this next time."

Do I have a correct view that the data warehouse concept does not support real-time effectively, or am I missing something?

Thanks

Gordon

wyconian

Hi

Data warehousing can cope with real-time reporting, but (I think I'm right in saying) it's a relatively new concept. There are some products coming onto the market which can deal with real-time reporting, but I don't think Cognos is quite there yet. I hate to say this, but you could look at some (non-Cognos) products like OBIEE, etc., which can deal with this.

I think it is possible to use Cognos for near-real-time reporting (i.e. running the ETL batch a number of times during the day). The essence of this is being able to set up delta loads, where you only load data which has changed or is new, based on criteria you can specify. For example, your source data may have something like a modified date, so you load only the rows whose modified date is greater than the maximum modified date in the destination table. I've used this to reduce the workload in overnight batch runs, and I currently use it to run the batch every two hours during peak times.

It all comes down to being able to identify new or amended records in the source and then append or update these during your batch run.
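A minimal sketch of that delta-load pattern (Python, with sqlite3 standing in for the real source and warehouse databases; the table and column names are hypothetical):

```python
# Delta load via a high-water mark: copy only rows whose modified_date is
# newer than anything already in the warehouse table.
import sqlite3

source = sqlite3.connect(":memory:")     # stand-in for the source application DB
warehouse = sqlite3.connect(":memory:")  # stand-in for the warehouse DB

source.execute("CREATE TABLE census (unit TEXT, patients INT, modified_date TEXT)")
source.executemany("INSERT INTO census VALUES (?, ?, ?)", [
    ("cardiology", 14, "2012-06-14 08:00"),
    ("therapy",     9, "2012-06-14 10:30"),
])
warehouse.execute("CREATE TABLE census (unit TEXT, patients INT, modified_date TEXT)")

def run_delta_load():
    # High-water mark: the newest modified_date we have already loaded.
    (high_water,) = warehouse.execute(
        "SELECT COALESCE(MAX(modified_date), '') FROM census").fetchone()
    # Pull only new or amended rows from the source.
    delta = source.execute(
        "SELECT unit, patients, modified_date FROM census WHERE modified_date > ?",
        (high_water,)).fetchall()
    warehouse.executemany("INSERT INTO census VALUES (?, ?, ?)", delta)
    warehouse.commit()
    return len(delta)

print(run_delta_load())  # first run picks up 2 rows
print(run_delta_load())  # second run picks up 0 -- nothing new since the last run
```

A real batch would also update amended rows (an upsert keyed on the business key) rather than just inserting, but the high-water-mark idea is the same.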

Take a look at your source applications and see if you can find some way of identifying these records.

Good luck :-)


Lynn

Cognos offers an RTM solution that is integrated in C10. It used to be called "Cognos Now!" and has been marketed a few different ways since they acquired that product. I worked with it shortly after they acquired it, but haven't had any recent exposure to it. http://www-01.ibm.com/software/analytics/business-intelligence/real-time-monitoring.html

Predictive analytics is another area to consider. Everyone who builds a warehouse wants to do more than just look in the rear view mirror. http://www-01.ibm.com/software/data/cognos/solutions/spss/

From a data warehousing perspective, I agree with wyconian that understanding your reporting requirements is important to determine the best ETL strategies. You might also consider the concept of drill through. When doing some analysis against the warehouse a user may drill through to the detail data in the source system to get a targeted set of information that is up to the minute.

Of course Cognos can be modeled to report on data from different systems without a data warehouse. Any integration of data from different sources is locally processed by Cognos so it isn't necessarily the best performing solution in town, especially for large data volumes. You may have situations where some operational reporting is best served directly from an operational system without need to integrate data from elsewhere. Having a data warehouse doesn't preclude you from sourcing reports in other ways. A clear understanding of reporting and analysis requirements is really helpful in figuring out which approaches are best suited for each.

I know I mentioned this in my previous post, but either approach may be impacted by master data management issues in the organization. What are the key elements that connect data from the disparate systems and are they consistent across those systems? It sounds like there may be multiple systems containing staff data and patient data. If these need to be integrated to satisfy reporting requirements it would be worth understanding if you have any challenges or not.

Good luck!

bdbits

More-or-less FYI, but there are real-time data warehousing solutions available on the market now. They are not cheap ($100k++) and/or require lots of IT time, depending on the product. Oracle's GoldenGate and XStreams products come to mind, and IBM InfoSphere CDC (change data capture), but there are others.