If you are unable to create a new account, please email support@bspsoftware.com

 

News:

MetaManager - Administrative Tools for IBM Cognos
Pricing starting at $2,100
Download Now    Learn More

Main Menu

Unique aggregate requirement

Started by raghuram, 18 Sep 2013 12:42:19 AM

Previous topic - Next topic

raghuram

I have another challenge that I am trying to resolve but unable to get the solution yet. Here is the scenario. Pardon the formatting if it messes up at the time of posting.

ACCT_NUM  CERT_ID    Code      Date      Desired Output
      A                   1         10      1/1/2007    1/1/2008
      A                   1         10      1/1/2008    1/1/2008
      A                   1         20      1/1/2009    1/1/2010
      A                   1         20      1/1/2010    1/1/2010
      A                   1         10      1/1/2011    1/1/2012
      A                   1         10      1/1/2012    1/1/2012

      A                   2         20      1/1/2007    1/1/2008
      A                   2         20      1/1/2008    1/1/2008
      A                   2         10      1/1/2009    1/1/2010
      A                   2         10      1/1/2010    1/1/2010
      A                   2         30      1/1/2011    1/1/2011
      A                   2         10      1/1/2012    1/1/2013
      A                   2         10      1/1/2013    1/1/2013

As you can see, I need to do a MAX on the date based on each group of code values (apart from ACCT_NUM and CERT_ID) before the value changes. If the same value repeats, I need to a MAX of the data again for that group separately. For example, for CERT_ID of '1', I cannot group all four rows of Code 10 to get a MAX date of 1/1/2012. I need to get the MAX for the first two rows and then another MAX for the next two rows separately since there is another code in between. I am trying to accomplish this in Cognos Framework Manager.

Gurus, please advise.

blom0344

'another code in between' .  There is no such thing, my friend.  A database does not store rows like a spreadsheet. Any concept of ordering is enforced by the way you query data from a table.  (The only internal ordering would be invoked by a clustered index) What you are trying to do goes against all set-based logic.
I think you need to process the data within your ETL streaming it based on ordering by ACCT_NUM,CERT_ID,DATE adding an index when the value of code changes

wyconian

Hi

I'm not completely sure what you're trying to do but it sounds like you could try using some analytical functions in report studio

Something like
MAXIMUM(column FOR column1, column2)
you can have different combinations on the FOR side to get the grouping you need.  This will repeat the max value of the column within the specified grouping for each row.

Hope that helps

Good luck :-)

blom0344

Going for difference in grouping is not going to work, cause any type of grouping will effectively return rows  1,2 and 5,6 as one group. Set-based logic does not take into account that data is stored in an ordered fashion.  If the data contains an  ascending meaningless key, then it will be possible to process in the order of that key and most ETL tools will have some solution for comparing a current value with a previous one . But even then it will take quite some doing to establish an index to create a subgroup column.

raghuram

Thanks for the replies.

Hi Blom,

I did not mention earlier but this is a relational data source. Also, the info does not go through ETL process. I am writing my own SQL to join tables for reports (each report has different join conditions between same tables) and creating one (or two) query subject(s) per report in Framework Manager.

The order is set by the specifications. The ascending order of date is required to depict the chronological order for transactions for each ACCT_NUM + CERT_ID. The amount columns are to be aggregated for each set of codes that appear consecutively. The report output needs to look like this.

ACCT_NUM  CERT_ID    Code      Date Paid
      A                   1         10      1/1/2008
      A                   1         20      1/1/2010
      A                   1         10      1/1/2012

      A                   2         20      1/1/2008
      A                   2         10      1/1/2010
      A                   2         30      1/1/2011
      A                   2         10      1/1/2013

The amount columns are automatically aggregated to show the TOTAL for that set of code accordingly.

Hi Wyconian,

I am not getting what I want through regular aggregate functions. I need to do the MAX on Date in some way. If I do MAX(Date For ACCT_NUM + CERT_ID + Code), then I will get only one MAX(Date) for CERT_ID '1' and Code '10' which would be 1/1/2012. I won't get 1/1/2008. Similar is the case with CERT_ID '2' as well. I am thinking that some order of RANK, DENSE_RANK, ROW_NUMBER might work but no luck so far.

Any suggestions or ideas are greatly appreciated.

blom0344

Raghuram,

Give me the query that generates the output as in your original example.  :o   The order you show may be possible by writing a very complex case within the order by (literally taking care of all sets), an aggregate will NOT take into account the line up of data. The nature of data stored in relational databases is  by definition non-ordered. Rank,Dense_rank and row_number won't help you out

raghuram

Hi Blom,

The data I posted is the pseudo data for obvious reasons. Here is the SQL to create the table and populate with the values I have in the post.

Create Table DESIRED_OUTPUT
(
   ACCT_NUM char(1)
   , CERT_ID tinyint
   , Code tinyint
   , Date date
)

INSERT INTO DESIRED_OUTPUT
   (
    ACCT_NUM
   , CERT_ID
   , Code
   , Date
   )
VALUES
    ('A',1, 10, '1/1/2007')
   ,('A',1, 10, '1/1/2008')
   ,('A',1, 20, '1/1/2009')
   ,('A',1, 20, '1/1/2010')
   ,('A',1, 10, '1/1/2011')
   ,('A',1, 10, '1/1/2012')
   ,('A',2, 20, '1/1/2007')
   ,('A',2, 20, '1/1/2008')
   ,('A',2, 10, '1/1/2009')
   ,('A',2, 10, '1/1/2010')
   ,('A',2, 30, '1/1/2011')
   ,('A',2, 10, '1/1/2012')
   ,('A',2, 10, '1/1/2013') ;

Please remember that I need to use ROW_NUMBER or equivalent in Cognos to set the Date in Ascending order first (for ACCT_NUM + CERT_ID) before making further manipulations. Thank you for trying this and helping me out.

blom0344

This is really a taxing post. I feel none of my posts attribute to further insight..
Inserting data into a table has zero relationship with how data is stored. The rownumber() only has meaning in HOW you query your data, NOT about storage.  Either I cannot convey the concept ( as english is not my native language) or you are not picking up the essence. Truly hope , someone else manages to explain set-based logic / relational databases. Frankly, I give up   :-\

raghuram

I am not sure why it is confusing.

When did I imply that the data stored is in a set order in the database tables? I explained in my previous post that I need to apply ROW_NUMBER() to set the data queried from database in ascending order of date and THEN do the manipulations to get the output in the report as I want.

I will try to make it simpler (hopefully).

Irrespective of how the data is in the database table, let us assume that I apply the ROW_NUMBER() OVER (PARTITION BY ACCT_NUM, CERT_ID ORDER BY Date ASC) and then get the following (my original post without the Desired Output column). I am also including an amount column here for more clarity.

ACCT_NUM  CERT_ID    Code      Date       ROW_NUM     Amount
      A                   1         10      1/1/2007       1                    100
      A                   1         10      1/1/2008       2                    200
      A                   1         20      1/1/2009       3                    200
      A                   1         20      1/1/2010       4                    300
      A                   1         10      1/1/2011       5                    200
      A                   1         10      1/1/2012       6                    200

      A                   2         20      1/1/2007       1                    100
      A                   2         20      1/1/2008       2                    300
      A                   2         10      1/1/2009       3                    150
      A                   2         10      1/1/2010       4                    200
      A                   2         30      1/1/2011       5                    700
      A                   2         10      1/1/2012       6                    300
      A                   2         10      1/1/2013       7                    350



AFTER THIS, I need help in getting to the following output. Again, I am including the aggregated amounts in the last column.

ACCT_NUM  CERT_ID    Code      Date Paid      Amount Paid
      A                   1         10         1/1/2008          300
      A                   1         20         1/1/2010          500
      A                   1         10         1/1/2012          400

      A                   2         20         1/1/2008          400
      A                   2         10         1/1/2010          350
      A                   2         30         1/1/2011          700
      A                   2         10         1/1/2013          650

Please let me know if you have any suggestions/ideas. Thank you

blom0344

I understand your requirement fully, but adding row_number does not add a handle to generate the required set you need.
It does not matter how you line up the data, an aggregation works against sets.
For the first 6 rows in your example there are either  6  sets (distinguished by the added row_number )  or 2 sets (distinguished by Code). The order of the data has no relevance in set-based logic.
Well, to be honest , I have had these discussions many times, most often with people writing some sort of programming code. They are used to lining-up data and looping through it   ;) 

bdbits

Based on what you said about how you are using Cognos (query subject per report), you will probably want to create a stored procedure where you can write code to do whatever you want. Then base your report off of that.

I am not really seeing a way to do selects or set-based operations to do what you want.

I have to say the approach here seems to me a sub-optimal use of Cognos, though. Not sure why Cognos is being used at all, actually, when much simpler (and cheaper) reporting tools can be used to read stored proc outputs or let you write direct SQL for every report you want to create. Cognos works much better with a seperate data warehouse where you can properly structure your data and populate it from the source with ETL tools.

raghuram

I agree in part about some the observations you made regarding using Cognos for this type of reporting. They have an existing system in Crystal Reports/VB that enables the output like what I want. They are upgrading the OS and the current system is not compatible with the new OS. Hence is the decision to redo the reports in Cognos. Also, I do not have the privileges to write stored procs. I asked once before regarding that and I was told no.

That said, I still have this to solve. I made a little headway in this. If I do a

Lead(Code, 1) Over (Partition By ACCT_NUM, CERT_ID Order By Date)                                               (let us call this column as Lead_01)

I am able to move the set of codes by one row up and if I pick dates where

Code != Lead_01 or Lead_01 IS NULL

I can pick the last row's date for each group of codes. In effect, I am able to extract 1/1/2008 (for Code '10'), 1/1/2010 (for Code '20') and 1/1/2012 (for Code '10') for CERT_ID '1'. The question remains is how to fill those where

Code = Lead_01

with the corresponding set dates of Code != Lead_01.

Any ideas/suggestions are welcome.