If you are unable to create a new account, please email support@bspsoftware.com

 

Ranking and filtering on multiple columns based on multiple conditions

Started by bi.user, 27 Sep 2016 11:10:54 AM

Previous topic - Next topic

bi.user

Hello Gurus
Need your help to get the data. I have a requirement where data appears like below.

Policy   Member   Category           St date               End Date
123           X0700   I             Oct 29, 2014   
123           V0700   S              Jul 18, 2014   
124           V1800   I              Jun 21, 2011   
124           X0800   I              Nov 18, 2010      Oct 29, 2016
125           V1700   I              Oct 29, 2014              Oct 29, 2016
125           X0700   I              Oct 29, 2014              Oct 29, 2016
125           V0700   I               Apr 10, 2011      Oct 29, 2016
126           V0700   I               Mar 30, 2012      Oct 29, 2015
126          X0700   I               Mar 30, 2012      Oct 29, 2016

Conditions :
A. If a policy number (123) has multiple members appearing in Category I and S, then exclude rows with I and show only row with S.
B. If a policy number (124) has multiple members within same Category, then show only the row which do not have End date.
C. If a policy number (125) has multiple members where all rows have End date, then show only the row with earliest Start date.
D. f a policy number (126) has multiple members where all rows have same start dates, then sort members in this order 'X0', 'V0', 'V1' and show only the top member row.

Required output as shown below:

Condition   Policy   Member   Category        St date                End Date
A           123          V0700   S            Jul 18, 2014   
B           124          V1800   I            Jun 21, 2011   
C           125         V0700   I            Apr 10, 2011               Oct 29, 2016
D           126         X0700   I            Mar 30, 2012               Oct 29, 2016


Please let me know how to get this done.

Note: My bad, missed to mention that these conditions are to be followed in same order (priority) i.e Check for Condition B, if its not satisfied, check for condition C, then D.
Condition A is the only one with two different categories where we need to check and show if there is Category S. 

This is relational Data.

Thanks

AnalyticsWithJay

Assuming this is relational data:


A.

Data Item: 'Category in I'

total(
CASE WHEN [Category] = 'I'
THEN 1
ELSE 0
END
for [Policy])


Data Item: 'Category in S'

total(
CASE WHEN [Category] = 'S'
THEN 1
ELSE 0
END
for [Policy])


Filter:
[Category in I] > 0 AND [Category in S] > 0 and [Category] = 'S'



B. Same concept as A), except use AND [End Date] IS NULL in the filter

C.

Data Item: '# of Rows Without End Date'

total(
CASE WHEN [End Date] IS NULL
THEN 1
ELSE 0
END
for [Policy])


Filter: [# of Rows Without End Date] = 0 AND [End Date] = minimum([End Date])

D.
Same concept as C. Create a data item called Sort Key. If the condition is true, specify the data item or substring that you want to filter on. If false, specify the data item you should use instead. You will set the sorting property to use [Sort Key]. I see you updated the requirements for D from a sort to a filter. Just wanted to strike out my original response so you don't use it.



AnalyticsWithJay

Quote from: bi.user on 27 Sep 2016 11:10:54 AM
Note: My bad, missed to mention that these conditions are to be followed in same order (priority) i.e Check for Condition B, if its not satisfied, check for condition C, then D.
Condition A is the only one with two different categories where we need to check and show if there is Category S. 

This is relational Data.

Thanks

Hey bi.user,

Just saw your update. You can still use the syntax I described, except instead of adding different filters, create one filter with parenthesis like so:

(CONDITIONS IN A)
OR
(CONDITIONS IN B)
OR
(CONDITIONS IN C)
OR
(CONDITIONS IN D)

The first condition that meets the criteria, in the above order, will be used and the statement will exit.

I'm in a bit of a rush at the moment so I gave you the quick syntax :) Let me know when you implement this if you run into any issues.

Cheers

bi.user

Hey CognoidJay

Thanks for the quick response.

Condition A works great. I am getting multiple rows instead of one row for a policy, for B & C when I used them as below. D need to sort and show only the top row.

B) Created two dataitems as in A. Filter "[term  null count]> 0 and [term  nt null count] > 0 and [end Date] is null".

C) I am getting multiple rows for each policy when I used the logic with filter. 

Am I missing something?


AnalyticsWithJay

Quote from: bi.user on 27 Sep 2016 03:41:38 PM
Hey CognoidJay

Thanks for the quick response.

Condition A works great. I am getting multiple rows instead of one row for a policy, for B & C when I used them as below. D need to sort and show only the top row.

B) Created two dataitems as in A. Filter "[term  null count]> 0 and [term  nt null count] > 0 and [end Date] is null".

C) I am getting multiple rows for each policy when I used the logic with filter. 

Am I missing something?

My pleasure.

For "B. If a policy number (124) has multiple members within same Category, then show only the row which do not have End date":

Syntax should be:


  ([Category in I] > 0 OR [Category in S] > 0)
      AND
  ([End Date] IS NULL)
)


I changed the first 'AND' to an 'OR', and I put parenthesis around your category condition. Because you only need one of the categories to match that condition, whereas previously you were saying it has to have multiple rows in both categories.

For C, you're correct - the logic I gave you is incomplete, because we're not verifying that ALL rows are null. Also, since I cannot see your data, we me have to make a small change to the minimum() function.

Data Item: 'All Rows Are Null'


maximum(
  CASE WHEN [End Date] IS NULL
   THEN 'Y'
   ELSE NULL
  END
for [Policy])



Filter: ([All Rows Are Null] = 'Y' AND [End Date] = minimum([End Date] for [Policy]) )


Jay

bi.user

Hello Jay

Thanks for your time and help.

I have applied Conditions A, B, C on our original data set(bigger data set with multiple possible scenarios). Below are the results.

A) It works fine with our sample data (only few rows) but when I applied it on the original data set, it is also blocking rows which should be displayed i.e data I gave in my first post, once I apply this filter, it is blocking 124, 125, 126 policies as they have category  'I' and only displays 123.

B and C are working fine when I apply them individually but when i put them in one single filter exp like (CONDITIONS IN B) OR (CONDITIONS IN C) then they are not able to filter out data and I see all rows as if there is no filter.

D) As of now, I am creating a column "X" using Case logic to assign members with numbers like when "member contains '%X0' then 1... etc" and I am using filter to bring in minimum of the number in filter i.e X = min(X) . Is there any other logic to do this in a better way?

Please have a look and let me know where am I going wrong.


AnalyticsWithJay

Quote from: bi.user on 30 Sep 2016 02:55:32 PM
Hello Jay

A) It works fine with our sample data (only few rows) but when I applied it on the original data set, it is also blocking rows which should be displayed i.e data I gave in my first post, once I apply this filter, it is blocking 124, 125, 126 policies as they have category  'I' and only displays 123.

Please have a look and let me know where am I going wrong.

Hi,

That behavior is intended given your requirements and data set. You said if a policy number (123) has multiple members appearing in Category I and S, then exclude rows with I and show only row with S.

Policy 123 is the only one that has a row with S, so our results are correct.

Quote from: bi.user on 30 Sep 2016 02:55:32 PM
B and C are working fine when I apply them individually but when i put them in one single filter exp like (CONDITIONS IN B) OR (CONDITIONS IN C) then they are not able to filter out data and I see all rows as if there is no filter.

Pay attention to the parenthesis in your filter. If they work individually there's no reason why they wouldn't work with proper parenthesis.

For D, to check that all rows have the same start dates, you could do:
Count(distinct [Start Date] for [Policy])...
I don't know what your requirement is for how you define your top row, but you could check when the distinct count = 1 to make sure all rows have the same date. Just append your logic for the top row in your filter.

biusercog

I was able to fix the filters the way you suggested. Thanks a lot for your time and help Jay, really appreaciate it :D

AnalyticsWithJay

Quote from: biusercog on 05 Oct 2016 02:21:38 PM
I was able to fix the filters the way you suggested. Thanks a lot for your time and help Jay, really appreaciate it :D

Awesome! My pleasure :)