Get max value of a column in my source table in Informatica

6.5k Views Asked by At

I have the following table with the shown data in it:

send_date  |  household_ID
-------------------------
11-20-2014 | 123
11-20-2014 | 456
11-15-2014 | 789 

I need to do 2 things:

1) Calculate the max value for send_date

2) Filter out any rows whose send_date does not match that value

In other words, I want the output to be:

send_date  |  household_ID
-------------------------
11-20-2014 | 123
11-20-2014 | 456

Row number 3 should be filtered out as its send_date is not the max.

I tried creating an aggregate, grouping by all columns, and creating a new output port called MAX_DATE with an expression of MAX(SEND_DATE), then have a filter transformation with the condition MAX_DATE = SEND_DATE

This lets all rows through, though. What can I do to make this work....

3

There are 3 best solutions below

1
On BEST ANSWER
  • Use two source qualifiers for the same source table or - better yet - stay with a single SQ but split data into two pipelines (see the comment below for details).
  • Calculate MAX_DATE value using the first one and an aggregator.
  • Then join data from the second source qualifier on the SEND_DATE column.
2
On

Are you sure this is not what you want?

SELECT * FROM `table` WHERE `send_date` <> (SELECT MAX(`send_date`) FROM `table`)

Or, if I misunderstood you, then:

SELECT * FROM `table` WHERE `send_date` = (SELECT MAX(`send_date`) FROM `table`)
0
On

To my opinion, the simplest way to solve the issue is to use RANK transformation. Advantage: - No splits in SQ - No joins after aggregation - Simple :) The transformation is ranking top/bottom (according your configuration; you need to choose "Top") So, it will pass through the whole row then the maximal send_date value will be ranked as 1. The next step is to filter all rows with the rank greater than 1.