I am having out of memory issues while trying to dedup a table consisting of huge amount of data.
Scenario :
Column A | Column B ( Date )
Value1 Date1
Value1 Date2
Value2 Date3
Value2 Date4
I need to dedup on both these columns, I need to pick the latest record using column b.
Lets say date2 and date4 are the latest dates. My output should be:
Column A | Column B ( Date )
Value1 Date2
Value2 Date4
Currently I am using the below query which works. Is there a better way of doing this using less memory.
CREATE TABLE UNIQUE_TABLENAME AS (
SELECT a.column a, a.column b, a.column c, a.column d
from tablename a,
(select column a,max(column b) from tablename group by column a)b
where a.column a = b.column a
and a.column b= b.column b)
Thanks in advance!
Check distinct on