Hive- join table with itself with inequality conditions

126 Views Asked by At

I would like to join a table with itself to find a parent of every line with an equality conditions. Also I need to keep only one parent by taking the one who's the field "counts" is minimum. here's a sample data:

product  date1           date2          version   counts
A.16    15/04/2021      17/04/2021      1         20
A.18    18/04/2021      16/04/2021      1         15
A.15    17/04/2021      18/03/2021      1         18
A.20    19/04/2021      20/03/2021      1         33
A.35    20/04/2021      30/03/2021      1          8

here's the request:

select  b.product
      , b.date1
      , b.date2
      , b.version
      , c.product as parent
from pr_convert b
left join pr_convert c
on (c.date1< b.date1 and c.date2> b.date2)

This did not work, it threw me this error :

Both left and right aliases encountered in JOIN 'date1'

Then I tried to put the inequality condition in a where clause by doing this:

select  b.product
      , b.date1
      , b.date2
      , b.version
      , c.product as parent
from pr_convert b
left join pr_convert c
on b.version=c.version
where b.date1< c.date1 and b.date2> c.date2

As the value of the field "version" is the same for all records and my table being very large, it consumes a lot of resources and takes a lot of time.

Any help would be appreciated.

0

There are 0 best solutions below