Hive - Get difference between two Hive tables based on one column

768 Views Asked by At

I have two Hive tables as follows:

Table1:

c1 | c2 | c3
01 | june | true
02 | may | false


Table 2:

c1 | c4
01 | usa

I basically want to get the difference (wrt set operations context) between Table A and Table based on C1. That is, I am looking for all the rows containing c1 values that are in Table 1 but NOT in Table 2. From the above example, I need to get the second row from Table 1 as my query result.

I tried the following Hive query:

select c1 from table1 a left outer join table2 b

on a.c1 = b.c1 where b.c1 is null

I am getting the following error:

Error while compiling statement: FAILED: SemanticException Column c1 Found in more than One Tables/Subqueries

Both Table 1 and Table 2 have their first columns named as c1. And this is something that I cannot change.

Is there something that I am missing here? Thanks in advance!

1

There are 1 best solutions below

1
On BEST ANSWER

It's complaining because you didn't specify the source table for c1 in select c1 from. You need to specify which table you want the c1 from since it exists in both tables. select a.c1 from... should give you what you want.