I have two Hive tables as follows:
Table1:
c1 | c2 | c3
01 | june | true
02 | may | false
Table 2:
c1 | c4
01 | usa
I basically want to get the difference (wrt set operations context) between Table A and Table based on C1. That is, I am looking for all the rows containing c1 values that are in Table 1 but NOT in Table 2. From the above example, I need to get the second row from Table 1 as my query result.
I tried the following Hive query:
select c1 from table1 a left outer join table2 b
on a.c1 = b.c1 where b.c1 is null
I am getting the following error:
Error while compiling statement: FAILED: SemanticException Column c1 Found in more than One Tables/Subqueries
Both Table 1 and Table 2 have their first columns named as c1. And this is something that I cannot change.
Is there something that I am missing here? Thanks in advance!
It's complaining because you didn't specify the source table for
c1inselect c1 from. You need to specify which table you want thec1from since it exists in both tables.select a.c1 from...should give you what you want.