Hive : Fill in missing columns

1k Views Asked by At

I am new to Hive and my SQL knowledge is bit rusty hence posting question on SO.

I have data from one SAP table in which values for one column are missing. I would like to know how can I populate those columns.

Example:

+----------+-----------+--------+
| EBELN    | BELNR     | EBELP  |
|----------+-----------+--------|
|    1     |   123     |  001   |
|----------+-----------+--------|
|          |   123     |  002   |
|----------+-----------+--------|
|          |   123     |  003   |
+----------+-----------+--------+

In above table, all the three rows are related to each other but value for EBELN column is present only in first row.

When I join this table with other table using EBELN column as a join key, only first row appears in the join.I want to bring all the three rows in the output join.

Expected output:
    +----------+-----------+--------+
    | EBELN    | BELNR     | EBELP  |
    |----------+-----------+--------|
    |    1     |   123     |  001   |
    |----------+-----------+--------|
    |    1     |   123     |  002   |
    |----------+-----------+--------|
    |    1     |   123     |  003   |
    +----------+-----------+--------+

I am expecting output to be something like this so that I can join this table with other table by using EBELN column.

Table to which I am joining does not contain BELNR or EBELP column so I cannot use those columns for joining.

I would like to know if I can use self join or is there anything else which I can use here.


Adding some more information. I have other table lets call it "ABCD". That table has EBELN column .

    +----------+-----------+--------+
    |   EBELN  |   COL12   |  COL13 |
    |----------+-----------+--------|
    |    1     |   ABC     |  LMN   |
    |----------+-----------+--------|
    |    2     |   DEF     |  OPQ   |
    |----------+-----------+--------|
    |    3     |   GHI     |  RST   |
    +----------+-----------+--------+

I was trying to join this "ABCD" table with "EKPF" table using EBELN column.

I need to get all the rows where BELNR number is same but since BELNR column is not present in ABCD table, I cannot use it for joining.

Final joined output should look like

    +----------+-----------+--------+--------+--------+
    | EBELN    | BELNR     | EBELP  | COL12  | COL13  |
    |----------+-----------+--------+--------+--------|
    |    1     |   123     |  001   |   ABC  |   LMN  |
    |----------+-----------+--------+--------+--------|
    |          |   123     |  002   |   ABC  |   LMN  |
    |----------+-----------+--------+--------+--------|
    |          |   123     |  003   |   ABC  |  LMN   |
    |----------+-----------+--------+--------+--------|
    |    2     |           |        |  DEF   |  OPQ   |
    |----------+-----------+--------+--------+--------|
    |    3     |           |        |  GHI   |  RST   |
    +----------+-----------+--------+--------+--------+
2

There are 2 best solutions below

0
On

From Hive you can use one of the Conditional Functions Function to achieve your output.

Simply do this if using COALSCE:

SELECT COALESCE(EBELN, 1L) AS EBELN, BELNR,EBELP FROM your_table;

Also from Hive 0.11 onwards you can use NVL Function

SELECT NVL(EBELN, 1) AS EBELN, BELNR,EBELP FROM your_table;
0
On

If you want for all the records from the first (LEFT) table to generate corresponding record in the result, even if there is no match in the second (RIGHT) table, you use LEFT join. For the reverse case, one use RIGHT join. Because you want to keep records from both sides of the join, you need FULL OUTER join.

SELECT COALESCE(t1.EBELN, t2.EBELN), t1.BELNR, t1.EBELP ,
       t2.COL12, t2.COL13
  FROM EKPF t1
  FULL OUTER JOIN ABCD t2 ON t1.EBELN = t2.EBELN

COALESCE here guarantees you will have the join column populated (as per your example).

The output would be:

+----------+-----------+--------+--------+--------+
| EBELN    | BELNR     | EBELP  | COL12  | COL13  |
|----------+-----------+--------+--------+--------|
|    1     |   123     |  001   |   ABC  |   LMN  |
|----------+-----------+--------+--------+--------|
|          |   123     |  002   |   ABC  |   LMN  |
|----------+-----------+--------+--------+--------|
|          |   123     |  003   |   ABC  |  LMN   |
|----------+-----------+--------+--------+--------|
|    2     |           |        |  DEF   |  OPQ   |
|----------+-----------+--------+--------+--------|
|    3     |           |        |  GHI   |  RST   |
+----------+-----------+--------+--------+--------+