Explode array of maps with values in HIVE or Impala

75 Views Asked by At

I have a data table with two columns ID and Details that looks like: enter image description here

Here is the code to create the above dataset:

select stack(4,
             115,array(map('AA_Desc', 'Sale', 'AA_Date', '202403', 'AA_Block', '0')),
             214,array(map('AA_Desc', 'Ret', 'AA_Date', '202402', 'AA_Block', '1')),
             54,array(map('AA_Desc', 'Sale', 'AA_Date', '202312', 'AA_Block', '0')),
             31, array(map('AA_Desc', 'Can', 'AA_Date', '202312'), map('AA_Desc', 'Ret', 'AA_Date', '202312','AA_Block', '0'))
            ) as (id, arr_col)

The data type for column 'details' is array<map<string,string>> and I want to extract certain values from the details field for every ID. Ideally an output in the below format would be great either in HIVE or Impala. I have not worked with this data type structure before so any help would be appreciated. Thanks

Output:

enter image description here

1

There are 1 best solutions below

0
summer7 On

USING IMPALA

From the 'details' values you provided, it might be easier to try using a data type like array<struct<map<string,string>,map<string,string>,map<string,string>>>

To create this, you could try something like

create TABLE AAtable
(
  id BIGINT,

  details ARRAY < STRUCT <
    AADesc: MAP <STRING, STRING>,
    AADate: MAP <STRING, STRING>,
    AABlock: MAP <STRING, STRING> > >
)
STORED AS PARQUET;

And then to extract what you want, use a query like

select tb.id, dt.aadesc.value, dt.aadate.value, dt.aablock.value 
    from AAtable tb INNER JOIN tb.details dt