Query array of JSON objects using Athena | Glue


I have data in the form of a JSON array that looks something like this:

[{"name": "abhi", "job": "developer"},{"name": "amal", "job": "captain"},{"name": "nizam", "job": "ca"},{"name": "akshay", "job": "doctor"}]

In Glue, I created a crawler with a custom classifier whose JSON path is defined as $[*]. After crawling, it successfully identified the columns as

  1. name string
  2. job string

But when I query the table using Athena, I do not get the output in the expected format.

Additional details:

  1. The file that the crawler reads resides in Amazon S3.
  2. Query used in Athena: SELECT * FROM "<data_source>"."<database_name>"."<table_name>";
  3. The IAM role used has all the necessary permissions.
  4. Classification: JSON
  5. My data does NOT follow the "single JSON object per line" format.
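
For context on item 5: as far as I understand, Athena's JSON SerDes read each line of a file as one record, which may be why an array kept on a single line is not split into rows. Below is a minimal sketch of rewriting the array as one JSON object per line before uploading it to S3. It assumes the file is small enough to load into memory, and `array_to_json_lines` is a hypothetical helper name, not part of any AWS library.

```python
import json


def array_to_json_lines(text: str) -> str:
    """Convert a JSON array of objects into newline-delimited JSON."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # One object per line, with no enclosing brackets or separating commas.
    return "\n".join(json.dumps(record) for record in records) + "\n"


# Sample data taken from the array shown in the question (first two records).
sample = '[{"name": "abhi", "job": "developer"},{"name": "amal", "job": "captain"}]'
json_lines = array_to_json_lines(sample)
print(json_lines, end="")
```

The converted file would replace the original object in S3, and the table would then be crawled or defined again without the $[*] classifier. Whether that is acceptable depends on whether the source format can be changed.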

I expected the output to be:

name job
abhi developer
amal captain
nizam ca
akshay doctor

The output I received was:

name job
{"name": "abhi", "job": "developer"} {"name": "amal", "job": "captain"}