How can I turn this list of JSON objects into a Spark dataframe?
[
{
'1': 'A',
'2': 'B'
},
{
'1': 'A',
'3': 'C'
}
]
into
1 2 3
A B null
A null C
I've tried spark.read.json(spark.sparkContext.parallelize(d)) and various combinations of that with json.dumps(d).
You can use spark.createDataFrame(d) to get the desired effect.
You do get a deprecation warning about inferring schema from dictionaries, so the "right" way to do this is to first create the rows:
then create the DataFrame:
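A minimal sketch of both steps, assuming `d` is the list from the question and `spark` is an active SparkSession. The helper names `normalize` and `to_dataframe` are made up for illustration. Filling missing keys with None before building the rows is my own assumption, so that every Row has the same fields; newer Spark versions appear to match Row values to columns by position, so ragged rows may not line up otherwise.

```python
def normalize(records):
    """Give every dict the same keys, filling gaps with None."""
    # Collect keys in first-seen order so the column order is predictable.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    filled = [{c: record.get(c) for c in columns} for record in records]
    return columns, filled


def to_dataframe(spark, records):
    """Hypothetical helper: build a DataFrame from ragged dicts via Row."""
    from pyspark.sql import Row  # imported lazily; requires pyspark

    _, filled = normalize(records)
    rows = [Row(**record) for record in filled]  # first create the rows
    return spark.createDataFrame(rows)           # then create the DataFrame


d = [{'1': 'A', '2': 'B'}, {'1': 'A', '3': 'C'}]
columns, filled = normalize(d)
```

With a live session, `to_dataframe(spark, d).show()` should give the three-column table from the question, with null where a key was missing.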