I am new to Avro schemas. I created the following schema based on a reference JSON document, but I am not able to create a serializer for it.
{
  "name": "Name",
  "type": "record",
  "namespace": "NameSpace",
  "fields": [
    {
      "name": "discussions",
      "comment": "discussion ID.",
      "type": {
        "type": "array",
        "items": {
          "name": "discussionsRecord",
          "comment": "discussion Identifier.",
          "type": "record",
          "fields": [
            {
              "name": "discussionId",
              "type": "long"
            },
            {
              "name": "channelType",
              "comment": "channel Type Identification.",
              "type": "int"
            },
            {
              "name": "data",
              "comment": "The following block is to capture channel values.",
              "type": {
                "type": "array",
                "items": [
                  {
                    "name": "dataRecord",
                    "type": "record",
                    "fields": [
                      {
                        "name": "pulse",
                        "comment": "Pulse.",
                        "type": "long"
                      },
                      {
                        "name": "communicationName",
                        "comment": "communication Identification.",
                        "type": {
                          "name": "communicationNameEnumType",
                          "comment": "enum for communication Names.",
                          "type": "enum",
                          "symbols": [
                            "cold", "rainIntensity", "heat"
                          ]
                        }
                      },
                      {
                        "name": "communicationValue",
                        "comment": "communication Values.",
                        "type": "double"
                      },
                      {
                        "name": "classValue",
                        "comment": "communication class.",
                        "type": {
                          "name": "classValueEnumType",
                          "comment": "enum for Class types.",
                          "type": "enum",
                          "symbols": [
                            "Dark", "Logical"
                          ]
                        }
                      }
                    ]
                  }
                ]
              }
            }
          ]
        }
      }
    }
  ]
}
If you have an AVSC schema, you can create a Spark SQL schema from it in Scala.
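The original snippet did not survive in this post, so what follows is only a minimal sketch of one way to do the conversion, not necessarily the code the answer had in mind. It assumes the spark-avro module is on the classpath (for example in spark-shell started with the spark-avro package) and uses a hypothetical, trimmed-down version of the schema as an inline string. In practice you would load the full .avsc text instead.

```scala
import org.apache.avro.Schema
import org.apache.spark.sql.avro.SchemaConverters
import org.apache.spark.sql.types.StructType

// Hypothetical, trimmed-down AVSC used only for illustration;
// replace it with the full text of your .avsc file.
val avsc: String =
  """{
    |  "type": "record",
    |  "name": "Name",
    |  "namespace": "NameSpace",
    |  "fields": [
    |    {"name": "discussionId", "type": "long"},
    |    {"name": "channelType", "type": "int"}
    |  ]
    |}""".stripMargin

// Parse the JSON text into an Avro Schema object.
val avroSchema: Schema = new Schema.Parser().parse(avsc)

// toSqlType returns a SchemaType(dataType, nullable);
// a top-level Avro record maps to a Spark StructType.
val sparkSchema: StructType =
  SchemaConverters.toSqlType(avroSchema).dataType.asInstanceOf[StructType]

// Print the resulting Spark SQL schema as a tree for inspection.
sparkSchema.printTreeString()
```

If the parser rejects the schema, the exception message usually points at the offending attribute, which can be a quick way to check a hand-written AVSC before building a serializer around it.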
Otherwise, to_avro() serializes an existing DataFrame, using its own schema, to Avro output.
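As a rough illustration of the to_avro() route, the sketch below packs every column of a DataFrame into a single struct and encodes it as Avro binary. It assumes Spark 3.x with the spark-avro module available, where to_avro lives in org.apache.spark.sql.avro.functions. The sample rows, column names, and application name are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.avro.functions.to_avro
import org.apache.spark.sql.functions.{col, struct}

// Local session for illustration only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .appName("to-avro-sketch") // hypothetical application name
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample rows standing in for real discussion data.
val df = Seq((1L, 2), (3L, 4)).toDF("discussionId", "channelType")

// Wrap all columns in one struct and serialize it to an Avro-encoded binary column.
val avroDf = df.select(to_avro(struct(df.columns.map(col): _*)).as("value"))

// The resulting DataFrame has a single binary column named "value".
avroDf.printSchema()
```

With this approach the Avro writer schema is derived from the DataFrame's own Spark schema, so no separate AVSC file is needed.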