We changed a field to allow null, and now previously valid JSON no longer works: decoding fails with an AvroTypeException: Unknown union branch.
Here are the previous (working) Avro schema and the JSON used for the test. myobject.avsc:
{
"namespace":"my.model.kafka.test",
"type":"record",
"name":"MyObject",
"fields":[
{
"name":"First_Level",
"type":[
"null",
{
"type":"record",
"name":"FirstLevel",
"fields":[
{
"name":"TheTimestamp",
"doc":"Timestamp",
"type":{
"type":"long",
"logicalType":"timestamp-micros"
}
},
{
"name":"CategoryCode",
"type":{
"type":"enum",
"name":"Code",
"symbols":[
"A",
"B"
]
}
},
{
"name":"SecondLevel",
"type":{
"type":"record",
"name":"SecondLevel",
"fields":[
{
"name":"ThirdLevel",
"type":{
"type":"array",
"items":[
{
"type":"record",
"name":"ThirdLevel",
"fields":[
{
"name":"LocationCode",
"type":"string"
},
{
"name":"SomeCode",
"type":"string"
},
{
"name":"Cost",
"type":"int"
}
]
}
]
}
}
]
}
},
{
"name":"UID",
"type":[
"null",
"string"
],
"default":null
}
]
}
],
"default":null
}
]
}
Here is the JSON used for the test:
{
"First_Level" : {
"my.model.kafka.test.FirstLevel" : {
"TheTimestamp" : 1648808100000000,
"CategoryCode" : "A",
"SecondLevel" : {
"ThirdLevel" : [ {
"my.model.kafka.test.ThirdLevel" : {
"LocationCode" : "BBB",
"SomeCode" : "AAA",
"Cost" : 2
}
}, {
"my.model.kafka.test.ThirdLevel" : {
"LocationCode" : "CCC",
"SomeCode" : "BBB",
"Cost" : 2
}
} ]
},
"UID" : "123-9jh789-opi8p83h3"
}
}
}
Modification to allow null
Up to here everything works fine, but if we make SecondLevel nullable by changing the avsc file to the following, we get the AvroTypeException: Unknown union branch:
{
"namespace":"my.model.kafka.test",
"type":"record",
"name":"MyObject",
"fields":[
{
"name":"First_Level",
"type":[
"null",
{
"type":"record",
"name":"FirstLevel",
"fields":[
{
"name":"TheTimestamp",
"doc":"Timestamp",
"type":{
"type":"long",
"logicalType":"timestamp-micros"
}
},
{
"name":"CategoryCode",
"type":{
"type":"enum",
"name":"Code",
"symbols":[
"A",
"B"
]
}
},
{
"name":"SecondLevel",
"type":[
"null",
{
"type":"record",
"name":"SecondLevel",
"fields":[
{
"name":"ThirdLevel",
"type":{
"type":"array",
"items":[
{
"type":"record",
"name":"ThirdLevel",
"fields":[
{
"name":"LocationCode",
"type":"string"
},
{
"name":"SomeCode",
"type":"string"
},
{
"name":"Cost",
"type":"int"
}
]
}
]
}
}
],
"default":null
}
]
},
{
"name":"UID",
"type":[
"null",
"string"
],
"default":null
}
]
}
],
"default":null
}
]
}
This gives:
org.apache.avro.AvroTypeException: Unknown union branch ThirdLevel
Even if I change the JSON to include the namespace before ThirdLevel, as suggested in another Stack Overflow answer, I get the same error:
org.apache.avro.AvroTypeException: Unknown union branch my.model.kafka.test.ThirdLevel
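Both messages are consistent with how Avro's JSON encoding represents unions: a non-null union value is written as an object with a single key naming the branch. Once SecondLevel is a union, the first key found under it ("ThirdLevel" in the old JSON) is read as a branch name. The Python sketch below only mimics that lookup to illustrate the failure; it is not Avro's implementation, and `pick_union_branch` and `UnknownUnionBranch` are hypothetical names.

```python
class UnknownUnionBranch(Exception):
    """Stand-in for Avro's AvroTypeException in this sketch."""


def pick_union_branch(value, branch_names):
    """Return (branch_name, inner_value) for a JSON-encoded union value."""
    if value is None:
        if "null" in branch_names:
            return "null", None
        raise UnknownUnionBranch("Unknown union branch null")
    # A non-null union value must be an object with exactly one key,
    # and that key is interpreted as the name of the union branch.
    if not isinstance(value, dict) or len(value) != 1:
        raise UnknownUnionBranch("Expected an object with a single branch key")
    label = next(iter(value))
    if label not in branch_names:
        raise UnknownUnionBranch(f"Unknown union branch {label}")
    return label, value[label]


# Branches of the SecondLevel union in the modified schema.
second_level_union = ["null", "my.model.kafka.test.SecondLevel"]

# Old shape: the record's fields sit directly under "SecondLevel".
old_value = {"ThirdLevel": []}
# New shape: the record is wrapped in its full branch name.
new_value = {"my.model.kafka.test.SecondLevel": {"ThirdLevel": []}}

try:
    pick_union_branch(old_value, second_level_union)
except UnknownUnionBranch as err:
    old_error = str(err)

new_branch, _ = pick_union_branch(new_value, second_level_union)
```

Under this reading, prefixing ThirdLevel with the namespace cannot help, because the key is still not one of the SecondLevel union's branches.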
My question is twofold:
How can I modify the avsc so that the old JSON still works and new JSON, in which SecondLevel may be null, works too? We need to get this working, but ultimately we also need to stay backward compatible, so changing names or the JSON should be avoided.
EDIT:
After running the edited avsc against the Kafka data directly, both the old and the new messages worked perfectly fine. We have a process that saves the messages to JSON files, and the JSON from that process was what had the problem. Since backward compatibility was needed only for the Kafka consumer, these changes are actually fine.
For those who wonder, here is how the JSON should look after adding the null type to SecondLevel:
{
"First_Level":{
"my.model.kafka.test.FirstLevel":{
"TheTimestamp":1648808100000000,
"CategoryCode":"A",
"SecondLevel":{
"my.model.kafka.test.SecondLevel":{
"ThirdLevel":[
{
"my.model.kafka.test.ThirdLevel":{
"LocationCode":"BBB",
"SomeCode":"AAA",
"Cost":2
}
},
{
"my.model.kafka.test.ThirdLevel":{
"LocationCode":"CCC",
"SomeCode":"BBB",
"Cost":2
}
}
]
}
},
"UID":"123-9jh789-opi8p83h3"
}
}
}
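For JSON files already saved in the old shape, one option is to rewrite them into the shape above before decoding them against the new schema. The Python sketch below does this with the standard library only. `wrap_second_level` is a hypothetical helper, and the branch names are assumed to be the namespace plus record name from the schema in the question.

```python
import copy
import json

FIRST_LEVEL_BRANCH = "my.model.kafka.test.FirstLevel"
SECOND_LEVEL_BRANCH = "my.model.kafka.test.SecondLevel"


def wrap_second_level(message):
    """Return a copy of `message` with SecondLevel in union form.

    Messages that are already wrapped, or whose First_Level or SecondLevel
    is null or missing, are returned unchanged.
    """
    result = copy.deepcopy(message)
    first_level = result.get("First_Level")
    if first_level is None:
        return result
    record = first_level[FIRST_LEVEL_BRANCH]
    second_level = record.get("SecondLevel")
    if second_level is None or SECOND_LEVEL_BRANCH in second_level:
        return result
    # Wrap the record in an object keyed by the union branch name.
    record["SecondLevel"] = {SECOND_LEVEL_BRANCH: second_level}
    return result


# A shortened version of the old JSON from the question.
old_json = """
{
  "First_Level": {
    "my.model.kafka.test.FirstLevel": {
      "TheTimestamp": 1648808100000000,
      "CategoryCode": "A",
      "SecondLevel": {
        "ThirdLevel": [
          {"my.model.kafka.test.ThirdLevel":
            {"LocationCode": "BBB", "SomeCode": "AAA", "Cost": 2}}
        ]
      },
      "UID": "123-9jh789-opi8p83h3"
    }
  }
}
"""

converted = wrap_second_level(json.loads(old_json))
print(json.dumps(converted, indent=2))
```

The helper leaves already-converted messages alone, so it can be run over a directory that mixes old and new files.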