In my simple Maven application I have three Avro schema files:
ReportDetails.avsc
{
  "type": "record",
  "name": "ReportDetails",
  "namespace": "com.vl.model.avro",
  "fields": [
    {"name": "detailId", "type": "string"},
    {"name": "detailName", "type": "string"}
  ]
}
Employee.avsc
{
  "fields": [
    {"name": "employeeId", "type": "string"},
    {"name": "position", "type": "string"},
    {"name": "department", "type": "int"},
    {"name": "employeeName", "type": "string"}
  ],
  "name": "Employee",
  "namespace": "com.vl.model.avro",
  "type": "record"
}
Report.avsc
{
  "type": "record",
  "name": "Report",
  "namespace": "com.vl.model.avro",
  "fields": [
    {"name": "reportId", "type": "string"},
    {"name": "employee", "type": ["null", "com.vl.model.avro.Employee"], "default": null},
    {"name": "details", "type": {"type": "array", "items": "com.vl.model.avro.ReportDetails"}}
  ]
}
The avro-maven-plugin configuration is:
<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <version>1.11.3</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>schema</goal>
      </goals>
      <configuration>
        <sourceDirectory>${project.basedir}/src/main/resources/avro</sourceDirectory>
        <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
        <enableDecimalLogicalType>true</enableDecimalLogicalType>
        <stringType>String</stringType>
        <fieldVisibility>PRIVATE</fieldVisibility>
        <includes>
          <include>ReportDetails.avsc</include>
          <include>Employee.avsc</include>
          <include>Report.avsc</include>
        </includes>
      </configuration>
    </execution>
  </executions>
</plugin>
The first-stage issue
The build fails with:
[INFO] --- avro:1.11.3:schema (default) @ spring-cloud-stream-kafka-streaming-example ---
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.693 s
[INFO] Finished at: 2024-02-01T14:43:20+02:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.avro:avro-maven-plugin:1.11.3:schema (default) on project spring-cloud-stream-kafka-streaming-example: Execution default of goal org.apache.avro:avro-maven-plugin:1.11.3:schema failed: Undefined name: "com.vl.model.avro.Employee" -> [Help 1]
To make it work, I updated the Report schema (Employee and ReportDetails are unchanged):
{
  "type": "record",
  "name": "Report",
  "namespace": "com.vl.model.avro",
  "fields": [
    {"name": "reportId", "type": "string"},
    {"name": "employee", "type": ["null", {"type": "record", "name": "Employee", "fields": []}], "default": null},
    {"name": "details", "type": {"type": "array", "items": {"type": "record", "name": "ReportDetails", "fields": []}}}
  ]
}
This appears to fix the avro:1.11.3:schema generation issue: the plugin generates exactly the classes I need.
Next step: share the schemas
The Avro schemas have to be registered in a schema registry so that they can be shared between microservices. For this I configured kafka-schema-registry-maven-plugin (7.5.1), which can upload and download schemas.
The plugin configuration is:
<plugin>
  <groupId>io.confluent</groupId>
  <artifactId>kafka-schema-registry-maven-plugin</artifactId>
  <version>7.5.1</version>
  <executions>
    <execution>
      <id>avro-resources</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>download</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <schemaRegistryUrls>
      <param>http://localhost:8081</param>
    </schemaRegistryUrls>
    <outputDirectory>src/main/avro</outputDirectory>
    <subjectPatterns>
      <param>^com.vl.model.*$</param>
    </subjectPatterns>
    <versions>
      <param>latest</param>
    </versions>
    <subjects>
      <com.vl.model.ReportDetails>src/main/resources/avro/ReportDetails.avsc</com.vl.model.ReportDetails>
      <com.vl.model.Employee>src/main/resources/avro/Employee.avsc</com.vl.model.Employee>
      <com.vl.model.Report>src/main/resources/avro/Report.avsc</com.vl.model.Report>
    </subjects>
    <schemaTypes>
      <com.vl.model.ReportDetails>AVRO</com.vl.model.ReportDetails>
      <com.vl.model.Employee>AVRO</com.vl.model.Employee>
      <com.vl.model.Report>AVRO</com.vl.model.Report>
    </schemaTypes>
    <references>
      <com.vl.model.Report>
        <reference>
          <name>details</name>
          <subject>com.vl.model.ReportDetails</subject>
        </reference>
        <reference>
          <name>employee</name>
          <subject>com.vl.model.Employee</subject>
        </reference>
      </com.vl.model.Report>
    </references>
  </configuration>
</plugin>
Subject registration issue
Registering the schemas with mvn schema-registry:register leads to a different issue:
[INFO] --- schema-registry:7.5.1:register (default-cli) @ spring-cloud-stream-kafka-streaming-example ---
[INFO] Registered subject(com.vl.model.Overtime) with id 3 version 1
[INFO] Registered subject(com.vl.model.Absence) with id 4 version 1
[INFO] Registered subject(com.vl.model.Employee) with id 5 version 1
[INFO] Registered subject(com.vl.model.ReportDetails) with id 6 version 1
[INFO] Registered subject(com.vl.model.Attendance) with id 7 version 1
[ERROR] Could not parse Avro schema
org.apache.avro.SchemaParseException: Can't redefine: com.vl.model.avro.Employee
at org.apache.avro.Schema$Names.put (Schema.java:1550)
at org.apache.avro.Schema$Names.add (Schema.java:1544)
at org.apache.avro.Schema.parse (Schema.java:1665)
at org.apache.avro.Schema.parse (Schema.java:1765)
at org.apache.avro.Schema.parse (Schema.java:1678)
at org.apache.avro.Schema$Parser.parse (Schema.java:1433)
at org.apache.avro.Schema$Parser.parse (Schema.java:1421)
at io.confluent.kafka.schemaregistry.avro.AvroSchema.<init> (AvroSchema.java:120)
at io.confluent.kafka.schemaregistry.avro.AvroSchemaProvider.parseSchemaOrElseThrow (AvroSchemaProvider.java:54)
at io.confluent.kafka.schemaregistry.SchemaProvider.parseSchema (SchemaProvider.java:114)
at io.confluent.kafka.schemaregistry.SchemaProvider.parseSchema (SchemaProvider.java:123)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.parseSchema (CachedSchemaRegistryClient.java:286)
at io.confluent.kafka.schemaregistry.client.SchemaRegistryClient.parseSchema (SchemaRegistryClient.java:61)
at io.confluent.kafka.schemaregistry.maven.UploadSchemaRegistryMojo.processSubject (UploadSchemaRegistryMojo.java:120)
at io.confluent.kafka.schemaregistry.maven.UploadSchemaRegistryMojo.execute (UploadSchemaRegistryMojo.java:92)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:126)
at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute2 (MojoExecutor.java:328)
at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:316)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:212)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:174)
at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:75)
at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:162)
at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:159)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:73)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:53)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:118)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:261)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:173)
at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:101)
at org.apache.maven.cli.MavenCli.execute (MavenCli.java:906)
at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:283)
at org.apache.maven.cli.MavenCli.main (MavenCli.java:206)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:77)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:568)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:283)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:226)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:407)
at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:348)
[ERROR] Schema for com.vl.model.Report could not be parsed.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
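This happens because a record with the same name ("type": "record", "name": "Employee" or "ReportDetails") cannot be defined twice: once in the referenced schema and again inline in Report.avsc.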
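One scenario that may be worth checking here: as far as I understand the Confluent documentation, for Avro the <name> of a reference is expected to be the fully qualified name of the referenced type, not the name of the field that uses it. Under that assumption, the original Report.avsc (the version that refers to com.vl.model.avro.Employee and com.vl.model.avro.ReportDetails by name, with no inline record definitions) might register unchanged with a references section like the sketch below. It has not been verified against this project.

```xml
<references>
  <com.vl.model.Report>
    <reference>
      <!-- Assumption: name is the fully qualified Avro type name, not the field name. -->
      <name>com.vl.model.avro.ReportDetails</name>
      <subject>com.vl.model.ReportDetails</subject>
    </reference>
    <reference>
      <name>com.vl.model.avro.Employee</name>
      <subject>com.vl.model.Employee</subject>
    </reference>
  </com.vl.model.Report>
</references>
```

The subjects and schemaTypes sections would stay as they are in the configuration above.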
Subject registration solution
To resolve this issue, I updated the Report schema:
{
  "type": "record",
  "name": "Report",
  "namespace": "com.vl.model.avro",
  "fields": [
    {"name": "reportId", "type": "string"},
    {"name": "employee", "type": {"type": "Employee", "java-class": "com.vl.model.avro.Employee"}},
    {"name": "details", "type": {"type": "array", "items": "com.vl.model.avro.ReportDetails"}}
  ]
}
The schema is now registered in the schema registry.
A different part is broken
The Avro schemas are published to the schema registry, but Java classes can be generated neither from the local schemas nor from the ones downloaded from the registry.
The files pulled from the schema registry differ slightly from the local ones. The error for the remote (downloaded) schema:
[INFO] --- avro:1.11.3:schema (default) @ spring-cloud-stream-kafka-streaming-example ---
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.753 s
[INFO] Finished at: 2024-02-01T16:59:40+02:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.avro:avro-maven-plugin:1.11.3:schema (default) on project spring-cloud-stream-kafka-streaming-example: Execution default of goal org.apache.avro:avro-maven-plugin:1.11.3:schema failed: "Employee" is not a defined name. The type of the "employee" field must be a defined name or a {"type": ...} expression. -> [Help 1]
For the local schema the error is: Execution default of goal org.apache.avro:avro-maven-plugin:1.11.3:schema failed: Type not supported: Employee
The Report.avsc downloaded from the schema registry (formatted for readability) looks like this:
{
"type": "record",
"name": "Report",
"namespace": "com.vl.model.avro",
"fields": [
{
"name": "reportId",
"type": "string"
},
{
"name": "employee",
"type": "Employee"
},
{
"name": "details",
"type": {
"type": "array",
"items": "ReportDetails"
}
}
]
}
(The com.vl.model.avro namespace is missing before type names such as Employee and ReportDetails.)
Questions
As you can see, resolving one issue produces another. I would be happy to get:
- A solution for any of the described issues that has no side effects.
- A suggestion for different Maven plugins that would cover my needs.
- A scenario that I have not checked yet.
P.S.
I did not remove the references between the .avsc files because I am sure that would cause serialization issues during Kafka communication.
P.S.2. The solution should generate the Java data classes as part of the Maven build. The Java classes will be used for publishing Kafka messages.
It seems that the schema pulled from the schema registry is correct (which is logical; otherwise nobody would use it in their projects), so the issue is probably in the avro-maven-plugin configuration. Checking the plugin API, I realised that something important was missing from my configuration: the imports section.
P.S. To fix it for the pulled schemas, the imports section (and also includes) should list the files from the folder the schemas were pulled into.
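The imports-based fix might look like the sketch below. It assumes the original Report.avsc (the version that refers to com.vl.model.avro.Employee and com.vl.model.avro.ReportDetails by name) and the directory layout used earlier in the question. Only the imports block is new compared with the earlier avro-maven-plugin configuration. Files listed under imports are compiled first, so the names they define are already known when Report.avsc is parsed.

```xml
<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <version>1.11.3</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>schema</goal>
      </goals>
      <configuration>
        <sourceDirectory>${project.basedir}/src/main/resources/avro</sourceDirectory>
        <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
        <enableDecimalLogicalType>true</enableDecimalLogicalType>
        <stringType>String</stringType>
        <fieldVisibility>PRIVATE</fieldVisibility>
        <!-- Compiled first, so Employee and ReportDetails are defined before Report.avsc is read. -->
        <imports>
          <import>${project.basedir}/src/main/resources/avro/ReportDetails.avsc</import>
          <import>${project.basedir}/src/main/resources/avro/Employee.avsc</import>
        </imports>
        <includes>
          <include>ReportDetails.avsc</include>
          <include>Employee.avsc</include>
          <include>Report.avsc</include>
        </includes>
      </configuration>
    </execution>
  </executions>
</plugin>
```

For schemas downloaded from the registry, sourceDirectory and the import paths would presumably point at the download folder (src/main/avro in the registry plugin configuration) instead of src/main/resources/avro.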