I am doing performance testing of Kafka and need to test different large schemas. At the moment, I am working on Avro-based load testing.
Usually, when working with Kafka, you already have data and derive a schema from it. In this scenario I must test several schemas for which I have no data, so I need to generate sample Avro data from the existing schemas.
What are the possible solutions?
Tried solutions:
- I tried creating data manually, but it is too tedious.
- I searched for automatic generators, with no luck.
- I searched the official Kafka documentation for a possible solution.
- I tried writing my own generator, but I have little experience with Avro, so the result seems too custom and unmaintainable to continue.
How can I generate sample data based on an existing Avro schema?
If you are comfortable with Python, the fastavro library has utilities to generate data from the schema: https://fastavro.readthedocs.io/en/latest/utils.html

As an example: