I have an Azure Event Hub, which offers a Kafka-compliant interface, with protobuf-encoded events on it. I'd like to find an efficient way to continuously react to those events and write them to Delta.
I can use Databricks for this, but it's too costly for such a simple operation; I don't need big data tooling.
I've also looked at Azure Stream Analytics, but it still has a relatively high cost for such a simple operation.
I found this "highly efficient daemon" called Kafka Delta Ingest, which would be perfect, but it only works with Avro or JSON.
How can I write to Delta without using costly big data tooling?
If you're receiving Protobuf-encoded messages (events), you have the option of re-encoding them as JSON, which Kafka Delta Ingest can then write to Delta for you. The pattern would be (see the sketch after this list):

1. Consume the Protobuf messages from the Event Hub's Kafka endpoint.
2. Decode each message with your generated Protobuf class and re-format it as JSON.
3. Produce the JSON to a topic that Kafka Delta Ingest is configured to consume and write to Delta.
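A minimal sketch of that bridge in Java, assuming a generated Protobuf class MyEvent, an Event Hubs namespace reachable at my-namespace.servicebus.windows.net:9093, a source topic events-proto and a target topic events-json; all of those names, and the connection string, are placeholders:

```java
import com.google.protobuf.util.JsonFormat;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ProtoToJsonBridge {

    public static void main(String[] args) throws Exception {
        // Event Hubs' Kafka endpoint uses SASL_SSL / PLAIN with "$ConnectionString"
        // as the username and the namespace connection string as the password.
        Properties props = new Properties();
        props.put("bootstrap.servers", "my-namespace.servicebus.windows.net:9093");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
          + "username=\"$ConnectionString\" password=\"<connection-string>\";");
        props.put("group.id", "proto-to-json-bridge");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");

        JsonFormat.Printer printer = JsonFormat.printer().omittingInsignificantWhitespace();

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
             KafkaProducer<byte[], String> producer = new KafkaProducer<>(props)) {
            consumer.subscribe(List.of("events-proto"));  // source: Protobuf-encoded events
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    // MyEvent is the (hypothetical) class generated from your .proto file.
                    MyEvent event = MyEvent.parseFrom(record.value());
                    String json = printer.print(event);
                    // Kafka Delta Ingest tails this topic and appends the JSON rows to Delta.
                    producer.send(new ProducerRecord<>("events-json", record.key(), json));
                }
                consumer.commitSync();
            }
        }
    }
}
```

Kafka Delta Ingest itself would then run as a separate process, pointed at events-json and the target Delta table.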
The way in which you format the object as JSON might vary by language. It was certainly added to the C++ generated code as far back as version 3.11.2. Java has com.google.protobuf.util.JsonFormat, C# gets JsonFormatter / JsonParser, and Go gets protojson.

I don't know if the JSON format is stable / standardised, i.e. whether the JSON output by the C# JSON formatter will comply with the C++ JSON parser. I would hope that it is, but you may want to check. I'm slightly wary because of the use of terms like "format" instead of "serialise", as if it's intended for pretty output rather than a formal contract between sender and receiver.
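For what it's worth, a round trip through Java's JsonFormat looks roughly like this (MyEvent again standing in for your generated message class); running the JSON produced by your other language through a parser like this is an easy way to check that compatibility question:

```java
import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.util.JsonFormat;

public class JsonRoundTrip {

    // Protobuf message -> JSON string.
    static String toJson(MyEvent event) throws InvalidProtocolBufferException {
        return JsonFormat.printer()
            .omittingInsignificantWhitespace()  // compact output, one row per record
            .print(event);
    }

    // JSON string -> Protobuf message (useful for verifying the round trip).
    static MyEvent fromJson(String json) throws InvalidProtocolBufferException {
        MyEvent.Builder builder = MyEvent.newBuilder();
        JsonFormat.parser()
            .ignoringUnknownFields()            // tolerate fields added by newer writers
            .merge(json, builder);
        return builder.build();
    }
}
```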