I am working with Apache NiFi and I need to parse and transform incoming binary data. Each message is 20 bytes long and follows the structure defined by the format string `<BBHBBHHHHHHh` (in the notation of Python's struct module). The data is a sequence of fields of different types (unsigned char, unsigned short, signed short), and I need to extract these values for further processing in my NiFi flow. In Python it was very easy to consume the messages from MQTT and use struct to convert them to JSON, but how do I do that in NiFi? I managed to consume from MQTT but could not decode the payload. I will be receiving many of these 20-byte packets per second in the future, and I want the solution to be stable and performant.
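For reference, here is a minimal Python sketch of the decoding described above. The field names are placeholders, since the actual meaning of each field is not important here, and `decode_message` and `decode_stream` are just illustrative helper names. The second helper assumes that several 20-byte messages may arrive concatenated in one payload, which may or may not match the real MQTT traffic.

```python
import json
import struct

# Little-endian, no padding: 2 x uint8, uint16, 2 x uint8, 6 x uint16, int16 = 20 bytes
RECORD = struct.Struct("<BBHBBHHHHHHh")

# Placeholder names (assumption): replace with the real field names.
FIELD_NAMES = [f"field_{i}" for i in range(12)]


def decode_message(payload: bytes) -> dict:
    """Decode exactly one 20-byte message into a dict."""
    if len(payload) != RECORD.size:
        raise ValueError(f"expected {RECORD.size} bytes, got {len(payload)}")
    return dict(zip(FIELD_NAMES, RECORD.unpack(payload)))


def decode_stream(buffer: bytes) -> list:
    """Decode a buffer holding several back-to-back messages.

    Trailing bytes that do not form a complete message are ignored.
    """
    usable = len(buffer) - len(buffer) % RECORD.size
    return [
        dict(zip(FIELD_NAMES, values))
        for values in RECORD.iter_unpack(buffer[:usable])
    ]


if __name__ == "__main__":
    sample = RECORD.pack(1, 2, 300, 4, 5, 6, 7, 8, 9, 10, 11, -12)
    print(json.dumps(decode_message(sample)))
```

What I am after is the NiFi equivalent of this: fixed-size binary records in, one JSON object (or record) per message out.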
I understand that NiFi primarily handles text-based or more general data structures like JSON or XML, and I am looking for the best approach to handle this binary data parsing within NiFi without relying on external scripts or tools if possible.
Here is what I have considered or tried so far:
- Record-based processors like `ConvertRecord`, but I am not sure how to configure a record reader for binary data.
- Scripted processors like `ExecuteScript` or `InvokeScriptedProcessor`, but I am concerned about performance and the complexity of handling binary data structures in Jython or Groovy.
- External tools or scripts, which I'd prefer to avoid to keep the processing within NiFi's managed environment.

I am also looking for best practices or patterns for integrating this kind of binary data parsing into a NiFi data flow.
Any suggestions or insights from your experience would be greatly appreciated! If NiFi is not the right tool, I am open to other tools. I looked into Flink, but it seemed too complex, and I like NiFi's flow-based approach. I also looked at RedisGears, but its API does not look stable yet.