Alternative to DBs for storing audit logs?


I'm currently developing a system for generating verification codes. The system authenticates a user, sends them a code, and stores it locally; at some later point it receives the code back from the user, after which the code is spent and deleted from the system.
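For context, the lifecycle described above looks roughly like this (a minimal sketch; the class and method names are hypothetical, and a real implementation would emit an audit event on every outcome instead of just returning it):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the code lifecycle: issue a code with a 30-second
// TTL, then spend it exactly once. Every redeem outcome (CONSUMED, EXPIRED,
// INVALID) is what would need to be written to the audit log.
class VerificationCodes {
    private static final Duration TTL = Duration.ofSeconds(30);
    private final Map<String, Instant> issued = new ConcurrentHashMap<>();

    void issue(String code) {
        issued.put(code, Instant.now().plus(TTL));
    }

    // Spends the code: it is deleted on first use, whatever the outcome.
    String redeem(String code) {
        Instant expiry = issued.remove(code);
        if (expiry == null) {
            return "INVALID";   // unknown or already-spent code -> audit log
        }
        if (Instant.now().isAfter(expiry)) {
            return "EXPIRED";   // code outlived its 30s window -> audit log
        }
        return "CONSUMED";      // happy path -> audit log
    }
}
```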

The original proposal was to build this system on an SQL database, but this use case just screams Kafka: since the codes are very short-lived (they expire in 30 seconds, which means they must be spent almost immediately), it seems natural to have producers and consumers that create and consume events holding these codes.

However, the issue is that all events must be audit-logged: upon code consumption or expiration, something must be written to the audit log, and any attempt to use an invalid code must be logged as well. It seems wasteful to run a fully fledged database just to hold a single AUDIT_LOG table.

Since we're using Spring Boot, we'd like a plug-and-play solution that is easy to interface with. I've seen suggestions to use Kafka itself to store the audit logs, but we need long-term persistence, and Kafka doesn't seem ideal for that. Using a NoSQL DB was also suggested, and I've seen some promising append-only solutions.

Do you folks have anything to recommend here?


2 Answers

BEST ANSWER

original proposal was to create this system using an SQL database

Use both? Kafka Connect can write from Kafka topics to JDBC databases. This way, you don't need to write JDBC clients in your Java code, and non-Java clients can also send data to the same Kafka topic, since the protocol is generic TCP rather than specific to your database.
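As a sketch, a JDBC sink for an audit-log topic could be configured along these lines (the connector name, topic, and connection details are placeholders; `io.confluent.connect.jdbc.JdbcSinkConnector` is Confluent's JDBC sink connector):

```json
{
  "name": "audit-log-jdbc-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "audit-log",
    "connection.url": "jdbc:postgresql://localhost:5432/audit",
    "connection.user": "audit_writer",
    "auto.create": "true",
    "insert.mode": "insert"
  }
}
```

With `insert.mode` set to `insert` and no keys configured for updates, the target table behaves as append-only, which matches an audit log.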

If that SQL database doesn't work out, you can still consume the same Kafka topic into a different system (Elasticsearch works great for log analysis, as do HDFS or S3 for larger datasets). Since you mention "offline" systems, MinIO is a self-hosted, S3-compatible option.

ANSWER 2

According to its home page, Kafka is for

data pipelines, streaming analytics, data integration, and mission-critical applications

But you stated that an important piece of functionality in your application is persistent storage of structured data. So it's highly doubtful whether Kafka is a good long-term fit for this purpose, given that Kafka itself doesn't advertise it.

Also keep in mind that there is probably more to your application than just these use cases. Maybe at some point reporting functionality is required. Then you are probably far better off leveraging a reporting product that many people already know or have worked with than reinventing all of that on top of Kafka (even though streaming analytics is a Kafka use case, it is far less widely known than, say, an SQL reporting tool).