How to restart an application without losing the TreeMap kept in memory?


In a Spring Boot application, I keep a TreeMap in memory. I'm doing around 10,000 operations per second, and that rate may increase. I keep the data in memory to improve performance. I want my app to start from the same state when the application is restarted.

I have found a few approaches for this.

  1. Keeping data in Hazelcast. In this case I don't risk losing the data unless Hazelcast dies, but if Hazelcast dies, I can't restore the data. Additionally, I don't think it makes sense to sync that many operations to Hazelcast.

  2. Synchronizing events to database. Here, my risk of data loss is very low. However, I need to execute a query after each operation. This may affect performance. Also, I need to handle exceptions on database update.

  3. Synchronizing data in batches. The only ready-made solution I could find here is MapDB. I'm planning to try it, but I haven't yet. If there is a more reliable, optimized sync solution that uses a database instead of a file, I would prefer to use it.

Any recommendations for solving this problem?


There is 1 solution below


Do you need a Map or a TreeMap? Is the collating sequence relevant for storage, for access, or for neither?
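To make the Map versus TreeMap question concrete, here is a minimal sketch of what the sorted order buys you. The class name `OrderedAccessDemo` and the helper `below` are made up for illustration. If your code never calls anything like `firstKey` or `headMap`, a plain Map is probably enough, and that widens the choice of stores.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class, not part of any library.
class OrderedAccessDemo {
    // Range query: relies on the keys being kept in sorted order.
    static Map<Integer, String> below(TreeMap<Integer, String> map, int limit) {
        return map.headMap(limit); // view of entries with key strictly less than limit
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> sorted = new TreeMap<>();
        sorted.put(30, "c");
        sorted.put(10, "a");
        sorted.put(20, "b");

        System.out.println(sorted.firstKey());  // smallest key
        System.out.println(below(sorted, 25));  // entries with key < 25

        // A HashMap offers the same get/put, but no ordering guarantees.
        Map<Integer, String> unsorted = new HashMap<>(sorted);
        System.out.println(unsorted.get(20));
    }
}
```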

For Hazelcast, the chance of data loss is configurable. You set up a cluster with the level of resilience you want. This is the same as with disks: if you have one disk and it fails, you lose data. If you have two and one goes offline, you don't lose data. You allocate hardware for the level of resilience you need. Three nodes is the recommended minimum.

(10,000 per second isn't worrying either; 1,000,000,000 has been done. Sync to an external store can be immediate or in batches.)
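As a rough illustration of the "in batches" option, here is a plain-Java sketch. It is not Hazelcast's API. The class `WriteBehindMap`, the `batchSize` parameter, and the `sink` callback (standing in for whatever writes to your external store) are all assumptions for the example. Writes go to the in-memory TreeMap immediately; changed keys are collected and handed to the sink once enough have accumulated, or whenever `flush()` is called, for example from a timer or a shutdown hook. Removals are left out to keep it short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

// Hypothetical write-behind wrapper; names are illustrative only.
class WriteBehindMap {
    private final TreeMap<String, Long> data = new TreeMap<>();
    private final Map<String, Long> pending = new HashMap<>(); // latest value per changed key
    private final int batchSize;
    private final Consumer<Map<String, Long>> sink; // assumed: persists one batch

    WriteBehindMap(int batchSize, Consumer<Map<String, Long>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
    }

    synchronized void put(String key, long value) {
        data.put(key, value);      // in-memory state is updated immediately
        pending.put(key, value);   // repeated writes to one key collapse into one entry
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    synchronized Long get(String key) {
        return data.get(key);
    }

    // Hand the pending changes to the sink. If the sink throws, the changes
    // stay pending and are included in the next flush.
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new HashMap<>(pending));
        pending.clear();
    }
}
```

With a batch size of 1 this degenerates to an immediate sync on every write. Larger batches trade a window of unsaved changes for fewer round trips to the store.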

Disclaimer: I work for Hazelcast, but I think your question is more fundamental -- how do you keep your store available?

Simply, don't restart. Clustered solutions are the answer here. If you have multiple nodes, the service as a whole stays running even if a few nodes go offline. Do rolling bounces, restarting one node at a time.

If you must restart everything at once, what matters is how quickly your service can bring all the data back, and what it does when the restore is 50% done (is 50% of the data visible?). Immediate replication to elsewhere is only really necessary if you have a clustered solution that hasn't been configured for resilience. Saving intermittently is fine if you have solved resilience.
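One way to avoid exposing a half-restored map, sketched in plain Java under the assumption that the snapshot can be read as a sequence of entries: build the TreeMap off to the side and publish it in one step. The class `AtomicRestore` and its methods are hypothetical names, and the sketch covers reads only; concurrent writes after the swap would need their own synchronization.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical store; readers see either the old map or the fully restored one.
class AtomicRestore {
    // Starts empty and is replaced in a single step once loading completes.
    private final AtomicReference<TreeMap<String, Long>> live =
            new AtomicReference<>(new TreeMap<>());

    // Load the whole snapshot into a private map, then publish it atomically.
    void restore(Iterable<Map.Entry<String, Long>> snapshot) {
        TreeMap<String, Long> loading = new TreeMap<>();
        for (Map.Entry<String, Long> e : snapshot) {
            loading.put(e.getKey(), e.getValue());
        }
        live.set(loading); // readers switch from the old map to the new one here
    }

    Long get(String key) {
        return live.get().get(key);
    }

    int size() {
        return live.get().size();
    }
}
```

Whether the service should answer from an empty map while loading, or refuse requests until the swap, is a separate decision that this sketch does not make for you.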

So configure your storage so that it doesn't go offline; that makes the options for backup and restore all the easier.