Handle very large throughput with external database

I'm looking to build a Java 11 Spring Boot application. The application has to handle very large throughput (traffic will have peaks and lows).

The happy path of the application looks like this.

[happy path diagram]

Conceptually it's fairly straightforward. The steps roughly look like this (a minimal sketch follows the list):

  • Accept an incoming POST request with a DTO at a save endpoint.
  • The application then validates the DTO and returns a relevant error message if it is invalid.
  • Convert the DTO to a database entity object.
  • Save the entity to a Postgres database.
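
For concreteness, a minimal sketch of that happy path might look something like the controller below. SomeDto, SomeEntity, SomeRepository and the fromDto conversion are hypothetical names used only for illustration:

    import javax.validation.Valid;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.RequestBody;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class SaveController {

        private final SomeRepository repository;

        public SaveController(SomeRepository repository) {
            this.repository = repository;
        }

        @PostMapping("/save")
        public ResponseEntity<Void> save(@Valid @RequestBody SomeDto dto) {
            // Bean Validation rejects an invalid DTO before this method runs,
            // so an invalid payload comes back as a 400 with the relevant errors.
            SomeEntity entity = SomeEntity.fromDto(dto); // convert DTO -> entity
            repository.save(entity);                     // one Postgres write per request
            return ResponseEntity.ok().build();
        }
    }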

The potential issue with this approach is that it does a database save per request, which adds up to a lot of individual saves. The database connection pool can quickly be exhausted as more and more connections are opened.

My alternative approach looks like this

[internal queue diagram]

I'm looking to return a status 200 once the incoming DTO passes validation and has been queued in an in-memory queue.
There is no external blocking here, and should the database go down, the internal queue will provide some redundancy.
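
A rough sketch of that alternative, under a few assumptions: @EnableScheduling is turned on, a plain LinkedList is swapped for a thread-safe ConcurrentLinkedQueue (LinkedList is not safe under concurrent requests), and SomeDto/SomeEntity/SomeRepository are the same hypothetical types as above:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import javax.validation.Valid;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.http.ResponseEntity;
    import org.springframework.scheduling.annotation.EnableScheduling;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Component;
    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.RequestBody;
    import org.springframework.web.bind.annotation.RestController;

    @Configuration
    @EnableScheduling
    class QueueConfig {
        @Bean
        Queue<SomeDto> saveQueue() {
            return new ConcurrentLinkedQueue<>();   // thread-safe, unbounded in-memory queue
        }
    }

    @RestController
    class QueueingSaveController {

        private final Queue<SomeDto> saveQueue;

        QueueingSaveController(Queue<SomeDto> saveQueue) {
            this.saveQueue = saveQueue;
        }

        @PostMapping("/save")
        public ResponseEntity<Void> save(@Valid @RequestBody SomeDto dto) {
            saveQueue.offer(dto);               // no database call on the request thread
            return ResponseEntity.ok().build(); // 200 as soon as validation + enqueue succeed
        }
    }

    @Component
    class QueueDrainWorker {

        private final Queue<SomeDto> saveQueue;
        private final SomeRepository repository;

        QueueDrainWorker(Queue<SomeDto> saveQueue, SomeRepository repository) {
            this.saveQueue = saveQueue;
            this.repository = repository;
        }

        @Scheduled(fixedDelay = 500)            // drain roughly every 500 ms on one worker thread
        public void drain() {
            List<SomeEntity> batch = new ArrayList<>();
            SomeDto dto;
            while (batch.size() < 1000 && (dto = saveQueue.poll()) != null) {
                batch.add(SomeEntity.fromDto(dto));
            }
            if (!batch.isEmpty()) {
                repository.saveAll(batch);      // one batched write instead of many single saves
            }
        }
    }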

So some questions / ideas

  • Does this look like a good approach, and are there any pitfalls I should look out for?
  • Maybe you have solved a similar issue in a better / different way?
  • Could reactive streams help in any way?
  • Which built-in Java classes should I use for the internal queue? My thinking was to go with Java's LinkedList (Queue<SomeDto> myQ = new LinkedList<SomeDto>();) for queueing internally.

There are 3 answers below.

Fouad HAMDI (Best Answer)

What happens if the app fails with data still in the internal queue? Or if save operations overflow the available memory?

If you want to build something more robust, you may consider an event-log solution (based on Kafka for example) with consumers populating the database (Kafka would replace your internal queue).
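
For illustration, a sketch of that idea with Spring for Apache Kafka; the topic name, consumer group, a JSON-capable KafkaTemplate configuration and the SomeDto/SomeEntity/SomeRepository types are all assumptions, not something from the question:

    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.kafka.core.KafkaTemplate;
    import org.springframework.stereotype.Component;
    import org.springframework.stereotype.Service;

    // Producer side: the web endpoint publishes the validated DTO to a durable topic
    // instead of writing to Postgres directly.
    @Service
    class SaveEventProducer {

        private final KafkaTemplate<String, SomeDto> kafkaTemplate;

        SaveEventProducer(KafkaTemplate<String, SomeDto> kafkaTemplate) {
            this.kafkaTemplate = kafkaTemplate;
        }

        void publish(SomeDto dto) {
            kafkaTemplate.send("save-requests", dto);   // "save-requests" is a made-up topic name
        }
    }

    // Consumer side: a listener drains the topic at its own pace and populates the
    // database; if the consumers or the database go down, unconsumed events stay in Kafka.
    @Component
    class SaveEventConsumer {

        private final SomeRepository repository;

        SaveEventConsumer(SomeRepository repository) {
            this.repository = repository;
        }

        @KafkaListener(topics = "save-requests", groupId = "save-consumers")
        public void onSaveRequest(SomeDto dto) {
            repository.save(SomeEntity.fromDto(dto));
        }
    }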

However, it is difficult to really answer your question here since many other elements must be taken into consideration.

I would suggest reading a book like Designing Data-Intensive Applications: it is definitely a valuable resource and will help you design a reliable solution based on your needs and your context.

Pradyskumar

If you are handling REST calls, I don't think you can keep all the requests in the same LinkedList. You can use RabbitMQ for queuing. As soon as validation is successful, you can push the object onto the queue and return 200.
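
A sketch of that approach with Spring AMQP; the queue name, a suitable message converter and the DTO/entity/repository types are assumed for illustration, and the queue itself would have to be declared on the broker:

    import javax.validation.Valid;
    import org.springframework.amqp.rabbit.annotation.RabbitListener;
    import org.springframework.amqp.rabbit.core.RabbitTemplate;
    import org.springframework.http.ResponseEntity;
    import org.springframework.stereotype.Component;
    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.RequestBody;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    class RabbitSaveController {

        private final RabbitTemplate rabbitTemplate;

        RabbitSaveController(RabbitTemplate rabbitTemplate) {
            this.rabbitTemplate = rabbitTemplate;
        }

        @PostMapping("/save")
        public ResponseEntity<Void> save(@Valid @RequestBody SomeDto dto) {
            // Push the validated DTO onto the broker and acknowledge the caller immediately.
            rabbitTemplate.convertAndSend("save-requests", dto);
            return ResponseEntity.ok().build();
        }
    }

    @Component
    class RabbitSaveConsumer {

        private final SomeRepository repository;

        RabbitSaveConsumer(SomeRepository repository) {
            this.repository = repository;
        }

        @RabbitListener(queues = "save-requests")
        public void handle(SomeDto dto) {
            repository.save(SomeEntity.fromDto(dto));
        }
    }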

Nathan

A much better solution would be to have a redundant database, so that if one of the systems goes down or is otherwise unavailable, you can continue to function with your second database.
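
As one possible illustration of the redundant-database idea (my assumption, not something the answer spells out): the PostgreSQL JDBC driver accepts a comma-separated host list and fails over to the next reachable host, so a HikariCP pool could be pointed at both a primary and a standby:

    import com.zaxxer.hikari.HikariConfig;
    import com.zaxxer.hikari.HikariDataSource;
    import javax.sql.DataSource;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    class FailoverDataSourceConfig {

        @Bean
        DataSource dataSource() {
            HikariConfig config = new HikariConfig();
            // Hypothetical hosts; targetServerType=primary (or master on older driver
            // versions) makes the driver connect only to the node that accepts writes.
            config.setJdbcUrl(
                "jdbc:postgresql://db-primary:5432,db-standby:5432/appdb?targetServerType=primary");
            config.setUsername("app");
            config.setPassword("secret");
            config.setMaximumPoolSize(20);
            return new HikariDataSource(config);
        }
    }

This only covers connection fail-over; keeping the standby up to date (replication, promotion) has to be handled outside the application.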

Keeping the data to persist in memory is a solution I would advise against. You say that you are anticipating relatively high peaks. If your DB is unavailable during a high peak, I cannot believe that you would be able to queue all requests in memory for the necessary length of time. And if they are only in memory, then any kind of application server failure (or hardware problem that affects your application server) would result in a complete loss of all of your queued requests. This means that your REST interface lied to its callers: you returned that you had successfully persisted the data when you had not, because both your DB and your application crashed.

You either need a redundant database or a persistent, external queueing system. If you opt for an external queueing system (which can also be made redundant to prevent outages), then you could simply push all persist requests into the external queue. Then you only have one mechanism/workflow that you need to support.