My service connects to another service that has a low limit of requests per second


I have a service A which calls another service B (and some others) via REST APIs. The problem is that service B has very low capacity: it only handles 3 requests/second, so many calls made to my service A fail because the calls to service B time out.

Is there any way to solve this issue? I was thinking about enqueuing the calls to service B, so that at least calls to service A don't fail, and also rate limiting my service A, but I would like to know if there is any way to work around service B's limits.

BTW, I don't have access to modify service B.


There are 2 best solutions below

Olivier Poupeney On

Routing between services is always an issue when you have to do it programmatically. Orchestration is a modern and elegant solution: services A and B are no longer "glued" together, and you can manage rate limits and timeouts directly in the workflow definition. Check out the open source Conductor project: https://github.com/netflix/conductor. You can even try it without installing anything by using Orkes Conductor's playground (dev sandbox): https://play.orkes.io

Christophe Quintard On

If the call to service B has to be synchronous, then there's nothing you can do. The user invokes service A, service A invokes service B, service B returns 429 (Too Many Requests) to service A, and service A returns 429 to the user. Do not set a rate limit on service A to protect service B, or you will create a configuration coupling between service A and service B. Let service B fail and forward the error.

If service A can invoke service B asynchronously, then you can switch to asynchronous processing. Set up a message queue. Each time service A receives a call, it writes a message into the message queue. Create an agent (another service) that consumes the messages. Every time the agent receives a message, it invokes service B. The agent commits the message only when the call to service B succeeds (it may have to retry a couple of times before it succeeds). The agent will consume the messages as fast as service B processes the requests.
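The agent loop described above could look roughly like the following. This is only a minimal in-process sketch: a real setup would use a message broker rather than Python's `queue.Queue`, and the names `run_agent`, `call_service_b`, `max_per_second` and `max_attempts` are made up for illustration. The 3 calls/second pacing comes from the question.

```python
import queue
import time
from typing import Callable, List, Tuple


def run_agent(
    messages: "queue.Queue[str]",
    call_service_b: Callable[[str], bool],  # hypothetical: returns True on success
    max_per_second: float = 3.0,            # service B's limit, taken from the question
    max_attempts: int = 5,                  # assumption: give up after a few retries
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[str], List[str]]:
    """Drain the queue, pacing calls to service B; return (committed, failed)."""
    min_interval = 1.0 / max_per_second
    committed: List[str] = []
    failed: List[str] = []
    while True:
        try:
            message = messages.get_nowait()
        except queue.Empty:
            return committed, failed
        for _ in range(max_attempts):
            ok = call_service_b(message)
            sleep(min_interval)  # pace every call, whether it succeeded or not
            if ok:
                committed.append(message)  # "commit" only after a successful call
                break
        else:
            failed.append(message)  # all attempts failed; keep it for inspection
        messages.task_done()
```

With a real broker, "commit" would be the acknowledgement of the message, and a message that is never acknowledged would be redelivered instead of being stored in a `failed` list.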

Now you are left with a question: is service B able to process all of service A's requests in time? For example, if service A receives a spike of 1800 requests in one second, then service B will take 600 seconds to process all those requests asynchronously. Is that acceptable? If not, you will have to accept dropping some messages. One solution is to drop old messages: when service A writes a message, it writes the current time into the message, and when the agent receives a message, it looks at this timestamp and drops messages that are too old. Some message queues can be configured to discard old messages automatically (https://www.rabbitmq.com/ttl.html). You can also limit the size of the queue (https://www.rabbitmq.com/maxlength.html#overflow-behaviour). The queue can either silently drop some messages, or reject new messages so that service A can return a 429.
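To make the two dropping strategies concrete, here is a small sketch of a bounded queue that rejects new messages when it is full and skips messages older than a given age. It is an in-memory illustration only, not how RabbitMQ implements TTL or max-length; `Message`, `BoundedQueue`, `offer`, `poll_fresh` and `drain_seconds` are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional


@dataclass
class Message:
    payload: str
    created_at: float  # timestamp written by service A when it enqueues


class BoundedQueue:
    def __init__(self, max_length: int) -> None:
        self._items: Deque[Message] = deque()
        self._max_length = max_length

    def offer(self, message: Message) -> bool:
        """Return False when full, so service A can answer with a 429."""
        if len(self._items) >= self._max_length:
            return False
        self._items.append(message)
        return True

    def poll_fresh(self, max_age_seconds: float, now: float) -> Optional[Message]:
        """Return the oldest message that is not too old, dropping stale ones."""
        while self._items:
            message = self._items.popleft()
            if now - message.created_at <= max_age_seconds:
                return message
            # otherwise the message is too old: drop it and look at the next one
        return None


def drain_seconds(backlog: int, rate_per_second: float) -> float:
    """Time needed to work through a backlog at a fixed processing rate."""
    return backlog / rate_per_second


# The example from the answer: 1800 queued requests at 3 requests/second.
print(drain_seconds(1800, 3))  # 600.0
```

Choosing `max_age_seconds` and `max_length` comes down to how long a caller of service A can reasonably wait for a result.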