Message Bus versus Quasar/HTTP for internal Microservice Calls


I am looking to optimize a microservice architecture that currently uses HTTP/REST for internal node-to-node communication.

One option is to build backpressure capability into the services themselves, e.g. by integrating something like Quasar into the stack. This would no doubt improve things, but I see a couple of challenges. First, the async client threads are transient (in memory), so if the client fails (crashes), those pending retries are lost. Second, in theory, if a target server is down for long enough, the client could eventually run out of memory attempting retries, because even Quasar fibers are ultimately limited in number.
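
For concreteness, a fiber-per-request retry loop on the client might look roughly like the sketch below (untested; callTarget() is a hypothetical stand-in for the actual HTTP call, and Quasar's bytecode instrumentation agent would be needed to actually run it). It illustrates the point: every pending retry is just heap state inside the client JVM.

    import co.paralleluniverse.fibers.Fiber;
    import co.paralleluniverse.fibers.SuspendExecution;
    import co.paralleluniverse.strands.SuspendableRunnable;

    public class FiberRetrySketch {
        public static void main(String[] args) throws Exception {
            Fiber<Void> fiber = new Fiber<>((SuspendableRunnable) () -> {
                int attempts = 0;
                while (attempts < 5) {                  // bounded retry budget
                    attempts++;
                    if (callTarget()) {
                        return;                         // success, fiber ends
                    }
                    Fiber.sleep(1_000L * attempts);     // suspends the fiber, not an OS thread
                }
                // budget exhausted: the request is dropped, nothing durable remains
            });
            fiber.start();
            fiber.join();
        }

        // Hypothetical stand-in for the actual HTTP call to the target service.
        private static boolean callTarget() throws SuspendExecution {
            return false;
        }
    }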

I know it's a little paranoid, but I'm wondering if a queue-based alternative would be more advantageous at very large scale.

It would still work asynchronously, like Quasar/fibers, except that a) the queue is centrally managed and lives off the client JVM, and b) the queue can be durable, so that if the client and/or target servers go down, no in-flight messages are lost.
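
As a rough illustration of (b), a JMS producer pointed at a broker such as ActiveMQ can mark messages persistent, so the broker rather than the client JVM owns the in-flight message. The broker URL and queue name below are placeholders, not a recommendation:

    import javax.jms.Connection;
    import javax.jms.DeliveryMode;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class DurableEnqueueSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder broker URL and queue name.
            ActiveMQConnectionFactory factory =
                    new ActiveMQConnectionFactory("tcp://broker.internal:61616");
            Connection connection = factory.createConnection();
            try {
                connection.start();
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                Queue queue = session.createQueue("service.requests");
                MessageProducer producer = session.createProducer(queue);
                // PERSISTENT delivery: the broker writes the message to its store,
                // so it survives a crash of the sending client or the target service.
                producer.setDeliveryMode(DeliveryMode.PERSISTENT);
                producer.send(session.createTextMessage("{\"orderId\": 42}"));
            } finally {
                connection.close();
            }
        }
    }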

The downside to a queue, of course, is the extra hops, which slow the system down. But I'm thinking there is probably a sweet spot where the ROI of Quasar peaks and a centralized, durable queue becomes more important for scale and HA.

My question is:

Has this tradeoff been discussed? Are there any papers on using a centralized external queue/router approach for intraservice communication?

TL;DR: I just realized I could probably phrase this question as:

"When is it appropriate to use Message Bus based intraservice communication as opposed to direct HTTP within a microservice architecture."

1 Answer

I've seen three general protocol design patterns in microservice architectures running at scale:

  1. Message bus architecture, using a central broker such as ActiveMQ or Apache Qpid.
  2. "Resilient" HTTP, where some additional logic is built on HTTP to make it more resilient. Typical approaches here are Hystrix (Java), or SmartStack/Baker St (smart proxy).
  3. Point-to-point asynchronous messaging using something like NSQ, ZMQ, or Qpid Proton (see the sketch just after this list).

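To give a feel for #3, a brokerless send in its simplest form might look roughly like the sketch below, using the JeroMQ bindings for ZMQ (the endpoint and payload are placeholders). Note that, unlike with a durable broker, a pending message lives only in the sender's socket buffer.

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    public class PointToPointSendSketch {
        public static void main(String[] args) {
            // Brokerless: the sender connects straight to the target service's
            // socket; there is no central queue on the path.
            try (ZContext context = new ZContext()) {
                ZMQ.Socket push = context.createSocket(SocketType.PUSH);
                push.connect("tcp://inventory-service.internal:5555"); // placeholder endpoint
                // send() is fire-and-forget: the message waits in the socket's
                // local buffer until the peer is reachable, so durability is
                // limited to this process's memory.
                push.send("{\"sku\": \"ABC-123\", \"delta\": -1}");
            }
        }
    }
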
By far the most common design pattern is #2, with a little bit of #1 mixed in when a queue is desirable.

In theory, #3 offers the best of all worlds (resiliency AND scale AND performance), but the technologies are all somewhat immature. It turns out that with #2 you can get very far (e.g., Netflix uses Hystrix everywhere).
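
To make #2 concrete, a Hystrix command wrapping an internal HTTP call looks roughly like the sketch below; the group key, URL, and fallback payload are placeholders rather than a prescription, and callUserServiceOverHttp() is a hypothetical stand-in for whatever HTTP client the stack uses.

    import com.netflix.hystrix.HystrixCommand;
    import com.netflix.hystrix.HystrixCommandGroupKey;

    public class GetUserCommand extends HystrixCommand<String> {
        private final String userId;

        public GetUserCommand(String userId) {
            // Commands in one group share a bounded thread pool (bulkhead),
            // which limits how many concurrent calls can pile up on one dependency.
            super(HystrixCommandGroupKey.Factory.asKey("UserService"));
            this.userId = userId;
        }

        @Override
        protected String run() throws Exception {
            // Placeholder for the real HTTP GET against the user service,
            // e.g. http://user-service.internal/users/{id}.
            return callUserServiceOverHttp(userId);
        }

        @Override
        protected String getFallback() {
            // Used when the call fails, times out, or the circuit breaker is open.
            return "{\"userId\": \"" + userId + "\", \"source\": \"fallback\"}";
        }

        // Hypothetical helper standing in for the actual HTTP client.
        private String callUserServiceOverHttp(String id) throws Exception {
            throw new UnsupportedOperationException("plug in a real HTTP client");
        }
    }

Calling new GetUserCommand("42").execute() then runs the call with a timeout, circuit breaker, and bulkhead wrapped around it.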

To answer your question directly, I'd say that #1 is very rarely used as an exclusive design pattern because it creates a single bottleneck for your entire system. #1 is common for a subset of the system. For most people, I'd recommend #2 today.