API level Circuit Breaker Implementation

In our current setup, ServiceZ calls multiple downstream services such as ServiceA and ServiceB, and in turn multiple APIs of each of those services, for example ApiA1 (GET), ApiA2 (POST), ApiA3 (GET), ApiB1 (POST), ApiB2 (GET), and so on.

What is the correct circuit breaker setup to implement? We already have a service-level circuit breaker. Should we break it down to the API level in order to avoid a complete failure of the setup? Or should it be implemented at the HTTP method level?

There are 2 best solutions below

StepUp On

It seems necessary to implement the circuit breaker at the HTTP method level in your ServiceZ, because your main service calls several downstream APIs such as ApiA1 (GET), ApiA2 (POST), ApiA3 (GET), ApiB1 (POST), ApiB2 (GET), and so on. When some of these calls fail, you need to decide what the circuit breaker should do.

An example can be seen here.

Peter Csala On

What should be correct setup for circuit breaker to implement?

The main aim of having circuit breakers is to prevent cascading failures. In other words, a transient failure of a downstream system should not be propagated to the upstream systems. By concealing the failure, we are actually preventing a chain reaction (domino effect).
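To make the mechanism concrete, here is a minimal, single-threaded sketch of a circuit breaker. It is not taken from any particular library: the names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, and `reset_timeout` are assumptions chosen for illustration, and a production implementation would also need thread safety and a smarter failure policy.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Illustrative breaker: opens after N consecutive failures (hypothetical API)."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock  # injectable so the timing can be tested
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast: the downstream is not called at all.
                raise CircuitOpenError("circuit is open")
            # Timeout elapsed: half-open, so this call is let through as a trial.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and resets the failure counter.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, the upstream gets an immediate `CircuitOpenError` instead of waiting on a struggling downstream, which is what stops the failure from spreading further up the call chain.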

Back to your question: it depends on a lot of factors, such as:

  • How often does the upstream call the downstream's endpoints?
  • Are the downstream's endpoints called with the same frequency, and do they have the same business criticality?
  • What sort of transient failure should trigger a circuit to break?
  • What will the upstream service do if the circuit is broken?
  • etc.

My main point here is that you should not evaluate your concerns in a vacuum. You should have a good understanding of the communication flows and detailed plans for how to handle certain kinds of failures.

We already have service level circuit breaker should we break it down to api level in order to avoid complete setup failure?

Let's consider two different scenarios.

Scenario A: The downstream system is already having a hard time processing requests (for whatever reason).

  • A single service-level circuit breaker would allow the downstream to self-heal, because the upstream's outbound calls are short-circuited by the breaker.
  • Endpoint-level circuit breakers would let some traffic pass through, because most probably not all circuits break at the same time. The downstream would still receive requests while it tries to self-heal.

Scenario B: For one of the endpoints, request processing takes too long, so the requests are timed out by the upstream. The other endpoints are serving requests quickly.

  • A single service-level circuit breaker would prevent calls to the "healthy" endpoints as well. The problematic endpoint might not be business critical, but several of the "healthy" endpoints could be.
  • Endpoint-level circuit breakers would isolate the problematic endpoint and still allow calls to the "healthy" ones.

As you can see, different scenarios might require different solutions. It depends heavily on your requirements and on what you want to achieve.
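The difference between the two scopes can be sketched as a choice of key under which breakers are stored. This is only an illustration: `SimpleBreaker`, `BreakerRegistry`, `service_key`, `endpoint_key`, and the paths `/a1` and `/a2` are hypothetical names, and the breaker is deliberately reduced to a consecutive-failure counter that never resets on its own.

```python
from collections import defaultdict


class SimpleBreaker:
    """Deliberately tiny: open after `threshold` consecutive failures."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.failures = 0

    @property
    def is_open(self):
        return self.failures >= self.threshold

    def record(self, success):
        self.failures = 0 if success else self.failures + 1


def service_key(service, method, path):
    return service  # one breaker per downstream service


def endpoint_key(service, method, path):
    return (service, method, path)  # one breaker per API


class BreakerRegistry:
    """Keeps one breaker per key; the key function decides the scope."""

    def __init__(self, key_func):
        self.key_func = key_func
        self.breakers = defaultdict(SimpleBreaker)

    def allows(self, service, method, path):
        return not self.breakers[self.key_func(service, method, path)].is_open

    def record(self, service, method, path, success):
        self.breakers[self.key_func(service, method, path)].record(success)


# Scenario B: ApiA2 keeps timing out while ApiA1 stays healthy.
for key_func in (service_key, endpoint_key):
    registry = BreakerRegistry(key_func)
    for _ in range(2):
        registry.record("ServiceA", "POST", "/a2", success=False)
    print(key_func.__name__, registry.allows("ServiceA", "GET", "/a1"))
```

With `service_key` the healthy call to ApiA1 is rejected along with ApiA2 (the loop prints `False`), while with `endpoint_key` it is still allowed (`True`). In Scenario A the trade-off flips: the coarse key shields the struggling downstream from all traffic, whereas the fine-grained key keeps sending it requests on every endpoint whose breaker has not opened yet.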

Or should be implemented at http method level?

I think it should be clear by now that adding more circuit breakers (scoping them more finely) depends on a lot of factors. I don't want to repeat what I have already said above; I just want to add one more thought.

Making your system more resilient should not be done in isolation (only inside the upstream). Resilient service design means that there is a predefined protocol between upstream and downstream for how to handle transient failures together. In other words, the upstream and downstream services can apply multiple proactive and/or reactive resilience mechanisms to withstand transient failures, BUT they should agree on how to handle them.
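As one possible shape of such an agreement (purely an assumption for illustration, not something the services in the question are known to do), the downstream could signal overload with HTTP 429 or 503 plus a `Retry-After` header, and the upstream could treat only those responses and other 5xx statuses as breaker failures. The function name `classify_response` and the contract itself are hypothetical.

```python
def classify_response(status, headers):
    """Return (counts_as_failure, retry_after_seconds) under a hypothetical contract.

    `headers` is assumed to be a plain dict keyed by the exact header name.
    Only the delay-seconds form of Retry-After is parsed; the HTTP-date form
    is ignored in this sketch.
    """
    retry_after = headers.get("Retry-After")
    delay = float(retry_after) if retry_after and retry_after.isdigit() else None

    if status in (429, 503):
        # Downstream explicitly asks the upstream to back off.
        return True, delay
    if 500 <= status < 600:
        # Other server errors count as failures, with no hint about the wait.
        return True, None
    # Everything else (including 4xx client errors) does not trip the breaker.
    return False, None
```

The upstream could then use the returned delay as the breaker's open duration instead of a fixed timeout, so that both sides follow the same recovery schedule.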