gRPC unary-stream with Redis Pub/Sub - degradation with too many clients


We have a Python gRPC (grpcio with asyncio) server that performs server-side streaming of data consumed from Redis Pub/Sub (using aioredis 2.x), combining up to 25 channels per stream. With low traffic everything works fine, but as soon as we reach 2000+ concurrent streams, message delivery starts falling behind.
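The fan-in shape described above (one server stream combining many channel subscriptions) can be sketched with plain asyncio. This is a simplified illustration, not our actual handler: `channel()` is a stand-in for a Redis Pub/Sub subscription, and `merge_streams()` plays the role of the per-stream combiner inside the gRPC handler.

```python
import asyncio


async def merge_streams(*streams):
    """Merge several async iterators into one, yielding items as they arrive."""
    queue: asyncio.Queue = asyncio.Queue()
    DONE = object()  # sentinel marking one source as exhausted

    async def pump(stream):
        async for item in stream:
            await queue.put(item)
        await queue.put(DONE)

    tasks = [asyncio.create_task(pump(s)) for s in streams]
    finished = 0
    while finished < len(tasks):
        item = await queue.get()
        if item is DONE:
            finished += 1
        else:
            yield item


async def channel(name, n):
    """Stand-in for one Redis Pub/Sub channel emitting n small messages."""
    for i in range(n):
        await asyncio.sleep(0)
        yield f"{name}:{i}"


async def main():
    # A real handler would `yield` each merged item to the gRPC response stream.
    return [msg async for msg in merge_streams(channel("a", 2), channel("b", 2))]


result = asyncio.run(main())
```

The ordering between channels is nondeterministic, but every message from every merged channel is delivered exactly once.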

Some setup details and what we tried so far:

  • The client connections to gRPC are load-balanced over a Kubernetes cluster with the Ingress-NGINX controller, and scaling (we tried 9 pods with 10 process instances each) doesn't seem to help at all (load is distributed evenly).

  • We are running a five-node Redis 7.x cluster with 96 threads per replica.

  • Connecting to Redis with the CLI client while gRPC falls behind shows that individual channels are on time, while the gRPC streams keep lagging.

  • Messages are small (40 B), with a variable rate anywhere between 20 and 200 per second on each stream.

  • aioredis seems to open a new connection for each Pub/Sub subscriber, even though we're using a capped connection pool for each gRPC instance.

  • Memory/CPU utilisation is not dramatic, nor is network I/O, so we're not bottlenecked there.

  • We tried an identical setup with a very similar gRPC server written in Rust, with similar results.
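One mitigation for the connection-per-subscriber behavior noted above is to keep a single Pub/Sub reader per process and fan messages out to per-stream queues. The sketch below shows only the dispatch layer with plain asyncio; `SharedDispatcher` is a hypothetical helper, and in a real setup `publish()` would be driven by one `pubsub.listen()` loop over a single Redis connection.

```python
import asyncio
from collections import defaultdict


class SharedDispatcher:
    """Fan messages from one reader out to per-stream queues, so each
    gRPC stream does not need its own Redis Pub/Sub connection."""

    def __init__(self):
        self._subscribers = defaultdict(set)  # channel -> set of queues

    def subscribe(self, channels):
        """Register one stream's interest in a set of channels."""
        queue: asyncio.Queue = asyncio.Queue()
        for channel in channels:
            self._subscribers[channel].add(queue)
        return queue

    def publish(self, channel, message):
        """Called by the single reader loop for every incoming message."""
        for queue in self._subscribers.get(channel, ()):
            queue.put_nowait((channel, message))


async def main():
    dispatcher = SharedDispatcher()
    q1 = dispatcher.subscribe(["news", "sport"])  # one gRPC stream
    q2 = dispatcher.subscribe(["news"])           # another gRPC stream
    dispatcher.publish("news", b"hello")
    dispatcher.publish("sport", b"goal")
    return await q1.get(), await q1.get(), await q2.get()


received = asyncio.run(main())
```

Per-stream queues also make backpressure visible: a slow consumer grows its own queue instead of stalling the shared reader.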


@mike_t, as you mentioned in the comment, switching from Redis Pub/Sub to ZeroMQ helped resolve the issue.

ZeroMQ (also known as ØMQ, 0MQ, or zmq) is an open-source universal messaging library; it looks like an embeddable networking library but acts like a concurrency framework. It gives you sockets that carry atomic messages across various transports such as in-process, inter-process, TCP, and multicast.

You can connect sockets N-to-N with patterns like fan-out, pub-sub, task distribution, and request-reply. It's fast enough to be the fabric for clustered products. Its asynchronous I/O model gives you scalable multicore applications, built as asynchronous message-processing tasks.

It has a score of language APIs and runs on most operating systems.