My subscriber sometimes receives a message from a GCP subscription with a delay of exactly 60 seconds

I have an issue that happens periodically in my system but causes major problems.

We use GCP Pub/Sub, and sometimes a subscriber receives a message with a delay of exactly 1 minute. In these cases, only the following metrics spike:

  1. oldest_unacked_message_age
  2. delivery_latency_health_score.expired_ack_deadlines = 0
  3. expired_ack_deadlines_count

Here are the details of my subscription: subscription details

Notes:

  1. The unacked_messages_count metric doesn't spike, so the load on the system is normal.
  2. I am sure that the delayed messages were successfully published to Pub/Sub, and I can see their correct publish_time attribute.
  3. All other metrics show that the system isn't overloaded and that the subscriber continues to pull other messages.
  4. I print a log line as soon as message processing starts in the subscriber; that's how I can see this delay.

We are using the google-cloud-pubsub, spring-cloud-gcp-pubsub, proto-google-cloud-pubsub-v1 and spring-integrations client libraries to receive messages via StreamingPull, over gRPC.

I assume that messages can sometimes be lost due to a transient failure, but in that case they should be redelivered after 10 seconds based on my ack deadline, shouldn't they? Update: On a message that arrived with a 60-second delay, I found the attribute googclient_deliveryattempt=1. As I understand it, this means it is not a redelivery?

I also thought that the problem could be in the modifyAckDeadline requests, but I don't have any custom overrides, and as far as I can see my client library uses DEFAULT_MAX_ACK_EXTENSION_PERIOD = 0 by default. Update: However, we send a StreamingPullRequest with StreamAckDeadlineSeconds = 60, because this value is initialized from STREAM_ACK_DEADLINE_DEFAULT, which is 60 in the client library. But the comment on StreamingPullRequest says: "We need to set streaming ack deadline, but it's not useful since we'll modack to send receipt. Set to some big-ish value in case we modack late".

I expect the subscriber to receive the message immediately after it is published or, if there is a failure or loss, for the message to be resent after 10-20 seconds based on my ack_deadline rather than after 60 seconds.

Any advice on how this could be resolved?

There is 1 best solution below

The Spring libraries use the Pub/Sub client library, which overrides the ack deadline specified in the subscription and implements its own lease management. By default, the initial ack deadline is 60 seconds, which is probably why you are seeing 60-second delays on some messages. What is likely happening is that the deadline is set to 60 seconds at the beginning and the messages are then sent out from the server, but the client restarts before a message is processed. In this scenario, that message will not be redelivered for 60 seconds.

If you want to reduce this time to 10 seconds, you need to change the maximum duration for each acknowledgment extension. When using the Java client library directly, call setMaxDurationPerAckExtension() on the Subscriber builder. If using the Spring subscriber configuration, set spring.cloud.gcp.pubsub.subscriber.max-duration-per-ack-extension.
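
For the Spring case, a minimal sketch of what that setting might look like, assuming the configuration lives in application.properties and that the value is interpreted in seconds (check the Spring Cloud GCP documentation for your version, since older releases may not have this property):

```properties
# Assumption: the value is in seconds; 10 mirrors the subscription's ack deadline from the question.
# This caps each individual ack deadline extension, so a message whose lease is lost
# should become eligible for redelivery after roughly 10 seconds instead of 60.
spring.cloud.gcp.pubsub.subscriber.max-duration-per-ack-extension=10
```

A smaller value means the client sends modifyAckDeadline requests more often for messages that are still being processed, so it trades some extra request traffic for faster redelivery.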