Go gRPC client closing streams prematurely


We have three Go micro-services in AWS Fargate: "A", "B" and "C".

The micro-services communicate over gRPC, and there is an AWS ALB in front of "B" and "C".

The request flow goes as follows: "A" -> "B" -> "C".

All the micro-services run Go "1.20.4" and gRPC "1.59".

Everything was working fine until 10 days ago, when the requests from "B" to "C" suddenly started failing with the following error:

======== Request1 ========
**** Client Connection: ****
....
**** Received http2 header from client: "content-length: 27"
    <-- Client explicitly states there are 27 bytes of data in the request
....
**** Received Request client: ****, server: ****, request: "POST /grpc.**** HTTP/2.0", host: ****
...
**** Client closed stream prematurely: only 23 out of 27 bytes of request body received, client: ****, server: ****, request: "POST /grpc.**** HTTP/2.0", host: ****
    <-- Client unexpectedly closes the connection after sending 23/27 bytes of data in the request
...
**** ALB generated http response: 400 ****
    <-- Because the client closes the connection before sending the full request body the ALB emits HTTP 400

We tried using Wireshark to debug further and found:

"TCP previous segment not captured"

We have an EC2 VM for testing, so we tried sending the request from the VM, where we had hosted service "B", bypassing "A". That worked fine.

Then we set up micro-service "A" on the VM (keeping "B" and "C" in Fargate). The request reached micro-service "C" only the first time, then it failed again.

Every time we reset "A", only the first request is able to reach "C". It sounds like a code issue in service "A", but there is no change there.

We then found that, around the same time as our first failed requests, AWS and Go made some changes in response to the "HTTP/2 Rapid Reset Attack":

CVE-2023-44487 - HTTP/2 Rapid Reset Attack

Vulnerability Report: GO-2023-2102

So, based on the "Vulnerability Report: GO-2023-2102", we upgraded the Go version of the micro-services from "1.20.4" to "1.20.10".

The requests were still not reaching micro-service "C". We tried increasing the "MaxConcurrentStreams" server option to "10000" on micro-services "B" and "C", but still no luck.

Answer:

Given the technologies involved, I suspect the Go context is not being used correctly, especially if you are passing metadata over gRPC.