How to ensure consistent trace_id across microservices with OpenTelemetry and gRPC in Go

I'm using OpenTelemetry in a Go project to trace requests across multiple microservices. My goal is to maintain a consistent trace_id for debugging purposes across an API server (gin server) and a database service, both communicating over gRPC. Despite following OpenTelemetry and gRPC documentation for context propagation, I'm encountering an issue where trace_ids differ between the services.

I've simplified the code for brevity, focusing on the relevant OpenTelemetry and gRPC setup:

API Server Setup (Caller):

// Init router...
s.router.Use(otelgin.Middleware(os.Getenv("APP_NAME")))

// ... Set connection
func getDBConnection(ctx context.Context) (*grpc.ClientConn, error) {
    // Setup connection with gRPC options including OpenTelemetry interceptors
    ddTraceInterceptor := grpctrace.UnaryClientInterceptor(grpctrace.WithServiceName("apiserver"))

    return grpc.DialContext(
        ctx, // Propagate context
        "db_service_endpoint",
        grpc.WithUnaryInterceptor(ddTraceInterceptor),
        grpc.WithTransportCredentials(insecure.NewCredentials()),
        grpc.WithStatsHandler(otelgrpc.NewClientHandler()),
    )
}

Database Service Setup (Callee):

func main() {
    // Simplified gRPC server initialization with OpenTelemetry instrumentation
    grpcServer := grpc.NewServer(
        grpc.StatsHandler(otelgrpc.NewServerHandler()))
    // Service registration omitted for brevity
}

Trace provider initialization (both server and client side):

tp := sdkTrace.NewTracerProvider()
otel.SetTracerProvider(tp)
defer func() { _ = tp.Shutdown(ctx) }()
otel.SetTextMapPropagator(propagation.NewCompositeTextMapPropagator(
    propagation.TraceContext{},
    propagation.Baggage{},
))

Both the API server and the database service initialize OpenTelemetry with a TracerProvider and use the NewCompositeTextMapPropagator for context propagation.

Despite this setup, when tracing a request that flows from the API server to the database service, the trace_id logged in the database service does not match the trace_id from the API server.
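For comparing the two sides, it may help to look at the raw value that crosses the wire. With the TraceContext propagator, the trace ID is carried in a `traceparent` entry (over gRPC, as request metadata) in the W3C format `version-traceid-parentid-flags`. Below is a minimal, standard-library-only sketch for pulling the trace ID out of such a value. The helper name `traceIDFromTraceparent` is hypothetical and not part of OpenTelemetry; the sample value is the one used in the W3C Trace Context specification.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// traceIDFromTraceparent extracts the trace-id field from a W3C traceparent
// value of the form "version-traceid-parentid-flags".
// Hypothetical debugging helper, not an OpenTelemetry API.
func traceIDFromTraceparent(h string) (string, error) {
	parts := strings.Split(strings.TrimSpace(h), "-")
	if len(parts) < 4 {
		return "", errors.New("traceparent: expected at least 4 dash-separated fields")
	}
	traceID := parts[1]
	if len(traceID) != 32 {
		return "", errors.New("traceparent: trace-id must be 32 hex characters")
	}
	if _, err := hex.DecodeString(traceID); err != nil {
		return "", fmt.Errorf("traceparent: trace-id is not hex: %w", err)
	}
	if traceID == strings.Repeat("0", 32) {
		return "", errors.New("traceparent: all-zero trace-id is invalid")
	}
	return traceID, nil
}

func main() {
	// Sample value from the W3C Trace Context specification.
	tid, err := traceIDFromTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	fmt.Println(tid, err)
}
```

On the database service side, the value could presumably be read from the incoming gRPC metadata (for example via `metadata.FromIncomingContext`) and compared with the trace_id logged by the API server. If no `traceparent` entry arrives at all, or it carries a different trace ID, that would point at the caller rather than the callee.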

I expected the trace_id to be consistent across these calls for end-to-end tracing. What might be causing this discrepancy, and how can I ensure the trace_id remains the same across microservices?

Additional Context:

  • Both services are standalone Go applications.
  • OpenTelemetry SDK and instrumentation versions are compatible across both services.
  • No additional middleware or interceptors that might alter the context are in use (except the Datadog interceptor).