Applying a tail sampling policy on two separate collectors: aggregating spans

I'm using two separate collectors to send traces from my frontend and backend apps. One collector receives traces from the frontend app on port 4318, and the other receives traces from a backend Java app on port 4317. Both services send their traces to Zipkin.

The span from the frontend app acts as the parent (root) span for all spans coming from the backend app. I'm running both services on my laptop for this analysis.

I wanted to investigate whether, if we use two separate collectors and apply tail sampling to both services, we miss any child spans (spans from the backend app) under the root span (the span from my frontend app). I use the probabilistic policy for tail sampling.

I used a collector in the cloud to process and send traces from the backend service, and a local collector to process and send traces from the frontend app. I noticed that when the probabilistic sampling rate for the backend app is equal to or higher than the rate set in the frontend collector, the traces include all child spans from the backend service. But when the sampling rate for the backend app is lower than the rate I set for the frontend app, some traces are missing child spans from the backend.

Since we're using two separate collectors, I'm wondering how a trace from the frontend app ends up with all of its spans from the backend app whenever the sampling rates are configured appropriately. For example, when I set 10% probabilistic tail sampling for the frontend app and 10% for the backend service on two separate collectors, every trace from the frontend app that passes the 10% sampling contains all the child spans from the backend service!
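One way to model this observation is the sketch below. It assumes that each collector's probabilistic policy decides deterministically from the trace ID (hashing it into a bucket and comparing the bucket against a threshold derived from the sampling percentage) and that both collectors use the same hash salt. I have not confirmed that this is what my collectors actually do. The function names, the SHA-256 hash, and the 10,000-bucket scheme are hypothetical choices for illustration only.

```python
import hashlib
import random


def keeps_trace(trace_id: bytes, sampling_percentage: float, salt: bytes = b"") -> bool:
    """Hypothetical per-collector decision that depends only on the trace ID.

    Assumption: the trace ID (plus an optional salt) is hashed into one of
    10,000 buckets, and the trace is kept when its bucket falls below the
    threshold implied by the sampling percentage.
    """
    digest = hashlib.sha256(salt + trace_id).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < sampling_percentage * 100


def traces_missing_backend_spans(
    frontend_pct: float,
    backend_pct: float,
    frontend_salt: bytes = b"",
    backend_salt: bytes = b"",
    n_traces: int = 20_000,
    seed: int = 7,
) -> int:
    """Count traces kept by the frontend collector but dropped by the backend one."""
    rng = random.Random(seed)
    missing = 0
    for _ in range(n_traces):
        # Both collectors see the same 16-byte trace ID for a given trace.
        trace_id = rng.getrandbits(128).to_bytes(16, "big")
        kept_by_frontend = keeps_trace(trace_id, frontend_pct, frontend_salt)
        kept_by_backend = keeps_trace(trace_id, backend_pct, backend_salt)
        if kept_by_frontend and not kept_by_backend:
            missing += 1
    return missing


if __name__ == "__main__":
    for frontend_pct, backend_pct in [(10, 10), (10, 50), (10, 5)]:
        missing = traces_missing_backend_spans(frontend_pct, backend_pct)
        print(f"frontend={frontend_pct}% backend={backend_pct}% -> missing: {missing}")
```

Under this model, the first two cases always report 0 missing traces: with the same salt, every bucket below the frontend threshold is also below an equal or higher backend threshold, so the frontend's kept set is a subset of the backend's. Passing different values for `frontend_salt` and `backend_salt` makes the two decisions independent, and the backend then drops some traces even at equal rates.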

If I were using a single collector and applying the sampling rate there, I could explain it: the collector would be smart enough to apply the sampling decision to the root span and keep all child spans of that trace from being filtered out. But I can't think of any reason why this would happen with two different collectors.

I generated a lot of spans, and I can confirm it is not just by chance that we're receiving all child spans from the backend for a root span from the frontend.

Does this make sense?

From what I've read in the OTel community, when we use two different collectors and apply tail sampling, there is a chance that we lose some child spans, but I'm not observing this.

On top of that, if the collectors are smart enough to do this, why do we need a load balancer when we apply tail sampling?
