-
Notifications
You must be signed in to change notification settings - Fork 311
[PIPE-1021] Propagate parent OTel trace/span from backend if provided #3348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PIPE-1021] Propagate parent OTel trace/span from backend if provided #3348
Conversation
c81b4ba
to
71d2821
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small things/questions.
@@ -617,6 +617,13 @@ func (r *JobRunner) createEnvironment(ctx context.Context) ([]string, error) { | |||
if r.conf.AgentConfiguration.TracingBackend != "" { | |||
env["BUILDKITE_TRACING_BACKEND"] = r.conf.AgentConfiguration.TracingBackend | |||
env["BUILDKITE_TRACING_SERVICE_NAME"] = r.conf.AgentConfiguration.TracingServiceName | |||
|
|||
if r.conf.Job.TraceParent != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a comment to explain what one mite find in this value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 1b6afee
clicommand/agent_start.go
Outdated
TracingServiceName string `cli:"tracing-service-name"` | ||
TracingBackend string `cli:"tracing-backend"` | ||
TracingServiceName string `cli:"tracing-service-name"` | ||
TracingAcceptTraceparent bool `cli:"tracing-accept-traceparent"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is ther a reason why you wouldn't want to propegate this value? Is propegate a better word?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rationale for making it opt-in is in the comment in internal/job/tracing.go
you commented on. Propagate is a better word 🧠 I'll change accept => propagate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the terminology to propagate in d6721c9
internal/job/tracing.go
Outdated
@@ -183,6 +186,69 @@ func (e *Executor) startTracingOpenTelemetry(ctx context.Context) (tracetools.Sp | |||
return tracetools.NewOpenTelemetrySpan(span), ctx, stop | |||
} | |||
|
|||
// accepting traceparent from Buildkite control plane is an opt-in feature as its technically | |||
// a breaking change to the behaviour, and if the server-side tracing isn't set up | |||
// correctly, agent traces may end up without root spans to link to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I see now, is this actualy the case though based on your code it looks like it skips this issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mostly the breaking change thing in behavior. In some tools like Grafana it looks very clunky if you're missing the root span from the trace.
The backend will add a traceparent
property in the agent jobs accept / acquire APIs (behind a FF currently) but may not have the opentelemetry tracing notification service set up, so we only want to propagate it when they opt-in for now.
internal/job/tracing.go
Outdated
|
||
// parse W3C trace context traceparent header format | ||
// https://www.w3.org/TR/trace-context/#traceparent-header | ||
func extractTraceParent(traceParent string) (trace.TraceID, trace.SpanID, trace.TraceFlags, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you just use the inbuild parser?
otel.GetTextMapPropagator().Inject(r.Context(), propagation.HeaderCarrier(w.Header()))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh I was trying to find an implementation of that in the otel sdk but was obviously looking in the wrong spots! I'll have a dig and get rid of this handcrafted logic!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, sorted in 4963913
- I needed use
Extract
rather thanInject
, to decorate thecontext.Context
- I couldn't use headers, because (a) its not passed as a header from the backend and (b) the whole
buildkite-agent bootstrap
dance that passes off the parameters and config to the child process, so I just used apropagation.MapCarrier
and stuffed the value where I needed it.
Refs:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍🏻 from me, be great to try this out 🚀 📈
Description
As part of the OpenTelemetry Tracing Notification Service work, we provide the
traceparent
from the job acquire / accept endpoints. When present, and the user has opted-in, we set the root span for the job to use the provided trace/parent span id.This is based on the W3C Tracecontext spec.
Context
PIPE-1021
Changes
Adds
--tracing-accept-traceparent
flag, that if provided will allow accepting opentelemetry tracing parent span information from the Buildkite control plane.Testing
go test ./...
). Buildkite employees may check this if the pipeline has run automatically.go fmt ./...
)