This repository was archived by the owner on Dec 6, 2024. It is now read-only.
[otep] Propose adding env variables as context carriers to specification #258
Merged
17 commits
* 5a1b39e [otep] add env-context-baggage-carriers otep (adrielp)
* 34848d1 [chore] rename otep on pr and fix lint errors (adrielp)
* 229ffa5 [chore] rewrite otep after original re-opening. (adrielp)
* ceee1bf [chore] reformatting (adrielp)
* 84995f1 [chore] small toc fix (adrielp)
* 01c0925 chore: minor updates (adrielp)
* e8be78f [docs]: update based on feedback (adrielp)
* 85a0e20 [docs] linting fixes (adrielp)
* f1860e4 Update text/0258-env-context-baggage-carriers.md (adrielp)
* 51c8c6f chore: add updates based on feedback (adrielp)
* bcee6cc Merge branch 'main' into env-context-prop (adrielp)
* b6b5412 chore: additional adjustments based on feedback (adrielp)
* ab02453 chore: emphasize the standardization on the useage of TRACEPARENT (adrielp)
* 48477ca chore: change originates to is defined in (adrielp)
* d01849b chore: additional verbiage and clarity (adrielp)
* 482530b chore: syntax cleanup (adrielp)
* 87cbc6f Merge branch 'main' into env-context-prop (lmolkova)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,294 @@ | ||
# Environment Variable Specification for Context and Baggage Propagation | ||
|
||
This is a proposal to add Environment Variables to the OpenTelemetry | ||
specification as a carrier for context and baggage propagation between | ||
processes. | ||
|
||
## Table of Contents | ||
|
||
* [Motivation](#motivation) | ||
* [Design](#design) | ||
* [Example Context](#example-context) | ||
* [Distributed Tracing in OpenTofu Prototype Example](#distributed-tracing-in-opentofu-prototype-example) | ||
* [Core Specification Changes](#core-specification-changes) | ||
* [UNIX](#unix-limitations) | ||
* [Windows](#windows-limitations) | ||
* [Allowed Characters](#allowed-characters) | ||
* [Trade-offs and Mitigations](#trade-offs-and-mitigations) | ||
* [Case-sensitivity](#case-sensitivity) | ||
* [Security](#security) | ||
* [Prior Art and Alternatives](#prior-art-and-alternatives) | ||
* [Alternatives and why they were not chosen](#alternatives-and-why-they-were-not-chosen) | ||
* [Open Questions](#open-questions) | ||
* [Future Possibilities](#future-possibilities) | ||
|
||
## Motivation | ||
|
||
The motivation for defining the specification for context and baggage | ||
propagation by using environment variables as carriers stems from a long open | ||
issue on the OpenTelemetry Specification repository, [issue #740][issue-740]. | ||
This issue has been open for such a long time that multiple groups have gone | ||
forward in implementing their own solutions to the problem using `TRACEPARENT` | ||
and `TRACESTATE` environment variables. | ||
|
||
[Issue #740][issue-740] identifies several use cases of systems that do not | ||
communicate across process boundaries over HTTP, such as: | ||
|
||
* ETL | ||
* Batch | ||
* CI/CD systems | ||
|
||
Adding arbitrary Text Map Propagation through environment variable carriers into | ||
the OpenTelemetry Specification will enable distributed tracing within the | ||
systems listed above. | ||
|
||
There has already been a significant amount of [Prior Art](#prior-art-and-alternatives) built | ||
within the industry and **within OpenTelemetry** to accomplish the immediate needs. | ||
However, OpenTelemetry at this time does not clearly define the specification | ||
for this form of propagation. | ||
|
||
Notably, as we define semantic conventions within the [CI/CD Working Group][cicd-wg], | ||
we'll need the specification defined for the industry to be able to adopt | ||
native tracing within CI/CD systems. | ||
|
||
[cicd-wg]: https://github.com/open-telemetry/community/blob/main/projects/ci-cd.md | ||
[issue-740]: https://github.com/open-telemetry/opentelemetry-specification/issues/740#issue-665588273 | ||
|
||
## Design | ||
|
||
To propagate context and baggage between parent, sibling, and child processes | ||
in systems where HTTP communication does not occur between processes, this | ||
proposal specifies key-value pairs injected into the environment that can be | ||
written and read by an arbitrary TextMapPropagator. | ||
|
||
### Example Context | ||
|
||
Consider the following diagram in the context of process forking: | ||
|
||
> Note: The diagram is simply an example and a simplification of process forking. | ||
> There are other ways to spawn processes which are more performant like | ||
> exec(). | ||
|
||
<!-- TODO: Maybe change diagram to not show fork. --> | ||
|
||
 | ||
|
||
In the above diagram, a parent process is forked to spawn a child process, | ||
inheriting the environment variables from the original parent. The environment | ||
variables defined here, `TRACEPARENT`, `TRACESTATE`, and `BAGGAGE` are used to | ||
propagate context to the child process such that it can be tied to the parent. | ||
Without `TRACEPARENT`, a tracing backend would not be able to connect the child | ||
process spans to the parent span, forming an end-to-end trace. | ||
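The inheritance described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not an OpenTelemetry API: the function name `spawn_child_with_context` and the example IDs are assumptions made for this sketch. The parent copies its environment, sets `TRACEPARENT`, and the child reads the variable back.

```python
import os
import subprocess
import sys

# Illustrative value only; a real parent would derive this from its active span.
TRACEPARENT = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"


def spawn_child_with_context(traceparent: str) -> str:
    """Spawn a child Python process with TRACEPARENT set and return what it saw."""
    child_env = dict(os.environ)  # copy, so the parent's own environment is untouched
    child_env["TRACEPARENT"] = traceparent
    child_code = "import os; print(os.environ.get('TRACEPARENT', ''))"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(spawn_child_with_context(TRACEPARENT))
```

Passing an explicit `env` keeps the change scoped to the child, which avoids leaking context into unrelated processes the parent may start later.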
|
||
> Note: While the below exclusively references the W3C Specification, this | ||
> proposal is not exclusive to W3C and is instead focused on the mechanism of | ||
> Text Map Propagation with a potential set of well-known environment variable | ||
> names. | ||
|
||
`traceparent` (lowercase) is defined in the [W3C Specification][w3c-parent] | ||
and includes the following fields: | ||
|
||
* `version` | ||
* `trace-id` | ||
* `parent-id` | ||
* `trace-flags` | ||
|
||
This could be set in the environment as follows: | ||
|
||
```bash | ||
# Fields joined by "-": version, trace-id, parent-id, trace-flags | ||
export TRACEPARENT=00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01 | ||
``` | ||
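On the receiving side, a process has to split the value back into the four fields listed above. The following is a rough sketch, assuming the W3C Trace Context layout in which the fields are lowercase hex strings of 2, 32, 16, and 2 digits joined by dashes. The names `TraceParent` and `parse_traceparent` are hypothetical, and the parser is deliberately strict about this layout rather than attempting to handle future versions.

```python
import re
from typing import NamedTuple, Optional


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    trace_flags: str


# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
_TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def parse_traceparent(value: str) -> Optional[TraceParent]:
    """Return the parsed fields, or None when the value is not usable."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    parsed = TraceParent(*match.groups())
    if parsed.version == "ff":
        return None  # "ff" is not a valid version
    if parsed.trace_id == "0" * 32 or parsed.parent_id == "0" * 16:
        return None  # all-zero identifiers are invalid
    return parsed
```

Returning `None` instead of raising lets a caller fall back to starting a new trace when the variable is missing or malformed.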
|
||
`tracestate` (lowercase) is defined in the [W3C Specification][w3c-state] and | ||
can include any opaque value in a key-value pair structure. Its goal is to | ||
provide additional vendor-specific trace information. | ||
|
||
`baggage` (lowercase) is also defined in the [W3C Specification][w3c-bag] and | ||
is a set of key-value pairs to propagate context between signals. In | ||
OpenTelemetry, baggage is propagated through the [Baggage API][bag-api]. | ||
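As an illustration of the key-value structure, the sketch below reads a `BAGGAGE` value into a dictionary. It assumes the W3C Baggage layout of comma-separated `key=value` members with optional `;`-delimited properties and percent-encoded values. The function name `parse_baggage` is hypothetical and the sketch ignores properties entirely; it is not the OpenTelemetry Baggage API.

```python
from typing import Dict
from urllib.parse import unquote


def parse_baggage(value: str) -> Dict[str, str]:
    """Parse a baggage string into a dict, dropping any member properties."""
    entries: Dict[str, str] = {}
    for member in value.split(","):
        key_value = member.split(";", 1)[0]  # discard optional properties
        if "=" not in key_value:
            continue  # skip malformed or empty members
        key, raw = key_value.split("=", 1)
        key = key.strip()
        if key:
            entries[key] = unquote(raw.strip())
    return entries
```

For example, a child process could call `parse_baggage(os.environ.get("BAGGAGE", ""))` and treat an empty result as "no baggage supplied".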
|
||
[w3c-parent]: https://www.w3.org/TR/trace-context-2/#traceparent-header-field-values | ||
[w3c-state]: https://www.w3.org/TR/trace-context-2/#tracestate-header | ||
[w3c-bag]: https://www.w3.org/TR/baggage/#baggage-http-header-format | ||
|
||
#### Distributed Tracing in OpenTofu Prototype Example | ||
|
||
Consider this real-world example of an OpenTofu Controller deployment. | ||
|
||
 | ||
|
||
In this model, the OpenTofu Controller is the start of the trace, containing | ||
the actual `trace_id` and generating the root span. The OpenTofu Controller | ||
deploys a runner which has its own environment and processes to run OpenTofu | ||
commands. If one were to trace these processes without a carrier mechanism, then | ||
they would all show up as unrelated root spans in separate traces. However, by | ||
leveraging environment variables as carriers, each span is able to be tied back | ||
to the root span, creating a single trace as shown in the image of a real | ||
OpenTofu trace below. | ||
|
||
 | ||
|
||
Additionally, the `init` span is able to pass baggage to the `plan` and `apply` | ||
spans. One example of this is module version and repository information. This | ||
information is only determined and known during the `init` process. Subsequent | ||
processes only know about the module by name. With `BAGGAGE`, the rest of the | ||
processes are able to understand a key piece of information which allows | ||
errors to be tied back to the original module version and source code. | ||
|
||
Defining the specification for Environment Variables as carriers will have a | ||
wide impact on the industry by enabling better observability of systems outside | ||
of the normal HTTP microservice architecture. | ||
|
||
[w3c-bag]: https://www.w3.org/TR/baggage/#header-name | ||
[bag-api]: https://opentelemetry.io/docs/specs/otel/baggage/api/ | ||
|
||
## Core Specification Changes | ||
|
||
The OpenTelemetry Specification should be updated with the definitions for | ||
extending context propagation into the environment through Text Map | ||
propagators. | ||
|
||
This update should include: | ||
|
||
* A common set of environment variables like `TRACEPARENT`, `TRACESTATE`, and | ||
`BAGGAGE` that can be used to propagate context between processes. | ||
* A specification for allowed environment names and values due to operating | ||
system limitations. | ||
* A specification for how implementers can inject and extract context from the | ||
environment through a TextMapPropagator. | ||
* A specification for how processes should update environment variables before | ||
spawning new processes. | ||
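The last item in the list above, updating the environment before spawning, might look roughly like the following. This is a sketch under stated assumptions, not proposed specification text: the helper names `new_span_id` and `child_environment` are hypothetical, the `traceparent` layout is taken from W3C Trace Context, and the fallback of starting a sampled trace (`01`) when no usable context is present is an illustrative choice.

```python
import secrets
from typing import Dict, Mapping


def new_span_id() -> str:
    """Generate a random 16-hex-digit span identifier."""
    return secrets.token_hex(8)


def child_environment(parent_env: Mapping[str, str], span_id: str) -> Dict[str, str]:
    """Copy the environment and point TRACEPARENT at the span that spawns the child."""
    env = dict(parent_env)
    parts = env.get("TRACEPARENT", "").split("-")
    if len(parts) == 4:
        version, trace_id, _previous_parent, flags = parts
    else:
        # No usable incoming context: start a new trace (assumption for this sketch).
        version, trace_id, flags = "00", secrets.token_hex(16), "01"
    env["TRACEPARENT"] = f"{version}-{trace_id}-{span_id}-{flags}"
    return env
```

The trace identifier is preserved while the parent identifier is replaced, so spans created by the child attach to the span that launched it rather than to that span's own parent.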
|
||
Defining the specification for Environment Variables as carriers for context | ||
will enable SDKs and other tools to implement getters and setters of context | ||
in a standard, observable way. Therefore, current OpenTelemetry language | ||
maintainers will need to develop language specific implementations that adhere | ||
to the specification. | ||
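To make the getter and setter idea concrete, here is one possible shape for a carrier adapter. It is a hypothetical sketch, not the interface of any OpenTelemetry SDK: the class name `EnvironmentCarrier` and its methods are invented for illustration. It wraps a dictionary of environment variables and maps the lowercase field names used by propagators to upper-cased variable names.

```python
from typing import Dict, List, Optional


class EnvironmentCarrier:
    """Hypothetical adapter exposing an environment mapping as a text-map carrier."""

    def __init__(self, env: Dict[str, str]) -> None:
        self._env = env

    def get(self, key: str) -> Optional[str]:
        """Look up a propagator field such as 'traceparent' as 'TRACEPARENT'."""
        return self._env.get(key.upper())

    def set(self, key: str, value: str) -> None:
        """Store a propagator field under its upper-cased variable name."""
        self._env[key.upper()] = value

    def keys(self) -> List[str]:
        """List the available names in the lowercase form propagators expect."""
        return [name.lower() for name in self._env]
```

A propagator would call `set` while preparing a child's environment and `get` when the child starts, leaving the mapping between header-style names and variable names in one place.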
|
||
Two implementations already exist within OpenTelemetry for environment | ||
variables through the TextMap Propagator: | ||
|
||
* [Python SDK][python-env] - This implementation uses the environment dictionary as | ||
the carrier in Python to propagate context from the invoking process to the | ||
invoked process. This pull request does not appear to have been merged. | ||
* [Swift SDK][swift-env] - This implementation uses `TRACEPARENT` and | ||
`TRACESTATE` environment variables alongside the w3cPropagator to inject and | ||
extract context. | ||
|
||
Due to programming conventions, operating system limitations, prior art, and | ||
the information below, it is recommended to leverage upper-cased environment | ||
variables for the carrier that align with context propagator specifications. | ||
|
||
[python-env]: https://github.com/Div95/opentelemetry-python/tree/feature/env_propagator/propagator/opentelemetry-propagator-env | ||
[swift-env]: https://github.com/open-telemetry/opentelemetry-swift/blob/main/Sources/OpenTelemetrySdk/Trace/Propagation/EnvironmentContextPropagator.swift | ||
|
||
### UNIX Limitations | ||
|
||
UNIX system utilities use upper-case names for environment variables, and | ||
lower-case names are reserved for applications. Using upper-case will prevent | ||
conflicts with internal application variables. | ||
|
||
Environment variable names used by the utilities in the XCU specification | ||
consist solely of upper-case letters, digits and the "_" (underscore) from the | ||
characters defined in Portable Character Set. Other characters may be permitted | ||
by an implementation; applications must tolerate the presence of such names. | ||
Upper- and lower-case letters retain their unique identities and are not folded | ||
together. The name space of environment variable names containing lower-case | ||
letters is reserved for applications. Applications can define any environment | ||
variables with names from this name space without modifying the behaviour of | ||
the standard utilities. | ||
|
||
Source: [The Open Group, The Single UNIX® Specification, Version 2, Environment Variables](https://pubs.opengroup.org/onlinepubs/7908799/xbd/envvar.html) | ||
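One way to derive variable names that stay within the character set quoted above (upper-case letters, digits, and underscore) is sketched below. The mapping rule itself is an assumption made for illustration and is not defined by this proposal: the hypothetical `to_env_name` upper-cases the propagator field name, replaces every other character with an underscore, and prefixes an underscore when the result would start with a digit.

```python
import re


def to_env_name(key: str) -> str:
    """Map a propagator field name to an upper-case, underscore-separated name."""
    name = re.sub(r"[^A-Z0-9_]", "_", key.upper())
    if name and name[0].isdigit():
        name = "_" + name  # avoid a leading digit
    return name
```

For the well-known fields this is simply an upper-casing step, so `traceparent` maps to `TRACEPARENT`; the substitution only matters for propagators whose field names contain characters such as `-`.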
|
||
### Windows Limitations | ||
|
||
Windows is case-insensitive with environment variables. Despite this, the | ||
recommendation is to use upper-cased names across operating systems. | ||
|
||
Some languages already do this. This [CPython issue][cpython] discusses how | ||
Python automatically upper-cases environment variable names on Windows. The | ||
issue was resolved and this [documentation][cpython-doc] was added to clarify | ||
the behavior. | ||
|
||
[cpython]: https://github.com/python/cpython/issues/101754 | ||
[cpython-doc]: https://docs.python.org/3/library/os.html#os.environ | ||
|
||
### Allowed characters | ||
|
||
To ensure compatibility, the specification for Environment Variables SHOULD adhere | ||
to the current specification for `TextMapPropagator` where key/value pairs MUST | ||
only consist of US-ASCII characters that make up valid HTTP header fields as | ||
per RFC 7230. | ||
|
||
Environment variable keys SHOULD NOT conflict with commonly known environment | ||
variables like those described in [IEEE Std 1003.1-2017][std1003]. | ||
|
||
Note that Windows disallows the use of the `=` character in | ||
environment variable names. See [MS Env Vars][ms-env] for more information. | ||
|
||
On Windows, there is also a limit on the size of an environment variable, | ||
which is 32,767 characters. | ||
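The constraints in this section can be checked before a value is written to the environment. The sketch below is a simplified illustration with a hypothetical function name, `check_env_pair`. It tests only for US-ASCII content, an `=` in the name, and the 32,767-character limit, which it applies to the value alone as a simplifying assumption. It does not validate the full RFC 7230 header-field grammar.

```python
from typing import List

MAX_ENV_CHARS = 32767  # limit cited above


def check_env_pair(name: str, value: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []
    if not name:
        problems.append("name is empty")
    if "=" in name:
        problems.append("name contains '='")
    if not (name.isascii() and value.isascii()):
        problems.append("name or value contains non-ASCII characters")
    if len(value) > MAX_ENV_CHARS:
        problems.append("value exceeds 32,767 characters")
    return problems
```

Returning every problem rather than failing on the first makes the check easier to surface in logs when a propagator decides to skip injection.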
|
||
[std1003]: https://pubs.opengroup.org/onlinepubs/9799919799/ | ||
|
||
[ms-env]: https://learn.microsoft.com/en-us/windows/win32/procthread/environment-variables | ||
|
||
## Trade-offs and Mitigations | ||
|
||
### Case-sensitivity | ||
|
||
On Windows, because environment variable keys are case insensitive, there is a | ||
chance that automatically instrumented context propagation variables could | ||
conflict with existing application environment variables. It will be important | ||
to note this behavior and document how languages mitigate this | ||
issue. | ||
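A tool could detect this kind of conflict before injecting context. The following sketch uses a hypothetical helper, `find_case_collisions`, that operates on a plain mapping rather than on any particular runtime's environment object. It reports existing names that differ from a carrier name only by case, which on a case-insensitive platform would refer to the same variable.

```python
from typing import Iterable, List, Mapping


def find_case_collisions(env: Mapping[str, str], carrier_names: Iterable[str]) -> List[str]:
    """Return existing names that match a carrier name only when case is ignored."""
    exact = set(carrier_names)
    folded = {name.upper() for name in exact}
    return sorted(
        existing
        for existing in env
        if existing.upper() in folded and existing not in exact
    )
```

A non-empty result suggests that writing the carrier variable would overwrite, or be shadowed by, a variable the application already uses for another purpose.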
|
||
### Security | ||
|
||
Do not put sensitive information in environment variables. Due to the nature of | ||
environment variables, an attacker with the right access could obtain | ||
information they should not be privy to. Additionally, the integrity of the | ||
environment variables could be compromised. | ||
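One mitigation is to filter baggage before it is written to the environment. The sketch below is illustrative only: the marker list and the function name `safe_baggage_value` are assumptions, and a key-name heuristic like this cannot guarantee that no sensitive value slips through. It drops entries whose keys look sensitive and percent-encodes the remaining values.

```python
from typing import Dict
from urllib.parse import quote

# Hypothetical substrings that mark a key as sensitive; adjust per deployment.
SENSITIVE_MARKERS = ("password", "secret", "token")


def safe_baggage_value(entries: Dict[str, str]) -> str:
    """Build a baggage string, omitting entries whose keys look sensitive."""
    kept = {
        key: value
        for key, value in entries.items()
        if not any(marker in key.lower() for marker in SENSITIVE_MARKERS)
    }
    return ",".join(f"{key}={quote(value, safe='')}" for key, value in kept.items())
```

Because the integrity of the environment can also be compromised, a receiving process should still validate whatever it reads rather than trusting it.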
|
||
## Prior Art and Alternatives | ||
|
||
There are many users of `TRACEPARENT` and/or `TRACESTATE` environment variables | ||
mentioned in [opentelemetry-specification #740](https://github.com/open-telemetry/opentelemetry-specification/issues/740): | ||
|
||
* [Jenkins OpenTelemetry Plugin](https://github.com/jenkinsci/opentelemetry-plugin) | ||
* [otel-cli generic wrapper](https://github.com/equinix-labs/otel-cli) | ||
* [Maven OpenTelemetry Extension](https://github.com/cyrille-leclerc/opentelemetry-maven-extension) | ||
* [Ansible OpenTelemetry Plugin](https://github.com/ansible-collections/community.general/pull/3091) | ||
* [go-test-trace](https://github.com/rakyll/go-test-trace/commit/22493612be320e0a01c174efe9b2252924f6dda9) | ||
* [Concourse CI](https://github.com/concourse/docs/pull/462) | ||
* [BuildKite agent](https://github.com/buildkite/agent/pull/1548) | ||
* [pytest](https://github.com/chrisguidry/pytest-opentelemetry/issues/20) | ||
* [Kubernetes test-infra Prow](https://github.com/kubernetes/test-infra/issues/30010) | ||
* [hotel-california](https://github.com/parsonsmatt/hotel-california/issues/3) | ||
|
||
Additionally, there was a prototype implementation for environment variables as | ||
context carriers written in the [Python SDK][python-env]. | ||
|
||
[python-env]: https://github.com/open-telemetry/opentelemetry-specification/issues/740#issuecomment-919657003 | ||
|
||
## Alternatives and why they were not chosen | ||
|
||
### Using a file for the carrier | ||
|
||
Using a JSON file that is stored on the filesystem and referenced through an | ||
environment variable would eliminate the need to work around case-insensitivity | ||
issues on Windows; however, it would introduce a number of issues: | ||
|
||
1. Would introduce an out-of-band file that would need to be created and | ||
reliably cleaned up. | ||
2. Managing permissions on the file might be non-trivial in some circumstances | ||
(for example, if `sudo` is used). | ||
3. This would deviate from significant prior art that currently uses | ||
environment variables. | ||
|
||
## Open questions | ||
|
||
The author has no open questions at this point. | ||
|
||
## Future possibilities | ||
|
||
1. Enabling distributed tracing in systems that do not communicate over HTTP. | ||