[feature request] Add PartialActivityProcessor to provide additional visibility to very long-running processes #2682
Comments
Tagging component owner(s).
I'm not sure that this is something we should be presenting as a generalised approach and providing tooling around. It's a known mode of operation that spans are only generated at the end (due to the requirement for start/end times). This would be a good use case for the new Events sub-signal, but I wouldn't advise building artificial spans from the events; it would be up to the observability backend to represent that data in a useful way.
We raised this issue on the dotnet community call yesterday, April 8. Here is a summary of what was discussed.
PartialActivityProcessor standalone repo: https://github.com/G-Research/otel-partial-dotnet
Component
OpenTelemetry.Extensions
Is your feature request related to a problem?
When you have very long-running processes, there is no visibility until a span has ended. A process can run for hours, and in the meantime you don't know whether it is still active. Additionally, in the case of an ungraceful shutdown, the spans are never pushed to the collector, so crucial data is lost.
What is the expected behavior?
We have implemented a PartialSpanProcessor that exports spans as logs over gRPC. The processor exports one log when a span starts, one log when the span ends, and heartbeat logs while the span is active. The heartbeat logs are exported at a configurable interval.
That way we get a log on the desired backend every 5, 10, or 15 minutes (depending on the configured interval) and stay aware of the process and span state. It has been tested and works as expected.
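To make the lifecycle above concrete, here is a minimal, language-neutral sketch in Python. It is not the actual .NET implementation: the class name is reused only for illustration, and `LogRecord`, `export`, `tick`, and the 300-second default interval are all hypothetical. The gRPC export is replaced by a plain callback, and the timer is replaced by explicit `tick` calls so that the behaviour is deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LogRecord:
    span_id: str
    kind: str         # "start", "heartbeat", or "end"
    timestamp: float  # seconds on whatever clock the caller uses


@dataclass
class PartialSpanProcessor:
    """Toy model: emits log records describing spans that are still open."""
    export: Callable[[LogRecord], None]  # hypothetical exporter callback
    heartbeat_interval: float = 300.0    # seconds; assumed default
    # Maps each active span id to the time of its last emitted record.
    _last_emit: Dict[str, float] = field(default_factory=dict, init=False)

    def on_start(self, span_id: str, now: float) -> None:
        self._last_emit[span_id] = now
        self.export(LogRecord(span_id, "start", now))

    def on_end(self, span_id: str, now: float) -> None:
        # Forget the span first so no heartbeat can follow its end record.
        self._last_emit.pop(span_id, None)
        self.export(LogRecord(span_id, "end", now))

    def tick(self, now: float) -> None:
        """Called periodically (for example by a timer) to emit due heartbeats."""
        for span_id, last in list(self._last_emit.items()):
            if now - last >= self.heartbeat_interval:
                self._last_emit[span_id] = now
                self.export(LogRecord(span_id, "heartbeat", now))


# Usage: one span that stays open for 700 seconds.
records: List[LogRecord] = []
processor = PartialSpanProcessor(export=records.append, heartbeat_interval=300.0)
processor.on_start("span-1", now=0.0)
processor.tick(now=200.0)    # not due yet, nothing emitted
processor.tick(now=300.0)    # first heartbeat
processor.tick(now=600.0)    # second heartbeat
processor.on_end("span-1", now=700.0)
processor.tick(now=1200.0)   # span already ended, nothing emitted
kinds = [r.kind for r in records]
```

If the process dies ungracefully after the second heartbeat, the backend still holds the start record and two heartbeats, which is the visibility the request is after.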
Which alternative solutions or features have you considered?
No response
Additional context
There is a fairly old issue in the OpenTelemetry specification repo about periodically exporting active spans to gain some visibility into very long-running processes.
open-telemetry/opentelemetry-specification#373
There are many problems with exporting partial spans and the issue remains open.
We discussed the issue on the Spec community call a couple of months ago, and the suggested solution was to periodically export logs or events. Also, java-contrib has a package with processors that perform data conversions. For more info, check this comment and the ones below it:
open-telemetry/opentelemetry-specification#373 (comment)
It was suggested that the individual SDK contrib repos are a good place to start contributing such a processor, so that people can start using it, experiment with it, and provide feedback. If it turns out to be useful, it might end up in the spec and the main repos.
At G-Research, we've also implemented a custom collector that acts as both a receiver and an exporter so that we can gather the logs, filter them and query the results.
We already have this implemented as a standalone extension of the dotnet SDK, but we'd like to contribute it back to the community. We're open to suggestions and further discussion regarding our PartialActivityProcessor.