add simple span processor #304


Merged
merged 2 commits from simple-processor into main on Nov 6, 2021

Conversation

tsloughter
Member

The Simple Span Processor makes a blocking call when ending a span;
the call returns only after the span has been exported.
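
As a rough sketch of the idea (illustrative only, not the actual otel_simple_processor module): ending a span turns into a synchronous gen_statem call that returns once the exporter has been invoked for that single span.

```erlang
%% Minimal sketch of the idea only -- not the real otel_simple_processor code.
-module(simple_processor_sketch).
-behaviour(gen_statem).

-export([start_link/1, on_end/1]).
-export([init/1, callback_mode/0, idle/3]).

%% ExportFun is a fun that takes a list of spans and exports them.
start_link(ExportFun) ->
    gen_statem:start_link({local, ?MODULE}, ?MODULE, ExportFun, []).

%% Called from the process that ends the span; blocks until the export is done.
on_end(Span) ->
    gen_statem:call(?MODULE, {export, Span}).

callback_mode() ->
    state_functions.

init(ExportFun) ->
    {ok, idle, ExportFun}.

idle({call, From}, {export, Span}, ExportFun) ->
    %% export the single span synchronously, then reply to the caller
    Result = ExportFun([Span]),
    {keep_state, ExportFun, [{reply, From, Result}]}.
```

A caller would then do simple_processor_sketch:on_end(Span) and block until the export function returns.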

@codecov

codecov bot commented Nov 3, 2021

Codecov Report

Merging #304 (49b563c) into main (c47a3f2) will increase coverage by 0.59%.
The diff coverage is 56.66%.


@@            Coverage Diff             @@
##             main     #304      +/-   ##
==========================================
+ Coverage   37.36%   37.95%   +0.59%     
==========================================
  Files          41       49       +8     
  Lines        3198     3322     +124     
==========================================
+ Hits         1195     1261      +66     
- Misses       2003     2061      +58     
Flag       Coverage Δ
api        65.14% <ø> (-3.88%) ⬇️
elixir     14.59% <ø> (?)
erlang     37.97% <56.66%> (+0.60%) ⬆️
exporter   19.75% <ø> (ø)
sdk        76.43% <56.66%> (-0.45%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
apps/opentelemetry/src/otel_utils.erl 0.00% <0.00%> (ø)
apps/opentelemetry/src/otel_exporter.erl 25.00% <25.00%> (ø)
apps/opentelemetry/src/otel_simple_processor.erl 69.81% <69.81%> (ø)
apps/opentelemetry/src/opentelemetry_sup.erl 100.00% <100.00%> (ø)
apps/opentelemetry/src/otel_batch_processor.erl 74.11% <100.00%> (+13.05%) ⬆️
apps/opentelemetry_api/lib/open_telemetry/span.ex 26.31% <0.00%> (ø)
...pps/opentelemetry_api/lib/open_telemetry/tracer.ex 37.50% <0.00%> (ø)
apps/opentelemetry_api/lib/open_telemetry.ex 21.42% <0.00%> (ø)
apps/opentelemetry_api/lib/open_telemetry/ctx.ex 70.00% <0.00%> (ø)
...ps/opentelemetry_api/lib/open_telemetry/baggage.ex 37.50% <0.00%> (ø)
... and 2 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c47a3f2...49b563c.

@tsloughter
Member Author

Opened as a draft since it still needs some docs and I don't like that both processors are started in the supervisor no matter what (but that may be acceptable for now).

@tsloughter tsloughter requested a review from a team November 3, 2021 23:45
@tsloughter tsloughter force-pushed the simple-processor branch 2 times, most recently from 1a923e9 to 6d52e7b on November 4, 2021 20:20
@tsloughter tsloughter marked this pull request as ready for review November 4, 2021 20:20
{keep_state, Data#data{runner_pid=RunnerPid,
                       current_from=From,
                       handed_off_table=OldTableName},
 [{state_timeout, ExportingTimeout, exporting_timeout}]};
Member

This implies that the events have individual timeouts for their send time, which ignores the queuing time. Is this accurate? I'm unfamiliar with the spec here.

But let's assume this is correct: if I have process A and process B each sending a synchronous span at once, and both hit a connection timeout of 2 seconds, the desired behaviour would mean that A waits ~2 seconds and B waits ~4 seconds before returning.
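
A hypothetical illustration of that scenario (the processor name and the {export, Span} request shape are made up): because a single processor process serves calls one at a time, the second caller's measured wait includes the first caller's export time.

```erlang
%% Hypothetical illustration only: two processes block on the same processor,
%% which serves calls serially, so the second reply includes the first
%% caller's export time in its wait.
measure_waits(Processor, SpanA, SpanB) ->
    Parent = self(),
    [spawn(fun() ->
               T0 = erlang:monotonic_time(millisecond),
               _ = gen_statem:call(Processor, {export, Span}),
               Parent ! {waited, Span, erlang:monotonic_time(millisecond) - T0}
           end) || Span <- [SpanA, SpanB]],
    %% collect {Span, WaitMillis} for both callers
    [receive {waited, Span, Ms} -> {Span, Ms} end || Span <- [SpanA, SpanB]].
```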

Member Author

It is unrelated to the queue time, correct. This is a configuration on the processor to kill the exporter if it takes too long. I punted on a client-side timeout that can be passed for now -- I think it's possible to do it decently now with send_request or whatever it is that ensures old replies aren't a problem.
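
A sketch of what that client-side timeout could look like with OTP's request-id API (gen_statem:send_request/2 and gen_statem:wait_response/2, OTP 23+); the Processor name and {export, Span} request shape are placeholders, not the processor's real protocol.

```erlang
%% Sketch only: give the caller its own timeout around a span export. A late
%% reply stays tagged with the request id, so it can later be recognized or
%% discarded with gen_statem:check_response/2 rather than being mistaken for
%% some other message.
export_with_client_timeout(Processor, Span, Timeout) ->
    ReqId = gen_statem:send_request(Processor, {export, Span}),
    case gen_statem:wait_response(ReqId, Timeout) of
        {reply, Result}            -> Result;
        timeout                    -> {error, client_timeout};
        {error, {Reason, _Server}} -> {error, Reason}
    end.
```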

{next_state, idle, Data#data{current_from=undefined,
                             runner_pid=undefined,
                             handed_off_table=undefined},
 [{reply, From, ok}]}.
Member

I guess here we can already have sent the timeout result. The gen_statem module, however, supports a way to use a proxy process to hide the dirty messages:

https://github.com/erlang/otp/blob/b3d4affcf9880255f6edc2e67095015e6ed2aca2/lib/stdlib/src/gen_statem.erl#L591-L596

Doing so would mean using the option {clean_timeout, infinity} in gen_statem:call/3, which sets up a proxy process to ignore the duplicate messages. You might want to either use that, or add a special non-replying complete_exporting/1 function that acknowledges timeout races and prevents sending two replies to a single call.
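
For reference, the proxy-based call suggested above would look roughly like this (Processor and the request shape are placeholders):

```erlang
%% With a {clean_timeout, _} timeout, gen_statem:call/3 routes the call
%% through a proxy process, so a stray late reply does not land directly in
%% the caller's mailbox.
call_via_proxy(Processor, Span) ->
    gen_statem:call(Processor, {export, Span}, {clean_timeout, infinity}).
```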

Member Author

Oh, thanks. I might use that, but this made me realize, don't I just need to set current_from to undefined in the timeout reply (which it currently isn't doing and is a bug)?

Then complete_exporting can skip sending a reply if current_from is undefined.

Member

Actually, not right now, unless you have a way to either cancel the state timeout or change the state. As this is implemented right now, we set the timer, complete, and stay in the same state.

So we're likely to hit an invalid leftover state timeout if there is a decent amount of time between requests: sending one span, succeeding, responding okay, then waiting N seconds will still trigger a timeout message. The timeout is never cancelled and the state never changes, so the state timeout should fire at some point. I'm not 100% sure what the behaviour is if you set a state timeout and then set a new one, whether that cancels the old state timeout or just adds another.

Member Author

Wait no, it repeats the state and is supposed to retry, so it can't set current_from to undefined... grr

Member Author

Yup, realized that after replying.

Setting a new state timeout does cancel the old one and create a new one, though.
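
For reference, the gen_statem state-timeout rules in play here: there is at most one state timeout per gen_statem, starting a new one replaces (and thereby cancels) the previous one, a state change cancels it, and it can also be cancelled explicitly. The clauses below are illustrative, not the processor's actual code.

```erlang
%% Illustrative clauses only, showing the ways a pending state timeout goes away.
exporting(cast, restart_timer, Data) ->
    %% starting a new state timeout replaces any one already running
    {keep_state, Data, [{state_timeout, 5000, exporting_timeout}]};
exporting(cast, give_up, Data) ->
    %% changing state cancels the pending state timeout
    {next_state, idle, Data};
exporting(cast, cancel_timer, Data) ->
    %% explicit cancellation while staying in the same state (OTP 22.1+)
    {keep_state, Data, [{state_timeout, cancel}]}.
```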

Member Author

Bleh, also, why am I sending an error if it repeats? Need to either not repeat or not return an error.

Member Author

Updated to transition to idle.

repeat_state was used in the batch processor to do another export of newly ended spans that had been waiting while the current export was blocking them from being exported.

I realize now that might be the wrong way to go about it, since there should already be an export timeout message in the queue (we postpone it while in the exporting state), so there should be no need to repeat the state. But I'll want to verify my thinking on that again and send a separate PR.

Since the simple_processor only exports a single span and does
so via a blocking call when a span ends, there is no need to do
a `repeat_state` after the exporting timeout fires. The next
export will be done if there is a span-end message in the
processor's queue.
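
A rough sketch of the shape that commit message describes (illustrative, not the exact otel_simple_processor clauses): on exporting timeout, kill the runner, reply to the blocked caller with an error, and return to idle; a span-end call that arrived during the export was postponed and is replayed once the state changes, which triggers the next export.

```erlang
%% Illustrative only -- not the exact otel_simple_processor clauses.
exporting(state_timeout, exporting_timeout,
          Data=#data{runner_pid=RunnerPid, current_from=From}) ->
    %% the export took too long: kill the runner, answer the caller, go idle
    exit(RunnerPid, kill),
    {next_state, idle,
     Data#data{runner_pid=undefined, current_from=undefined},
     [{reply, From, {error, timeout}}]};
exporting({call, _From}, {export, _Span}, _Data) ->
    %% a span ended while an export was in flight; postpone the call so it is
    %% replayed, and exported, after we transition back to idle
    {keep_state_and_data, [postpone]}.
```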
@tsloughter tsloughter merged commit b77babd into open-telemetry:main Nov 6, 2021