Metrics: implement OTLP and its exporter #41

Merged
merged 11 commits into main from metrics/stable-exporters-2 on May 23, 2025

Conversation

inge4pres (Collaborator)

Reason for this PR

Note

This is a big PR; reviewing it commit by commit might be easier.

Part of #15: adds OTLP protocol implementation and its exporter counterpart, consuming the protocol.

Details

We are restricting the implementation to the bare minimum that satisfies the specification:

  • OTLP configuration via env vars (a sketch follows below)
  • OTLP only uses HTTP as transport (no gRPC, as there does not yet seem to be full support for it in protobuf-zig, but I may be wrong)
  • OTLP retries HTTP requests based on the response status, following the spec
  • The exporter relies on the same MetricExporter design that previously delivered the other stable exporters

Some miscellaneous additions and fixes arose from getting the protobuf conversion running; they are included in this PR.
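
To make the env-var bullet concrete, here is a minimal sketch of reading the standard OTLP variables (OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_PROTOCOL; 4318 is the spec default port for OTLP/HTTP). The OtlpEnvConfig struct and configFromEnv function are illustrative assumptions, not the ConfigOptions type introduced in this PR.

const std = @import("std");

// Illustrative config struct; the real ConfigOptions in this PR likely differs.
const OtlpEnvConfig = struct {
    endpoint: []const u8,
    protocol: []const u8,
};

fn configFromEnv(allocator: std.mem.Allocator) !OtlpEnvConfig {
    // OTEL_EXPORTER_OTLP_ENDPOINT: standard OTLP env var; 4318 is the spec default for OTLP/HTTP.
    const endpoint = std.process.getEnvVarOwned(allocator, "OTEL_EXPORTER_OTLP_ENDPOINT") catch |err| switch (err) {
        error.EnvironmentVariableNotFound => try allocator.dupe(u8, "http://localhost:4318"),
        else => return err,
    };
    // Only http/protobuf is in scope for this PR (no gRPC).
    const protocol = std.process.getEnvVarOwned(allocator, "OTEL_EXPORTER_OTLP_PROTOCOL") catch |err| switch (err) {
        error.EnvironmentVariableNotFound => try allocator.dupe(u8, "http/protobuf"),
        else => return err,
    };
    return .{ .endpoint = endpoint, .protocol = protocol };
}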

Added context

Test coverage for the OTLP exporter is not great, mainly because I am struggling to find a decent way of mocking the client.
The goal of this PR is to start filling in the gaps by implementing the code.
I will raise another issue with respect to testing OTLP, possibly with some integration tests or in the examples.

In the issue I'll mention that if we're able to spin up a Collector test harness (using the opentelemetry/proto/collector protobuf definitions), testing might be easier.

On top of that, the Collector messages code generation will have to be done anyway to handle Partial Success and Failure messages (which are now simply logged).
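
For illustration, that handling could eventually look roughly like the sketch below. The stub types mirror the collector proto shapes (ExportMetricsServiceResponse with an optional partial_success carrying rejected_data_points and error_message); they are assumptions standing in for the not-yet-generated code.

const std = @import("std");

// Stub types mirroring the collector proto shapes; assumptions, not generated code.
const ExportMetricsPartialSuccess = struct {
    rejected_data_points: i64 = 0,
    error_message: []const u8 = "",
};
const ExportMetricsServiceResponse = struct {
    partial_success: ?ExportMetricsPartialSuccess = null,
};

fn handleExportResponse(resp: ExportMetricsServiceResponse) void {
    // Per the OTLP spec, a partial success is not a transport failure,
    // but the rejected count and the server's message should be surfaced.
    if (resp.partial_success) |ps| {
        if (ps.rejected_data_points > 0) {
            std.log.warn("OTLP partial success: {d} data points rejected: {s}", .{ ps.rejected_data_points, ps.error_message });
        }
    }
}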

@@ -90,13 +90,15 @@ test "exporters/in_memory" {

try underTest.append(allocator, Measurements{
.meterName = "first-meter",
.meterVersion = "1.0",
inge4pres (Collaborator, Author) commented on May 14, 2025:

This will be made unnecessary once we work on #38.

@@ -18,7 +19,94 @@ const ManagedString = protobuf.ManagedString;
const pbcommon = @import("../../../opentelemetry/proto/common/v1.pb.zig");
const pbmetrics = @import("../../../opentelemetry/proto/metrics/v1.pb.zig");

pub fn toProtobufMetric(
inge4pres (Collaborator, Author):

This file used to be a helper for converting the internal metrics data model to the protobuf types.
It has now been turned into an implementation of an exporter (see where we call it in sdk/metrics/exporters.zig).

Comment on lines +181 to +184
var a = try std.ArrayList(pbmetrics.NumberDataPoint).initCapacity(allocator, data_points.len);
for (data_points) |dp| {
const attrs = try attributesToProtobufKeyValueList(allocator, dp.attributes);
const number_dp = pbmetrics.NumberDataPoint{
a.appendAssumeCapacity(pbmetrics.NumberDataPoint{
inge4pres (Collaborator, Author):

Small perf improvement: we save allocations since we already know the number of items in the list.
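
A standalone sketch of the pattern, separate from the data-point conversion above:

const std = @import("std");

test "preallocate then appendAssumeCapacity" {
    const allocator = std.testing.allocator;
    const inputs = [_]u64{ 1, 2, 3, 4 };

    // One allocation up front instead of growing the list on every append.
    var list = try std.ArrayList(u64).initCapacity(allocator, inputs.len);
    defer list.deinit();

    for (inputs) |v| {
        // Safe because exactly inputs.len slots were reserved above.
        list.appendAssumeCapacity(v * 10);
    }
    try std.testing.expectEqual(inputs.len, list.items.len);
}

Note that appendAssumeCapacity skips the capacity check, so it must only be used after a matching initCapacity or ensureTotalCapacity call.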

try std.testing.expectEqual(payload.len, 0);
}

/// Configuration options for the OTLP transport.

Comment on lines +507 to +525
// Implements the Exponential Backoff Retry strategy for HTTP requests.
// The default configuration values are not dictated by the OTLP spec.
// We choose to set a maximum number of retries, even if not specified in the spec,
// to avoid infinite loops in case of an unresponsive server.
const ExpBackoffRetry = struct {
const Self = @This();

allocator: std.mem.Allocator,
client: *http.Client,

max_retries: u32,
base_delay_ms: u64,
max_delay_ms: u64,

queue: std.PriorityQueue(FetchRequest, void, compareFetchRequest),
thread: ?std.Thread,
mutex: std.Thread.Mutex,
condition: std.Thread.Condition,
running: std.atomic.Value(bool),
inge4pres (Collaborator, Author):

I tried this to have a more structured approach to retries.
Turns out it's too complex, and I'm ditching it in #47 to simply fire off a new thread for each request that needs to be retried.

So you don't have to review this part.
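
For context, whichever shape the retry code ends up taking (the queue above or the per-request thread in #47), the delay schedule boils down to something like this minimal sketch; the parameter names mirror the ExpBackoffRetry fields, while the values in the test are illustrative, not the actual defaults.

const std = @import("std");

// Sketch of the exponential backoff delay computation only.
fn backoffDelayMs(base_delay_ms: u64, max_delay_ms: u64, attempt: u32) u64 {
    // base * 2^attempt, capped at max_delay_ms so waits stay bounded.
    return @min(std.math.shl(u64, base_delay_ms, attempt), max_delay_ms);
}

test "backoff grows exponentially and then caps" {
    try std.testing.expectEqual(@as(u64, 100), backoffDelayMs(100, 5_000, 0));
    try std.testing.expectEqual(@as(u64, 400), backoffDelayMs(100, 5_000, 2));
    try std.testing.expectEqual(@as(u64, 5_000), backoffDelayMs(100, 5_000, 10));
}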

Comment on lines +599 to +600
while (self.running.load(.acquire)) {
var req: FetchRequest = undefined;
inge4pres (Collaborator, Author):

There's a deadlock here: we .acquire the atomic in each loop iteration but never release it, so only the first retry runs and then it blocks forever.

const pbtrace = @import("opentelemetry/proto/trace/v1.pb.zig");

/// Export the data to the OTLP endpoint using the options configured with ConfigOptions.
pub fn Export(
inge4pres (Collaborator, Author):

This is the intended entrypoint for the exporter to call into the OTLP transport.
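
As a self-contained sketch of the call shape only: none of the names below are the actual API from this PR; ConfigOptions, Signal, and the Export parameters are placeholders chosen for illustration.

const std = @import("std");

const Signal = enum { metrics, traces, logs };
const ConfigOptions = struct { endpoint: []const u8 = "http://localhost:4318" };

fn Export(config: ConfigOptions, signal: Signal, payload: []const u8) !void {
    // A real implementation would POST `payload` to the per-signal path on config.endpoint.
    std.debug.print("POST {s}/v1/{s} ({d} bytes)\n", .{ config.endpoint, @tagName(signal), payload.len });
}

test "exporter hands an encoded payload to the transport entrypoint" {
    try Export(.{}, .metrics, "encoded-protobuf-bytes");
}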

@inge4pres inge4pres force-pushed the metrics/stable-exporters-2 branch from 903e852 to 68590a9 Compare May 18, 2025 12:39
@inge4pres inge4pres force-pushed the metrics/stable-exporters-2 branch from 68590a9 to 5f679a0 Compare May 18, 2025 12:47
inge4pres (Collaborator, Author):

Rebased and fixed conflicts, please review 🙏🏼

devatbosch (Collaborator) left a comment:

Hi @inge4pres, the changes look fine to me. Hence, approving from my side.

kakkoyun (Member) left a comment:

I've skimmed it and don't have any objections.
I will do a more in-depth review, hopefully soon.

Thanks for all the hard work!

@inge4pres inge4pres merged commit db78740 into main May 23, 2025
4 checks passed
@inge4pres inge4pres deleted the metrics/stable-exporters-2 branch May 23, 2025 08:49
Labels: enhancement (New feature or request), metrics, proto
3 participants