
Memory regression in opentelemetry prometheus exporter v0.57.0 with Go 1.24 #6788


Open
ns-jvillarfernandez opened this issue May 16, 2025 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@ns-jvillarfernandez

ns-jvillarfernandez commented May 16, 2025

Description

We've detected a memory leak / regression in one of our containers. The culprit seems to be go.opentelemetry.io/otel/exporters/prometheus v0.57.0, or the combination of upgrading go.opentelemetry.io/otel/exporters/prometheus to v0.57.0 and upgrading Go from 1.23 to 1.24.

  1. Our containers started to OOM after several hours of running.
  2. We verified that the traffic pattern hadn't changed.
  3. Attaching to a running pod and using go tool pprof, we saw that the allocations were tied to go.opentelemetry.io/otel/exporters/prometheus v0.57.0, specifically math/rand.newSource. Please see the commands and screenshot below.
$ kubectl port-forward pod/XXXXXXX 8086:8085
Forwarding from 127.0.0.1:8086 -> 8085
Forwarding from [::1]:8086 -> 8085
Handling connection for 8086

$ go tool pprof http://localhost:8086/debug/pprof/heap
Fetching profile over HTTP from http://localhost:8086/debug/pprof/heap
Saved profile in /Users/XXXXXXX/pprof/pprof.forward.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz
File: XXXXXXX
Build ID: 944b0f39f5443eb2ef822291ecd1bb226a3c768b
Type: inuse_space
Time: 2025-05-16 10:49:15 CEST
Entering interactive mode (type "help" for commands, "o" for options)

(pprof) top
Showing nodes accounting for 286.11MB, 80.53% of 355.27MB total
Dropped 159 nodes (cum <= 1.78MB)
Showing top 10 nodes out of 151
      flat  flat%   sum%        cum   cum%
  123.63MB 34.80% 34.80%   123.63MB 34.80%  math/rand.newSource (inline)
   61.10MB 17.20% 52.00%    61.10MB 17.20%  go.opentelemetry.io/otel/sdk/metric/exemplar.newStorage (inline)
   31.04MB  8.74% 60.73%    31.04MB  8.74%  go.opentelemetry.io/otel/sdk/metric/internal/aggregate.reset[go.shape.struct { FilteredAttributes []go.opentelemetry.io/otel/attribute.KeyValue; Time time.Time; Value go.shape.int64; SpanID []uint8 "json:\",omitempty\""; TraceID []uint8 "json:\",omitempty\"" }]
   25.04MB  7.05% 67.78%    25.04MB  7.05%  go.opentelemetry.io/otel/sdk/metric/internal/aggregate.reset[go.shape.struct { FilteredAttributes []go.opentelemetry.io/otel/attribute.KeyValue; Time time.Time; Value go.shape.float64; SpanID []uint8 "json:\",omitempty\""; TraceID []uint8 "json:\",omitempty\"" }]
   14.56MB  4.10% 71.88%    14.56MB  4.10%  bufio.NewWriterSize
    7.03MB  1.98% 73.86%     7.03MB  1.98%  bufio.NewReaderSize
       7MB  1.97% 75.83%    17.50MB  4.93%  go.opentelemetry.io/otel/exporters/prometheus.addExemplars[go.shape.int64]
    6.19MB  1.74% 77.57%    12.71MB  3.58%  io.copyBuffer
    5.52MB  1.55% 79.12%     5.52MB  1.55%  bytes.growSlice
       5MB  1.41% 80.53%        5MB  1.41%  go.opentelemetry.io/otel/attribute.computeDistinctFixed

Using the pprof web command:

[screenshot: pprof web call graph]
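For reference, the heap profile above is served by Go's standard net/http/pprof handlers. A minimal sketch of how such an endpoint is typically exposed is shown below; the :8085 port matches the pod port targeted by the kubectl port-forward above, while the rest (a dedicated goroutine serving http.DefaultServeMux) is illustrative and not taken from the affected service.

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// Serve the debug endpoints on the port that kubectl port-forward targets (assumed :8085).
	go func() {
		log.Println(http.ListenAndServe(":8085", nil))
	}()

	// ... rest of the application ...
	select {}
}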

Environment

  • OS: Linux
  • Architecture: x86_64
  • Go Version: 1.24
  • opentelemetry prometheus exporter version: v0.57.0

Steps To Reproduce

  1. Use Go 1.24 and go.opentelemetry.io/otel/exporters/prometheus v0.57.0 (see the reproduction sketch below).
  2. Leave the container running with traffic for several hours while Prometheus scrapes its metrics.
  3. Memory usage shows a clear increase over time until it reaches the limit and the container gets OOM-killed.

[screenshot: container memory usage growing over time until the OOM limit]
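For context, here is a minimal sketch of the kind of setup described in the steps above, assuming the default SDK configuration. The meter name, counter name, attribute, port, and traffic loop are illustrative and not taken from the affected service.

package main

import (
	"context"
	"log"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus/promhttp"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/prometheus"
	api "go.opentelemetry.io/otel/metric"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	// The Prometheus exporter acts as a pull-based reader for the SDK meter provider.
	exporter, err := prometheus.New()
	if err != nil {
		log.Fatal(err)
	}
	provider := sdkmetric.NewMeterProvider(sdkmetric.WithReader(exporter))
	otel.SetMeterProvider(provider)

	meter := otel.Meter("repro")
	counter, err := meter.Int64Counter("requests_total")
	if err != nil {
		log.Fatal(err)
	}

	// Simulate steady traffic so data points (and their exemplar storage) are exercised
	// on every scrape cycle.
	go func() {
		ctx := context.Background()
		for {
			counter.Add(ctx, 1, api.WithAttributes(attribute.String("path", "/demo")))
			time.Sleep(10 * time.Millisecond)
		}
	}()

	// Expose /metrics for Prometheus to scrape.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8085", nil))
}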

Expected behavior

No memory leak

@ns-jvillarfernandez ns-jvillarfernandez added the bug Something isn't working label May 16, 2025
@ns-jvillarfernandez
Author

Could be related to #6732.

@dmathieu
Member

#6732 hasn't been released yet; it's not included in v0.57.0.

@ns-obaro

> #6732 hasn't been released yet; it's not included in v0.57.0.

When is the next release scheduled? Thanks!

@dmathieu
Member

See #6793

@pree-dew
Contributor

@dmathieu Can I pick up this task?
