Skip to content

Propose using a different schema to represent Events in a span #37028

Open
@awangc

Description

@awangc

Component(s)

exporter/elasticsearch

Is your feature request related to a problem? Please describe.

When storing Span Events in elasticsearch, the event name becomes the key in the default mapping mode, under which different attributes are stored, e.g. if we have events with name "my-event-1", "my-event-2", then in Elasticsearch we'll have Events.my-event-1.time, Events.my-event-2.time, etc. This does not seem to follow the data format for events for a Span from opentelemetry collector, which are modeled as an array of Span_Event, in which a Span_Event will contain fields like time, name and array of attributes.

The issue I see with this approach is that if name is given arbitrary values (e.g. random UUIDs), then we could see an arbitrary increase in the number of keys, leading to mapping field explosion in Elasticsearch.

Describe the solution you'd like

Store span events as an array in elasticsearch, in which each element is an object with fields with time, name and array of attribute (with dropped attribute counts as another possible field - like the Span_Event class)

Admittedly this format may require nested objects which may bring its own performance issues, but it resembles more the data layout from opentelemetry pdata.

Describe alternatives you've considered

No response

Additional context

The schema proposed above would follow the same format for spans, e.g., we have Span.Name and Span.Attributes, and we'd have Event.Name and Event.Attributes, and more closely represents the Event as defined in opentelemetry

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions