[exporter/elasticsearch] Dynamically route documents by default #38500

Merged
merged 42 commits into from
Mar 14, 2025
Changes from 28 commits
3f35e80
Update readme
carsonip Mar 7, 2025
365d2d6
Add legacy config
carsonip Mar 7, 2025
9287519
Add config validation
carsonip Mar 7, 2025
73f16b3
Switch default *_dynamic_index to true
carsonip Mar 7, 2025
0e27f5d
Add elasticsearch index attr
carsonip Mar 10, 2025
c820266
Remove legacy prefix/suffix
carsonip Mar 10, 2025
956f573
Do not encode index attribute in otel mode
carsonip Mar 10, 2025
f1c544e
Test span events
carsonip Mar 10, 2025
3378e02
Ignore dynamic index config
carsonip Mar 10, 2025
4265639
Update readme
carsonip Mar 10, 2025
fd4b63a
Update validation
carsonip Mar 10, 2025
963fb57
Add deprecation warning
carsonip Mar 10, 2025
22b7952
Update readme
carsonip Mar 10, 2025
bdb75d9
Remove unused prefix/suffix
carsonip Mar 10, 2025
f5d1455
Rename elasticsearch._index to elasticsearch.index
carsonip Mar 10, 2025
6870aee
Update docs to be specific that span events are separate documents in…
carsonip Mar 11, 2025
68ac209
Update test name
carsonip Mar 11, 2025
bbc0b4d
Update readme
carsonip Mar 11, 2025
3f6d99e
Refactor newDocumentRouter
carsonip Mar 11, 2025
99db8e6
Add otel mode span event routing test
carsonip Mar 11, 2025
61a3bfa
Send span events to logs_index instead of traces_index if configured
carsonip Mar 11, 2025
b9a944e
Make linter happy
carsonip Mar 11, 2025
0187945
Clarify docs
carsonip Mar 11, 2025
232a98a
Add links
carsonip Mar 11, 2025
6aaa5d0
Document remove attr
carsonip Mar 11, 2025
c079201
Add changelog
carsonip Mar 11, 2025
8338d13
Update deprecation warning
carsonip Mar 11, 2025
d78af32
Not use deprecated config in bench test
carsonip Mar 11, 2025
5c9c85a
Fix receiver based routing overwriting data_stream.dataset
carsonip Mar 11, 2025
769edb2
Address review comments
carsonip Mar 12, 2025
52b89fa
Merge branch 'main' into ds-routing
carsonip Mar 12, 2025
bb44b7d
Update exporter/elasticsearchexporter/README.md
carsonip Mar 12, 2025
4837cd3
Update exporter/elasticsearchexporter/README.md
carsonip Mar 12, 2025
0d86022
Update exporter/elasticsearchexporter/README.md
carsonip Mar 12, 2025
fdc0087
Update exporter/elasticsearchexporter/README.md
carsonip Mar 12, 2025
ed8d857
Update exporter/elasticsearchexporter/README.md
carsonip Mar 12, 2025
49139fc
Apply suggestions from code review
carsonip Mar 12, 2025
378f81f
Update stale index config comment
carsonip Mar 12, 2025
b07eb95
Update otel data mode exceptions
carsonip Mar 12, 2025
91112ab
Mention in addition
carsonip Mar 12, 2025
7f34bdd
Merge branch 'main' into ds-routing
carsonip Mar 13, 2025
87c9593
Merge branch 'main' into ds-routing
carsonip Mar 14, 2025
31 changes: 31 additions & 0 deletions .chloggen/elasticsearchexporter_dynamic-routing-default.yaml
@@ -0,0 +1,31 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: breaking

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: elasticsearchexporter

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Dynamically route documents by default unless `{logs,metrics,traces}_index` is non-empty

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [38361]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:
Overhaul in document routing.
Deprecate and make `{logs,metrics,traces}_dynamic_index` config no-op.
Config validation error on `{logs,metrics,traces}_dynamic_index::enabled` and `{logs,metrics,traces}_index` set at the same time, as users who rely on dynamic index should not set `{logs,metrics,traces}_index`.
Remove `elasticsearch.index.{prefix,suffix}` handling. Replace it with `elasticsearch.index` handling that uses attribute value as index directly. Users rely on the previously supported `elasticsearch.index.prefix` and `elasticsearch.index.suffix` should migrate to a transform processor that sets `elasticsearch.index`.

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
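
For users migrating off the removed `elasticsearch.index.prefix` and `elasticsearch.index.suffix` attributes, the value to place in `elasticsearch.index` is the concatenation the exporter used to build, `${elasticsearch.index.prefix}${logs_index}${elasticsearch.index.suffix}` per the old README text. The Go sketch below only illustrates that string computation. `legacyIndexName` is a hypothetical helper, not part of the exporter, and in a real pipeline the attribute would be set by a transform processor rather than by Go code.

```go
package main

import "fmt"

// legacyIndexName reproduces the index name that the removed prefix/suffix
// handling produced: prefix + configured index + suffix.
// Hypothetical helper, for illustration only.
func legacyIndexName(prefix, index, suffix string) string {
	return prefix + index + suffix
}

func main() {
	// The printed value is what a migrated pipeline would set as the
	// `elasticsearch.index` attribute.
	fmt.Println(legacyIndexName("team-a-", "logs-generic-default", "-prod"))
}
```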
46 changes: 27 additions & 19 deletions exporter/elasticsearchexporter/README.md
@@ -118,33 +118,41 @@ Using the common `batcher` functionality provides several benefits over the defa

### Elasticsearch document routing

Telemetry data will be written to signal specific data streams by default:
logs to `logs-generic-default`, metrics to `metrics-generic-default`, and traces to `traces-generic-default`.
Documents are statically or dynamically routed to the target index / data stream in the following order. The first routing mode that applies will be used.
1. "Static mode": To `logs_index` for log records, `metrics_index` for data points and `traces_index` for spans, if these configs are not empty respectively. In OTel mapping mode (`mapping::mode: otel`), span events are separate documents routed to `logs_index` if non-empty.

Review comment (Contributor): Perhaps instead of repeating the statement about otel mode span events, add a paragraph after the numbered list mentioning that in otel mode, span events are considered log records and routed as such?

Reply (Contributor Author): Done in 769edb2


2. "Dynamic - Index attribute mode": To index name in `elasticsearch.index` attribute (precedence: log record / data point / span attribute [^3] > scope attribute > resource attribute) if the attribute exists. In OTel mapping mode (`mapping::mode: otel`), span events are separate documents routed according to span events attributes, not span attributes.
3. "Dynamic - Data stream routing mode": To data stream constructed from `${data_stream.type}-${data_stream.dataset}-${data_stream.namespace}`,
where `data_stream.type` is `logs` for log records, `metrics` for data points, and `traces` for spans.
In OTel mapping mode (`mapping::mode: otel`), span events are separate documents that have `data_stream.type: logs` and are routed using span event attributes, not span attributes.
Note that in OTel mapping mode, `data_stream.dataset` will always be appended with `.otel`.
In a special case with `mapping::mode: bodymap`, `data_stream.type` field (valid values: `logs`, `metrics`) can be dynamically set from attributes.
The resulting docs will contain the corresponding `data_stream.*` fields, see restrictions applied to [Data Stream Fields](https://www.elastic.co/guide/en/ecs/current/ecs-data_stream.html).
1. `data_stream.dataset` or `data_stream.namespace` in attributes (precedence: log record / data point / span attribute [^3] > scope attribute > resource attribute)
2. Otherwise, if scope name matches regex `/receiver/(\w*receiver)`, `data_stream.dataset` will be capture group #1
3. Otherwise, `data_stream.dataset` falls back to `generic` and `data_stream.namespace` falls back to `default`.

[^3]: Additionally, span event attribute in OTel mode
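
The routing order in the list above can be sketched in Go roughly as follows. This is a simplified illustration, not the exporter's implementation: `attrs`, `lookup`, and `route` are hypothetical names, attributes are modelled as plain string maps instead of `pcommon.Map`, and the OTel-mode details (the `.otel` suffix and span events) are left out. It also assumes that the receiver regex and the `generic` fallback apply only when no `data_stream.dataset` attribute is present.

```go
package main

import (
	"fmt"
	"regexp"
)

// attrs is a simplified stand-in for an attribute map (string values only).
type attrs map[string]string

// receiverRe is the scope-name pattern quoted in the README text.
var receiverRe = regexp.MustCompile(`/receiver/(\w*receiver)`)

// lookup returns the first value found for key, checking the most specific
// attribute set first: record > scope > resource.
func lookup(key string, record, scope, resource attrs) (string, bool) {
	for _, m := range []attrs{record, scope, resource} {
		if v, ok := m[key]; ok {
			return v, true
		}
	}
	return "", false
}

// route picks the target index or data stream in the documented order.
// staticIndex is the configured *_index; dsType is logs, metrics, or traces.
func route(staticIndex, dsType, scopeName string, record, scope, resource attrs) string {
	// 1. Static mode: a non-empty *_index always wins.
	if staticIndex != "" {
		return staticIndex
	}
	// 2. Dynamic, index attribute mode.
	if idx, ok := lookup("elasticsearch.index", record, scope, resource); ok {
		return idx
	}
	// 3. Dynamic, data stream routing mode.
	dataset, ok := lookup("data_stream.dataset", record, scope, resource)
	if !ok {
		dataset = "generic"
		if m := receiverRe.FindStringSubmatch(scopeName); m != nil {
			dataset = m[1] // capture group #1 of the receiver regex
		}
	}
	namespace, ok := lookup("data_stream.namespace", record, scope, resource)
	if !ok {
		namespace = "default"
	}
	return fmt.Sprintf("%s-%s-%s", dsType, dataset, namespace)
}

func main() {
	resource := attrs{"data_stream.namespace": "prod"}
	fmt.Println(route("", "logs", "", nil, nil, resource))
}
```

Passing the three attribute maps separately keeps the precedence rule visible in one place (`lookup`), which is the part of the behaviour most likely to surprise users coming from the old prefix/suffix handling, where resource attributes took priority.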

This can be customised through the following settings:

- `logs_index`: The [index] or [data stream] name to publish events to. The default value is `logs-generic-default`
- `logs_index` (optional): The [index] or [data stream] name to publish logs (and span events in OTel mapping mode) to. `logs_index` should be empty unless all documents should be sent to the same index.

- `logs_dynamic_index` (optional): uses resource, scope, or log record attributes to dynamically construct index name.
- `enabled`(default=false): Enable/Disable dynamic index for log records. If `data_stream.dataset` or `data_stream.namespace` exist in attributes (precedence: log record attribute > scope attribute > resource attribute), they will be used to dynamically construct index name in the form `logs-${data_stream.dataset}-${data_stream.namespace}`. In a special case with `mapping::mode: bodymap`, `data_stream.type` field (valid values: `logs`, `metrics`) is also supported to dynamically construct index in the form `${data_stream.type}-${data_stream.dataset}-${data_stream.namespace}`. Otherwise, if
`elasticsearch.index.prefix` or `elasticsearch.index.suffix` exist in attributes (precedence: resource attribute > scope attribute > log record attribute), they will be used to dynamically construct index name in the form `${elasticsearch.index.prefix}${logs_index}${elasticsearch.index.suffix}`. Otherwise, if scope name matches regex `/receiver/(\w*receiver)`, `data_stream.dataset` will be capture group #1. Otherwise, the index name falls back to `logs-generic-default`, and `logs_index` config will be ignored. Except for prefix/suffix attribute presence, the resulting docs will contain the corresponding `data_stream.*` fields, see restrictions applied to [Data Stream Fields](https://www.elastic.co/guide/en/ecs/current/ecs-data_stream.html).
- `enabled`(DEPRECATED): No-op. Documents are now always routed dynamically unless `logs_index` is not empty. Will be removed in a future version.

- `metrics_index` (optional): The [index] or [data stream] name to publish metrics to. The default value is `metrics-generic-default`.
⚠️ Note that metrics support is currently in development.
- `metrics_index` (optional): The [index] or [data stream] name to publish metrics to. `metrics_index` should be empty unless all documents should be sent to the same index. Note that metrics support is currently in development.

- `metrics_dynamic_index` (optional): uses resource, scope or data point attributes to dynamically construct index name.
⚠️ Note that metrics support is currently in development.
- `enabled`(default=true): Enable/disable dynamic index for metrics. If `data_stream.dataset` or `data_stream.namespace` exist in attributes (precedence: data point attribute > scope attribute > resource attribute), they will be used to dynamically construct index name in the form `metrics-${data_stream.dataset}-${data_stream.namespace}`. Otherwise, if
`elasticsearch.index.prefix` or `elasticsearch.index.suffix` exist in attributes (precedence: resource attribute > scope attribute > data point attribute), they will be used to dynamically construct index name in the form `${elasticsearch.index.prefix}${metrics_index}${elasticsearch.index.suffix}`. Otherwise, if scope name matches regex `/receiver/(\w*receiver)`, `data_stream.dataset` will be capture group #1. Otherwise, the index name falls back to `metrics-generic-default`, and `metrics_index` config will be ignored. Except for prefix/suffix attribute presence, the resulting docs will contain the corresponding `data_stream.*` fields, see restrictions applied to [Data Stream Fields](https://www.elastic.co/guide/en/ecs/current/ecs-data_stream.html).
- `enabled`(DEPRECATED): No-op. Documents are now always routed dynamically unless `metrics_index` is not empty. Will be removed in a future version.

- `traces_index`: The [index] or [data stream] name to publish traces to. The default value is `traces-generic-default`.
- `traces_index` (optional): The [index] or [data stream] name to publish traces to. `traces_index` should be empty unless all documents should be sent to the same index.

- `traces_dynamic_index` (optional): uses resource, scope, or span attributes to dynamically construct index name.
- `enabled`(default=false): Enable/Disable dynamic index for trace spans. If `data_stream.dataset` or `data_stream.namespace` exist in attributes (precedence: span attribute > scope attribute > resource attribute), they will be used to dynamically construct index name in the form `traces-${data_stream.dataset}-${data_stream.namespace}`. Otherwise, if
`elasticsearch.index.prefix` or `elasticsearch.index.suffix` exist in attributes (precedence: resource attribute > scope attribute > span attribute), they will be used to dynamically construct index name in the form `${elasticsearch.index.prefix}${traces_index}${elasticsearch.index.suffix}`. Otherwise, if scope name matches regex `/receiver/(\w*receiver)`, `data_stream.dataset` will be capture group #1. Otherwise, the index name falls back to `traces-generic-default`, and `traces_index` config will be ignored. Except for prefix/suffix attribute presence, the resulting docs will contain the corresponding `data_stream.*` fields, see restrictions applied to [Data Stream Fields](https://www.elastic.co/guide/en/ecs/current/ecs-data_stream.html). There is an exception for span events under OTel mapping mode (`mapping::mode: otel`), where span event attributes instead of span attributes are considered, and `data_stream.type` is always `logs` instead of `traces` such that documents are routed to `logs-${data_stream.dataset}-${data_stream.namespace}`.
- `enabled`(DEPRECATED): No-op. Documents are now always routed dynamically unless `traces_index` is not empty. Will be removed in a future version.

- `logstash_format` (optional): Logstash format compatibility. Logs, metrics and traces can be written into an index in Logstash format.
- `enabled`(default=false): Enable/disable Logstash format compatibility. When `logstash_format.enabled` is `true`, the index name is composed using `(logs|metrics|traces)_index` or `(logs|metrics|traces)_dynamic_index` as prefix and the date as suffix,
e.g: If `logs_index` or `logs_dynamic_index` is equal to `logs-generic-default`, your index will become `logs-generic-default-YYYY.MM.DD`.
- `enabled`(default=false): Enable/disable Logstash format compatibility. When `logstash_format.enabled` is `true`, the index name is composed using the above dynamic routing rules as prefix and the date as suffix,
e.g: If the computed index name is `logs-generic-default`, the resulting index will be `logs-generic-default-YYYY.MM.DD`.
The last string appended belongs to the date when the data is being generated.
- `prefix_separator`(default=`-`): Set a separator between logstash_prefix and date.
- `date_format`(default=`%Y.%m.%d`): Time format (based on strftime) to generate the second part of the Index name.
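
As an illustration of the Logstash format naming described above, the following Go sketch appends the separator and a date to an already computed index name. It is not the exporter's code: `logstashIndex` and `date` are hypothetical helpers, only the default `date_format` (`%Y.%m.%d`, written as the Go layout `2006.01.02`) is handled, and the use of UTC in `date` is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// date is a small helper for the example; UTC is an assumption.
func date(year, month, day int) time.Time {
	return time.Date(year, time.Month(month), day, 0, 0, 0, 0, time.UTC)
}

// logstashIndex appends separator and a date suffix to a computed index name.
// "2006.01.02" is the Go layout equivalent of the strftime pattern %Y.%m.%d;
// other date_format values are not handled in this sketch.
func logstashIndex(index, separator string, t time.Time) string {
	return index + separator + t.Format("2006.01.02")
}

func main() {
	fmt.Println(logstashIndex("logs-generic-default", "-", date(2025, 3, 14)))
}
```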
@@ -188,12 +196,12 @@ and `data_stream.namespace`. Instead of serializing these values under the `*att
they are put at the root of the document, to conform with the conventions of the data stream naming
scheme that maps these as `constant_keyword` fields.

`data_stream.dataset` will always be appended with `.otel`. It is recommended to use with
`*_dynamic_index::enabled: true` (e.g. `logs_dynamic_index::enabled`) to route documents to data stream
`${data_stream.type}-${data_stream.dataset}-${data_stream.namespace}`.
`data_stream.dataset` will always be appended with `.otel` if [dynamic data stream routing mode](#elasticsearch-document-routing) is active.

Span events are stored in separate documents. They will be routed with `data_stream.type` set to
`logs` if `traces_dynamic_index::enabled` is `true`.
`logs` if [dynamic data stream routing mode](#elasticsearch-document-routing) is active.

Attribute `elasticsearch.index` will be removed from the final document if exists.
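
The two OTel mapping mode rules stated here (the `.otel` dataset suffix and span events being routed as `logs`) can be sketched as a small Go function. `otelDataStream` is a hypothetical name, and the sketch appends `.otel` unconditionally, since the text does not say whether a dataset that already ends in `.otel` is treated specially.

```go
package main

import "fmt"

// otelDataStream builds the data stream name used by dynamic data stream
// routing in OTel mapping mode, following the two rules in the README text:
// ".otel" is appended to data_stream.dataset, and span events use
// data_stream.type "logs" instead of "traces".
// signalType is "logs", "metrics", or "traces".
func otelDataStream(signalType, dataset, namespace string, isSpanEvent bool) string {
	if isSpanEvent {
		signalType = "logs"
	}
	return fmt.Sprintf("%s-%s.otel-%s", signalType, dataset, namespace)
}

func main() {
	fmt.Println(otelDataStream("traces", "generic", "default", false)) // a span
	fmt.Println(otelDataStream("traces", "generic", "default", true))  // a span event
}
```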

| Signal | Supported |
| --------- | ------------------ |
2 changes: 0 additions & 2 deletions exporter/elasticsearchexporter/attribute.go
@@ -7,8 +7,6 @@ import "go.opentelemetry.io/collector/pdata/pcommon"

 // dynamic index attribute key constants
 const (
-	indexPrefix                = "elasticsearch.index.prefix"
-	indexSuffix                = "elasticsearch.index.suffix"
 	defaultDataStreamDataset   = "generic"
 	defaultDataStreamNamespace = "default"
 	defaultDataStreamTypeLogs  = "logs"
31 changes: 25 additions & 6 deletions exporter/elasticsearchexporter/config.go
@@ -41,18 +41,15 @@ type Config struct {
 	NumWorkers int `mapstructure:"num_workers"`

 	// This setting is required when logging pipelines used.
-	LogsIndex string `mapstructure:"logs_index"`
-	// fall back to pure LogsIndex, if 'elasticsearch.index.prefix' or 'elasticsearch.index.suffix' are not found in resource or attribute (prio: resource > attribute)
+	LogsIndex        string              `mapstructure:"logs_index"`
 	LogsDynamicIndex DynamicIndexSetting `mapstructure:"logs_dynamic_index"`

 	// This setting is required when the exporter is used in a metrics pipeline.
-	MetricsIndex string `mapstructure:"metrics_index"`
-	// fall back to pure MetricsIndex, if 'elasticsearch.index.prefix' or 'elasticsearch.index.suffix' are not found in resource attributes
+	MetricsIndex        string              `mapstructure:"metrics_index"`
 	MetricsDynamicIndex DynamicIndexSetting `mapstructure:"metrics_dynamic_index"`

 	// This setting is required when traces pipelines used.
-	TracesIndex string `mapstructure:"traces_index"`
-	// fall back to pure TracesIndex, if 'elasticsearch.index.prefix' or 'elasticsearch.index.suffix' are not found in resource or attribute (prio: resource > attribute)
+	TracesIndex        string              `mapstructure:"traces_index"`
 	TracesDynamicIndex DynamicIndexSetting `mapstructure:"traces_dynamic_index"`

 	// LogsDynamicID configures whether log record attribute `elasticsearch.document_id` is set as the document ID in ES.
@@ -118,6 +115,9 @@ type LogstashFormatSettings struct {
 }

 type DynamicIndexSetting struct {
+	// Enabled enables dynamic index routing.
+	//
+	// Deprecated: This config is now ignored. Dynamic index routing is always done by default.
 	Enabled bool `mapstructure:"enabled"`
 }

@@ -281,6 +281,16 @@ func (cfg *Config) Validate() error {
 		return errors.New("retry::max_retries should be non-negative")
 	}

+	if cfg.LogsIndex != "" && cfg.LogsDynamicIndex.Enabled {
+		return errors.New("must not specify both logs_index and logs_dynamic_index; logs_index should be empty unless all documents should be sent to the same index")
+	}
+	if cfg.MetricsIndex != "" && cfg.MetricsDynamicIndex.Enabled {
+		return errors.New("must not specify both metrics_index and metrics_dynamic_index; metrics_index should be empty unless all documents should be sent to the same index")
+	}
+	if cfg.TracesIndex != "" && cfg.TracesDynamicIndex.Enabled {
+		return errors.New("must not specify both traces_index and traces_dynamic_index; traces_index should be empty unless all documents should be sent to the same index")
+	}
+
 	return nil
 }

@@ -390,4 +400,13 @@ func handleDeprecatedConfig(cfg *Config, logger *zap.Logger) {
 		// Do not set cfg.Retry.Enabled = false if cfg.Retry.MaxRequest = 1 to avoid breaking change on behavior
 		logger.Warn("retry::max_requests has been deprecated, and will be removed in a future version. Use retry::max_retries instead.")
 	}
+	if cfg.LogsDynamicIndex.Enabled {
+		logger.Warn("logs_dynamic_index::enabled has been deprecated, and will be removed in a future version. It is now a no-op. Dynamic document routing is now the default. See Elasticsearch Exporter README.")
+	}
+	if cfg.MetricsDynamicIndex.Enabled {
+		logger.Warn("metrics_dynamic_index::enabled has been deprecated, and will be removed in a future version. It is now a no-op. Dynamic document routing is now the default. See Elasticsearch Exporter README.")
+	}
+	if cfg.TracesDynamicIndex.Enabled {
+		logger.Warn("traces_dynamic_index::enabled has been deprecated, and will be removed in a future version. It is now a no-op. Dynamic document routing is now the default. See Elasticsearch Exporter README.")
+	}
 }
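
To make the combined effect of the new validation and the deprecation handling in `config.go` easier to see, here is a minimal Go sketch covering the logs settings only. `routingConfig` is a hypothetical, trimmed-down stand-in for the exporter's `Config`, and the error text is shortened. The point is which combinations are accepted: only a non-empty `logs_index` together with `logs_dynamic_index::enabled: true` is rejected.

```go
package main

import (
	"errors"
	"fmt"
)

// routingConfig is a hypothetical stand-in for the exporter's Config,
// keeping only the logs fields relevant to the new check.
type routingConfig struct {
	LogsIndex               string
	LogsDynamicIndexEnabled bool
}

// validate mirrors the logs branch of the check added to Config.Validate.
func (c routingConfig) validate() error {
	if c.LogsIndex != "" && c.LogsDynamicIndexEnabled {
		return errors.New("must not specify both logs_index and logs_dynamic_index")
	}
	return nil
}

func main() {
	cases := []routingConfig{
		{},                              // dynamic routing, the new default
		{LogsIndex: "my-logs"},          // static routing
		{LogsDynamicIndexEnabled: true}, // accepted; the flag is a deprecated no-op
		{LogsIndex: "my-logs", LogsDynamicIndexEnabled: true}, // rejected
	}
	for _, c := range cases {
		fmt.Printf("%+v -> %v\n", c, c.validate())
	}
}
```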
12 changes: 3 additions & 9 deletions exporter/elasticsearchexporter/config_test.go
@@ -63,13 +63,11 @@ func TestConfig(t *testing.T) {
 					QueueSize: exporterhelper.NewDefaultQueueConfig().QueueSize,
 				},
 				Endpoints: []string{"https://elastic.example.com:9200"},
-				LogsIndex: "logs-generic-default",
 				LogsDynamicIndex: DynamicIndexSetting{
 					Enabled: false,
 				},
-				MetricsIndex: "metrics-generic-default",
 				MetricsDynamicIndex: DynamicIndexSetting{
-					Enabled: true,
+					Enabled: false,
 				},
 				TracesIndex: "trace_index",
 				TracesDynamicIndex: DynamicIndexSetting{
@@ -142,11 +140,9 @@ func TestConfig(t *testing.T) {
 				LogsDynamicIndex: DynamicIndexSetting{
 					Enabled: false,
 				},
-				MetricsIndex: "metrics-generic-default",
 				MetricsDynamicIndex: DynamicIndexSetting{
-					Enabled: true,
+					Enabled: false,
 				},
-				TracesIndex: "traces-generic-default",
 				TracesDynamicIndex: DynamicIndexSetting{
 					Enabled: false,
 				},
@@ -213,15 +209,13 @@ func TestConfig(t *testing.T) {
 					QueueSize: exporterhelper.NewDefaultQueueConfig().QueueSize,
 				},
 				Endpoints: []string{"http://localhost:9200"},
-				LogsIndex: "logs-generic-default",
 				LogsDynamicIndex: DynamicIndexSetting{
 					Enabled: false,
 				},
 				MetricsIndex: "my_metric_index",
 				MetricsDynamicIndex: DynamicIndexSetting{
-					Enabled: true,
+					Enabled: false,
 				},
-				TracesIndex: "traces-generic-default",
 				TracesDynamicIndex: DynamicIndexSetting{
 					Enabled: false,
 				},