Commit 662feae

[exporter/elasticsearch] Dynamically route documents by default (open-telemetry#38500)
#### Description

Breaking change. Overhaul in document routing.

New document routing logic:

```
Documents are statically or dynamically routed to the target index / data stream in the following order. The first routing mode that applies will be used.
1. "Static mode": Route to `logs_index` for log records, `metrics_index` for data points and `traces_index` for spans, if these configs are not empty respectively. [^3]
2. "Dynamic - Index attribute mode": Route to index name specified in `elasticsearch.index` attribute (precedence: log record / data point / span attribute > scope attribute > resource attribute) if the attribute exists. [^3]
3. "Dynamic - Data stream routing mode": Route to data stream constructed from `${data_stream.type}-${data_stream.dataset}-${data_stream.namespace}`,
where `data_stream.type` is `logs` for log records, `metrics` for data points, and `traces` for spans, and is static. [^3]
In a special case with `mapping::mode: bodymap`, `data_stream.type` field (valid values: `logs`, `metrics`) can be dynamically set from attributes.
The resulting documents will contain the corresponding `data_stream.*` fields, see restrictions applied to [Data Stream Fields](https://www.elastic.co/guide/en/ecs/current/ecs-data_stream.html).
    1. `data_stream.dataset` or `data_stream.namespace` in attributes (precedence: log record / data point / span attribute > scope attribute > resource attribute)
    2. Otherwise, if scope name matches regex `/receiver/(\w*receiver)`, `data_stream.dataset` will be capture group #1
    3. Otherwise, `data_stream.dataset` falls back to `generic` and `data_stream.namespace` falls back to `default`.
```

```
In OTel mapping mode (`mapping::mode: otel`), there is special handling in addition to the above document routing rules in [Elasticsearch document routing](#elasticsearch-document-routing).
The order to determine the routing mode is the same as [Elasticsearch document routing](#elasticsearch-document-routing).

1. "Static mode": Span events are separate documents routed to `logs_index` if non-empty.
2. "Dynamic - Index attribute mode": Span events are separate documents routed using attribute `elasticsearch.index` (precedence: span event attribute > scope attribute > resource attribute) if the attribute exists.
3. "Dynamic - Data stream routing mode":
    - For all documents, `data_stream.dataset` will always be appended with `.otel`.
    - As a special case of (3)(1) in [Elasticsearch document routing](#elasticsearch-document-routing), span events are separate documents that have `data_stream.type: logs` and are routed using data stream attributes (precedence: span event attribute > scope attribute > resource attribute)
```

Effective changes:
- Deprecate and make `{logs,metrics,traces}_dynamic_index` config no-op
- Config validation error on `{logs,metrics,traces}_dynamic_index::enabled` and `{logs,metrics,traces}_index` set at the same time, as users who rely on dynamic index should not set `{logs,metrics,traces}_index`.
- Remove `elasticsearch.index.{prefix,suffix}` handling. Replace it with `elasticsearch.index` handling that uses attribute value as index directly. Users who rely on the previously supported `elasticsearch.index.prefix` and `elasticsearch.index.suffix` should migrate to a transform processor that sets `elasticsearch.index`.
- Fix a bug where receiver-based routing overwrites data_stream.dataset.

Should be released together with open-telemetry#38458

#### Link to tracking issue

Fixes open-telemetry#38361

#### Testing

#### Documentation

---------

Co-authored-by: Andrzej Stencel <[email protected]>
1 parent c232180 commit 662feae

16 files changed (+344, -239 lines changed)
@@ -0,0 +1,32 @@
1+
# Use this changelog template to create an entry for release notes.
2+
3+
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
4+
change_type: breaking
5+
6+
# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
7+
component: elasticsearchexporter
8+
9+
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
10+
note: Dynamically route documents by default unless `{logs,metrics,traces}_index` is non-empty
11+
12+
# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
13+
issues: [38361]
14+
15+
# (Optional) One or more lines of additional information to render under the primary note.
16+
# These lines will be padded with 2 spaces and then inserted directly into the document.
17+
# Use pipe (|) for multiline entries.
18+
subtext:
19+
Overhaul in document routing.
20+
Deprecate and make `{logs,metrics,traces}_dynamic_index` config no-op.
21+
Config validation error on `{logs,metrics,traces}_dynamic_index::enabled` and `{logs,metrics,traces}_index` set at the same time, as users who rely on dynamic index should not set `{logs,metrics,traces}_index`.
22+
Remove `elasticsearch.index.{prefix,suffix}` handling. Replace it with `elasticsearch.index` handling that uses attribute value as index directly. Users who rely on the previously supported `elasticsearch.index.prefix` and `elasticsearch.index.suffix` should migrate to a transform processor that sets `elasticsearch.index`.
23+
Fix a bug where receiver-based routing overwrites data_stream.dataset.
24+
25+
# If your change doesn't affect end users or the exported elements of any package,
26+
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
27+
# Optional: The change log or logs in which this entry should be included.
28+
# e.g. '[user]' or '[user, api]'
29+
# Include 'user' if the change is relevant to end users.
30+
# Include 'api' if there is a change to a library API.
31+
# Default: '[user]'
32+
change_logs: [user]

exporter/elasticsearchexporter/README.md

+37-16
@@ -118,33 +118,39 @@ Using the common `batcher` functionality provides several benefits over the defa
118118

119119
### Elasticsearch document routing
120120

121-
Telemetry data will be written to signal specific data streams by default:
122-
logs to `logs-generic-default`, metrics to `metrics-generic-default`, and traces to `traces-generic-default`.
121+
Documents are statically or dynamically routed to the target index / data stream in the following order. The first routing mode that applies will be used.
122+
1. "Static mode": Route to `logs_index` for log records, `metrics_index` for data points and `traces_index` for spans, if these configs are not empty respectively. [^3]
123+
2. "Dynamic - Index attribute mode": Route to index name specified in `elasticsearch.index` attribute (precedence: log record / data point / span attribute > scope attribute > resource attribute) if the attribute exists. [^3]
124+
3. "Dynamic - Data stream routing mode": Route to data stream constructed from `${data_stream.type}-${data_stream.dataset}-${data_stream.namespace}`,
125+
where `data_stream.type` is `logs` for log records, `metrics` for data points, and `traces` for spans, and is static. [^3]
126+
In a special case with `mapping::mode: bodymap`, `data_stream.type` field (valid values: `logs`, `metrics`) can be dynamically set from attributes.
127+
The resulting documents will contain the corresponding `data_stream.*` fields, see restrictions applied to [Data Stream Fields](https://www.elastic.co/guide/en/ecs/current/ecs-data_stream.html).
128+
1. `data_stream.dataset` or `data_stream.namespace` in attributes (precedence: log record / data point / span attribute > scope attribute > resource attribute)
129+
2. Otherwise, if scope name matches regex `/receiver/(\w*receiver)`, `data_stream.dataset` will be capture group #1
130+
3. Otherwise, `data_stream.dataset` falls back to `generic` and `data_stream.namespace` falls back to `default`.
131+
132+
[^3]: See additional handling in [Document routing exceptions for OTel data mode](#document-routing-exceptions-for-otel-data-mode)
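
The routing order above can be sketched in Go as follows. This is a minimal illustration, not the exporter's implementation: the `attrs` type, `lookup`, and `route` are hypothetical names, attributes are modelled as plain string maps, and the sketch assumes the receiver-based fallback applies only when `data_stream.dataset` is not set in any attribute. The `bodymap` special case is not covered.

```go
package main

import (
	"fmt"
	"regexp"
)

// attrs is a hypothetical stand-in for the three attribute levels,
// listed in the precedence order given in the README.
type attrs struct {
	record   map[string]string // log record / data point / span attributes
	scope    map[string]string
	resource map[string]string
}

// lookup returns the first value found for key: record, then scope, then resource.
func (a attrs) lookup(key string) (string, bool) {
	for _, m := range []map[string]string{a.record, a.scope, a.resource} {
		if v, ok := m[key]; ok {
			return v, true
		}
	}
	return "", false
}

// receiverRe is the scope-name pattern quoted in the README.
var receiverRe = regexp.MustCompile(`/receiver/(\w*receiver)`)

// route returns the target index or data stream name.
// staticIndex stands in for logs_index / metrics_index / traces_index;
// dsType is "logs", "metrics" or "traces".
func route(staticIndex, dsType, scopeName string, a attrs) string {
	// 1. Static mode: a non-empty configured index always wins.
	if staticIndex != "" {
		return staticIndex
	}
	// 2. Dynamic - index attribute mode.
	if idx, ok := a.lookup("elasticsearch.index"); ok {
		return idx
	}
	// 3. Dynamic - data stream routing mode.
	dataset, hasDataset := a.lookup("data_stream.dataset")
	namespace, hasNamespace := a.lookup("data_stream.namespace")
	if !hasDataset {
		// Assumption: receiver-based routing never overrides an explicit dataset.
		if m := receiverRe.FindStringSubmatch(scopeName); m != nil {
			dataset = m[1]
		} else {
			dataset = "generic"
		}
	}
	if !hasNamespace {
		namespace = "default"
	}
	return dsType + "-" + dataset + "-" + namespace
}

func main() {
	a := attrs{resource: map[string]string{"data_stream.namespace": "prod"}}
	fmt.Println(route("", "logs", "", a)) // prints "logs-generic-prod"
}
```

With no static index and no attributes at all, `route` returns `logs-generic-default`, `metrics-generic-default`, or `traces-generic-default`, which matches the previous default data stream names.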
133+
123134
This can be customised through the following settings:
124135

125-
- `logs_index`: The [index] or [data stream] name to publish events to. The default value is `logs-generic-default`
136+
- `logs_index` (optional): The [index] or [data stream] name to publish logs (and span events in OTel mapping mode) to. `logs_index` should be empty unless all logs should be sent to the same index.
126137

127138
- `logs_dynamic_index` (optional): uses resource, scope, or log record attributes to dynamically construct index name.
128-
- `enabled`(default=false): Enable/Disable dynamic index for log records. If `data_stream.dataset` or `data_stream.namespace` exist in attributes (precedence: log record attribute > scope attribute > resource attribute), they will be used to dynamically construct index name in the form `logs-${data_stream.dataset}-${data_stream.namespace}`. In a special case with `mapping::mode: bodymap`, `data_stream.type` field (valid values: `logs`, `metrics`) is also supported to dynamically construct index in the form `${data_stream.type}-${data_stream.dataset}-${data_stream.namespace}`. Otherwise, if
129-
`elasticsearch.index.prefix` or `elasticsearch.index.suffix` exist in attributes (precedence: resource attribute > scope attribute > log record attribute), they will be used to dynamically construct index name in the form `${elasticsearch.index.prefix}${logs_index}${elasticsearch.index.suffix}`. Otherwise, if scope name matches regex `/receiver/(\w*receiver)`, `data_stream.dataset` will be capture group #1. Otherwise, the index name falls back to `logs-generic-default`, and `logs_index` config will be ignored. Except for prefix/suffix attribute presence, the resulting docs will contain the corresponding `data_stream.*` fields, see restrictions applied to [Data Stream Fields](https://www.elastic.co/guide/en/ecs/current/ecs-data_stream.html).
139+
- `enabled`(DEPRECATED): No-op. Documents are now always routed dynamically unless `logs_index` is not empty. Will be removed in a future version.
130140

131-
- `metrics_index` (optional): The [index] or [data stream] name to publish metrics to. The default value is `metrics-generic-default`.
132-
⚠️ Note that metrics support is currently in development.
141+
- `metrics_index` (optional): The [index] or [data stream] name to publish metrics to. `metrics_index` should be empty unless all metrics should be sent to the same index. Note that metrics support is currently in development.
133142

134143
- `metrics_dynamic_index` (optional): uses resource, scope or data point attributes to dynamically construct index name.
135-
⚠️ Note that metrics support is currently in development.
136-
- `enabled`(default=true): Enable/disable dynamic index for metrics. If `data_stream.dataset` or `data_stream.namespace` exist in attributes (precedence: data point attribute > scope attribute > resource attribute), they will be used to dynamically construct index name in the form `metrics-${data_stream.dataset}-${data_stream.namespace}`. Otherwise, if
137-
`elasticsearch.index.prefix` or `elasticsearch.index.suffix` exist in attributes (precedence: resource attribute > scope attribute > data point attribute), they will be used to dynamically construct index name in the form `${elasticsearch.index.prefix}${metrics_index}${elasticsearch.index.suffix}`. Otherwise, if scope name matches regex `/receiver/(\w*receiver)`, `data_stream.dataset` will be capture group #1. Otherwise, the index name falls back to `metrics-generic-default`, and `metrics_index` config will be ignored. Except for prefix/suffix attribute presence, the resulting docs will contain the corresponding `data_stream.*` fields, see restrictions applied to [Data Stream Fields](https://www.elastic.co/guide/en/ecs/current/ecs-data_stream.html).
144+
- `enabled`(DEPRECATED): No-op. Documents are now always routed dynamically unless `metrics_index` is not empty. Will be removed in a future version.
138145

139-
- `traces_index`: The [index] or [data stream] name to publish traces to. The default value is `traces-generic-default`.
146+
- `traces_index` (optional): The [index] or [data stream] name to publish traces to. `traces_index` should be empty unless all traces should be sent to the same index.
140147

141148
- `traces_dynamic_index` (optional): uses resource, scope, or span attributes to dynamically construct index name.
142-
- `enabled`(default=false): Enable/Disable dynamic index for trace spans. If `data_stream.dataset` or `data_stream.namespace` exist in attributes (precedence: span attribute > scope attribute > resource attribute), they will be used to dynamically construct index name in the form `traces-${data_stream.dataset}-${data_stream.namespace}`. Otherwise, if
143-
`elasticsearch.index.prefix` or `elasticsearch.index.suffix` exist in attributes (precedence: resource attribute > scope attribute > span attribute), they will be used to dynamically construct index name in the form `${elasticsearch.index.prefix}${traces_index}${elasticsearch.index.suffix}`. Otherwise, if scope name matches regex `/receiver/(\w*receiver)`, `data_stream.dataset` will be capture group #1. Otherwise, the index name falls back to `traces-generic-default`, and `traces_index` config will be ignored. Except for prefix/suffix attribute presence, the resulting docs will contain the corresponding `data_stream.*` fields, see restrictions applied to [Data Stream Fields](https://www.elastic.co/guide/en/ecs/current/ecs-data_stream.html). There is an exception for span events under OTel mapping mode (`mapping::mode: otel`), where span event attributes instead of span attributes are considered, and `data_stream.type` is always `logs` instead of `traces` such that documents are routed to `logs-${data_stream.dataset}-${data_stream.namespace}`.
149+
- `enabled`(DEPRECATED): No-op. Documents are now always routed dynamically unless `traces_index` is not empty. Will be removed in a future version.
144150

145151
- `logstash_format` (optional): Logstash format compatibility. Logs, metrics and traces can be written into an index in Logstash format.
146-
- `enabled`(default=false): Enable/disable Logstash format compatibility. When `logstash_format.enabled` is `true`, the index name is composed using `(logs|metrics|traces)_index` or `(logs|metrics|traces)_dynamic_index` as prefix and the date as suffix,
147-
e.g: If `logs_index` or `logs_dynamic_index` is equal to `logs-generic-default`, your index will become `logs-generic-default-YYYY.MM.DD`.
152+
- `enabled`(default=false): Enable/disable Logstash format compatibility. When `logstash_format::enabled` is `true`, the index name is composed using the above dynamic routing rules as prefix and the date as suffix,
153+
e.g: If the computed index name is `logs-generic-default`, the resulting index will be `logs-generic-default-YYYY.MM.DD`.
148154
The last string appended belongs to the date when the data is being generated.
149155
- `prefix_separator`(default=`-`): Set a separator between logstash_prefix and date.
150156
- `date_format`(default=`%Y.%m.%d`): Time format (based on strftime) to generate the second part of the Index name.
@@ -154,6 +160,19 @@ This can be customised through the following settings:
154160

155161

156162

163+
#### Document routing exceptions for OTel data mode
164+
165+
In OTel mapping mode (`mapping::mode: otel`), there is special handling in addition to the above document routing rules in [Elasticsearch document routing](#elasticsearch-document-routing).
166+
The order to determine the routing mode is the same as [Elasticsearch document routing](#elasticsearch-document-routing).
167+
168+
1. "Static mode": Span events are separate documents routed to `logs_index` if non-empty.
169+
2. "Dynamic - Index attribute mode": Span events are separate documents routed using attribute `elasticsearch.index` (precedence: span event attribute > scope attribute > resource attribute) if the attribute exists.
170+
3. "Dynamic - Data stream routing mode":
171+
- For all documents, `data_stream.dataset` will always be appended with `.otel`.
172+
- As a special case of (3)(1) in [Elasticsearch document routing](#elasticsearch-document-routing), span events are separate documents that have `data_stream.type: logs` and are routed using data stream attributes (precedence: span event attribute > scope attribute > resource attribute)
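
The two data stream adjustments in OTel mapping mode can be pictured with a small sketch. `otelDataStream` is a hypothetical helper that assumes the dataset and namespace have already been resolved from attributes or fallbacks. It only shows the `.otel` suffix and the span event type override, not the exporter's actual code.

```go
package main

import "fmt"

// otelDataStream builds the data stream name used in OTel mapping mode.
// Span events are always typed as "logs"; every dataset gets ".otel" appended.
func otelDataStream(dsType, dataset, namespace string, isSpanEvent bool) string {
	if isSpanEvent {
		dsType = "logs"
	}
	return dsType + "-" + dataset + ".otel-" + namespace
}

func main() {
	fmt.Println(otelDataStream("traces", "generic", "default", false)) // prints "traces-generic.otel-default"
	fmt.Println(otelDataStream("traces", "generic", "default", true))  // prints "logs-generic.otel-default"
}
```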
173+
174+
175+
157176
### Elasticsearch document mapping
158177

159178
The Elasticsearch exporter supports several document schemas and preprocessing
@@ -198,7 +217,9 @@ scheme that maps these as `constant_keyword` fields.
198217
`data_stream.dataset` will always be appended with `.otel` if [dynamic data stream routing mode](#elasticsearch-document-routing) is active.
199218

200219
Span events are stored in separate documents. They will be routed with `data_stream.type` set to
201-
`logs` if `traces_dynamic_index::enabled` is `true`.
220+
`logs` if [dynamic data stream routing mode](#elasticsearch-document-routing) is active.
221+
222+
Attribute `elasticsearch.index` will be removed from the final document if it exists.
202223

203224
| Signal | Supported |
204225
| --------- | ------------------ |

exporter/elasticsearchexporter/attribute.go

-2
@@ -7,8 +7,6 @@ import "go.opentelemetry.io/collector/pdata/pcommon"
77

88
// dynamic index attribute key constants
99
const (
10-
indexPrefix = "elasticsearch.index.prefix"
11-
indexSuffix = "elasticsearch.index.suffix"
1210
defaultDataStreamDataset = "generic"
1311
defaultDataStreamNamespace = "default"
1412
defaultDataStreamTypeLogs = "logs"

exporter/elasticsearchexporter/config.go

+31-9
@@ -40,19 +40,19 @@ type Config struct {
4040
// NumWorkers configures the number of workers publishing bulk requests.
4141
NumWorkers int `mapstructure:"num_workers"`
4242

43-
// This setting is required when logging pipelines used.
44-
LogsIndex string `mapstructure:"logs_index"`
45-
// fall back to pure LogsIndex, if 'elasticsearch.index.prefix' or 'elasticsearch.index.suffix' are not found in resource or attribute (prio: resource > attribute)
43+
// LogsIndex configures the static index used for document routing for logs.
44+
// It should be empty if dynamic document routing is preferred.
45+
LogsIndex string `mapstructure:"logs_index"`
4646
LogsDynamicIndex DynamicIndexSetting `mapstructure:"logs_dynamic_index"`
4747

48-
// This setting is required when the exporter is used in a metrics pipeline.
49-
MetricsIndex string `mapstructure:"metrics_index"`
50-
// fall back to pure MetricsIndex, if 'elasticsearch.index.prefix' or 'elasticsearch.index.suffix' are not found in resource attributes
48+
// MetricsIndex configures the static index used for document routing for metrics.
49+
// It should be empty if dynamic document routing is preferred.
50+
MetricsIndex string `mapstructure:"metrics_index"`
5151
MetricsDynamicIndex DynamicIndexSetting `mapstructure:"metrics_dynamic_index"`
5252

53-
// This setting is required when traces pipelines used.
54-
TracesIndex string `mapstructure:"traces_index"`
55-
// fall back to pure TracesIndex, if 'elasticsearch.index.prefix' or 'elasticsearch.index.suffix' are not found in resource or attribute (prio: resource > attribute)
53+
// TracesIndex configures the static index used for document routing for traces.
54+
// It should be empty if dynamic document routing is preferred.
55+
TracesIndex string `mapstructure:"traces_index"`
5656
TracesDynamicIndex DynamicIndexSetting `mapstructure:"traces_dynamic_index"`
5757

5858
// LogsDynamicID configures whether log record attribute `elasticsearch.document_id` is set as the document ID in ES.
@@ -121,6 +121,9 @@ type LogstashFormatSettings struct {
121121
}
122122

123123
type DynamicIndexSetting struct {
124+
// Enabled enables dynamic index routing.
125+
//
126+
// Deprecated: [v0.122.0] This config is now ignored. Dynamic index routing is always done by default.
124127
Enabled bool `mapstructure:"enabled"`
125128
}
126129

@@ -288,6 +291,16 @@ func (cfg *Config) Validate() error {
288291
return errors.New("retry::max_retries should be non-negative")
289292
}
290293

294+
if cfg.LogsIndex != "" && cfg.LogsDynamicIndex.Enabled {
295+
return errors.New("must not specify both logs_index and logs_dynamic_index; logs_index should be empty unless all documents should be sent to the same index")
296+
}
297+
if cfg.MetricsIndex != "" && cfg.MetricsDynamicIndex.Enabled {
298+
return errors.New("must not specify both metrics_index and metrics_dynamic_index; metrics_index should be empty unless all documents should be sent to the same index")
299+
}
300+
if cfg.TracesIndex != "" && cfg.TracesDynamicIndex.Enabled {
301+
return errors.New("must not specify both traces_index and traces_dynamic_index; traces_index should be empty unless all documents should be sent to the same index")
302+
}
303+
291304
return nil
292305
}
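
The three new checks in `Validate` share one rule: a static index and the deprecated `*_dynamic_index::enabled` flag must not be set together. The standalone sketch below restates that rule in table-driven form. `signalConfig` and `validateRouting` are hypothetical names introduced for illustration and are not part of the exporter, and the error text is shortened.

```go
package main

import "fmt"

// signalConfig is a hypothetical, reduced view of one signal's routing settings.
type signalConfig struct {
	name           string // "logs", "metrics" or "traces"
	index          string // value of <name>_index
	dynamicEnabled bool   // value of <name>_dynamic_index::enabled
}

// validateRouting rejects configs that set both a static index and the
// deprecated dynamic index flag for the same signal.
func validateRouting(signals []signalConfig) error {
	for _, s := range signals {
		if s.index != "" && s.dynamicEnabled {
			return fmt.Errorf("must not specify both %s_index and %s_dynamic_index", s.name, s.name)
		}
	}
	return nil
}

func main() {
	err := validateRouting([]signalConfig{
		{name: "logs", index: "my-logs", dynamicEnabled: true},
	})
	fmt.Println(err) // prints "must not specify both logs_index and logs_dynamic_index"
}
```

Setting only the static index, or only the deprecated flag, passes validation. In the latter case the commit logs a deprecation warning instead of failing.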
293306

@@ -397,4 +410,13 @@ func handleDeprecatedConfig(cfg *Config, logger *zap.Logger) {
397410
// Do not set cfg.Retry.Enabled = false if cfg.Retry.MaxRequest = 1 to avoid breaking change on behavior
398411
logger.Warn("retry::max_requests has been deprecated, and will be removed in a future version. Use retry::max_retries instead.")
399412
}
413+
if cfg.LogsDynamicIndex.Enabled {
414+
logger.Warn("logs_dynamic_index::enabled has been deprecated, and will be removed in a future version. It is now a no-op. Dynamic document routing is now the default. See Elasticsearch Exporter README.")
415+
}
416+
if cfg.MetricsDynamicIndex.Enabled {
417+
logger.Warn("metrics_dynamic_index::enabled has been deprecated, and will be removed in a future version. It is now a no-op. Dynamic document routing is now the default. See Elasticsearch Exporter README.")
418+
}
419+
if cfg.TracesDynamicIndex.Enabled {
420+
logger.Warn("traces_dynamic_index::enabled has been deprecated, and will be removed in a future version. It is now a no-op. Dynamic document routing is now the default. See Elasticsearch Exporter README.")
421+
}
400422
}

exporter/elasticsearchexporter/config_test.go

+3-9
@@ -63,13 +63,11 @@ func TestConfig(t *testing.T) {
6363
QueueSize: exporterhelper.NewDefaultQueueConfig().QueueSize,
6464
},
6565
Endpoints: []string{"https://elastic.example.com:9200"},
66-
LogsIndex: "logs-generic-default",
6766
LogsDynamicIndex: DynamicIndexSetting{
6867
Enabled: false,
6968
},
70-
MetricsIndex: "metrics-generic-default",
7169
MetricsDynamicIndex: DynamicIndexSetting{
72-
Enabled: true,
70+
Enabled: false,
7371
},
7472
TracesIndex: "trace_index",
7573
TracesDynamicIndex: DynamicIndexSetting{
@@ -145,11 +143,9 @@ func TestConfig(t *testing.T) {
145143
LogsDynamicIndex: DynamicIndexSetting{
146144
Enabled: false,
147145
},
148-
MetricsIndex: "metrics-generic-default",
149146
MetricsDynamicIndex: DynamicIndexSetting{
150-
Enabled: true,
147+
Enabled: false,
151148
},
152-
TracesIndex: "traces-generic-default",
153149
TracesDynamicIndex: DynamicIndexSetting{
154150
Enabled: false,
155151
},
@@ -219,15 +215,13 @@ func TestConfig(t *testing.T) {
219215
QueueSize: exporterhelper.NewDefaultQueueConfig().QueueSize,
220216
},
221217
Endpoints: []string{"http://localhost:9200"},
222-
LogsIndex: "logs-generic-default",
223218
LogsDynamicIndex: DynamicIndexSetting{
224219
Enabled: false,
225220
},
226221
MetricsIndex: "my_metric_index",
227222
MetricsDynamicIndex: DynamicIndexSetting{
228-
Enabled: true,
223+
Enabled: false,
229224
},
230-
TracesIndex: "traces-generic-default",
231225
TracesDynamicIndex: DynamicIndexSetting{
232226
Enabled: false,
233227
},
