Skip to content

Commit a727b99

Browse files
trasklmolkova
andauthored
Parameterized query text does not need to be sanitized by default (open-telemetry#976)
Co-authored-by: Liudmila Molkova <[email protected]>
1 parent 5ed38b4 commit a727b99

File tree

5 files changed

+53
-16
lines changed

5 files changed

+53
-16
lines changed

.chloggen/976.yaml

+22
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
# Use this changelog template to create an entry for release notes.
2+
#
3+
# If your change doesn't affect end users you should instead start
4+
# your pull request title with [chore] or use the "Skip Changelog" label.
5+
6+
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
7+
change_type: enhancement
8+
9+
# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db)
10+
component: db
11+
12+
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
13+
note: Parameterized query text does not need to be sanitized by default
14+
15+
# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
16+
# The values here must be integers.
17+
issues: [ 976 ]
18+
19+
# (Optional) One or more lines of additional information to render under the primary note.
20+
# These lines will be padded with 2 spaces and then inserted directly into the document.
21+
# Use pipe (|) for multiline entries.
22+
subtext:

docs/database/database-spans.md

+11-8
Original file line numberDiff line numberDiff line change
@@ -91,11 +91,11 @@ These attributes will usually be the same for all operations performed over the
9191
| [`db.operation.name`](/docs/attributes-registry/db.md) | string | The name of the operation or command being executed. | `findAndModify`; `HMSET`; `SELECT` | `Conditionally Required` [4] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
9292
| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [5] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
9393
| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [6] | `80`; `8080`; `443` | `Conditionally Required` [7] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
94-
| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [8] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
95-
| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [9] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` If applicable for this database system. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
94+
| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [8] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [9] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
95+
| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [10] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` If applicable for this database system. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
9696
| [`network.peer.port`](/docs/attributes-registry/network.md) | int | Peer port number of the network connection. | `65123` | `Recommended` if and only if `network.peer.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
97-
| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [10] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
98-
| [`db.query.parameter.<key>`](/docs/attributes-registry/db.md) | string | The query parameters used in `db.query.text`, with `<key>` being the parameter name, and the attribute value being the parameter value. [11] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
97+
| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [11] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
98+
| [`db.query.parameter.<key>`](/docs/attributes-registry/db.md) | string | The query parameters used in `db.query.text`, with `<key>` being the parameter name, and the attribute value being the parameter value. [12] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
9999

100100
**[1]:** If the collection name is parsed from the query, it SHOULD match the value provided in the query and may be qualified with the schema and database name.
101101

@@ -112,14 +112,17 @@ Semantic conventions for individual database systems SHOULD document what `db.na
112112

113113
**[7]:** If using a port other than the default port for this DBMS and if `server.address` is set.
114114

115-
**[8]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information.
115+
**[8]:** Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk.
116116

117-
**[9]:** Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. Network peer address and port are useful when the application interacts with individual database nodes directly.
117+
**[9]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text.
118+
Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.<key>`](../../docs/attributes-registry/db.md)).
119+
120+
**[10]:** Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. Network peer address and port are useful when the application interacts with individual database nodes directly.
118121
If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used.
119122

120-
**[10]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
123+
**[11]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
121124

122-
**[11]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders.
125+
**[12]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders.
123126
If a parameter has no name and instead is referenced only by index, then `<key>` SHOULD be the 0-based index.
124127

125128
`db.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

docs/database/elasticsearch.md

+8-6
Original file line numberDiff line numberDiff line change
@@ -33,10 +33,10 @@ If the endpoint id is not available, the span name SHOULD be the `http.request.m
3333
| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [5] | `80`; `8080`; `443` | `Conditionally Required` [6] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
3434
| [`db.elasticsearch.cluster.name`](/docs/attributes-registry/db.md) | string | Represents the identifier of an Elasticsearch cluster. | `e9106fc68e3044f0b1475b04bf4ffd5f` | `Recommended` [7] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
3535
| [`db.elasticsearch.node.name`](/docs/attributes-registry/db.md) | string | Represents the human-readable identifier of the node/instance to which a request was routed. | `instance-0000000001` | `Recommended` [8] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
36-
| [`db.query.text`](/docs/attributes-registry/db.md) | string | The request body for a [search-type query](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html), as a json string. | `"{\"query\":{\"term\":{\"user.id\":\"kimchy\"}}}"` | `Recommended` [9] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
37-
| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [10] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
36+
| [`db.query.text`](/docs/attributes-registry/db.md) | string | The request body for a [search-type query](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html), as a json string. [9] | `"{\"query\":{\"term\":{\"user.id\":\"kimchy\"}}}"` | `Recommended` [10] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
37+
| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [11] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
3838
| [`network.peer.port`](/docs/attributes-registry/network.md) | int | Peer port number of the network connection. | `65123` | `Recommended` if and only if `network.peer.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
39-
| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [11] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
39+
| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [12] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
4040

4141
**[1]:** This SHOULD be the endpoint identifier for the request.
4242

@@ -69,11 +69,13 @@ Tracing instrumentations that do so, MUST also set `http.request.method_original
6969

7070
**[8]:** When communicating with an Elastic Cloud deployment, this should be collected from the "X-Found-Handling-Instance" HTTP response header.
7171

72-
**[9]:** Should be collected by default for search-type queries and only if there is sanitization that excludes sensitive information.
72+
**[9]:** Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk.
7373

74-
**[10]:** If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used.
74+
**[10]:** Should be collected by default for search-type queries and only if there is sanitization that excludes sensitive information.
7575

76-
**[11]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
76+
**[11]:** If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used.
77+
78+
**[12]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
7779

7880
`http.request.method` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
7981

docs/database/redis.md

+2-1
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,8 @@ For commands that switch the database, this SHOULD be set to the target database
2828

2929
**[2]:** For **Redis**, the value provided for `db.query.text` SHOULD correspond to the syntax of the Redis CLI. If, for example, the [`HMSET` command](https://redis.io/commands/hmset) is invoked, `"HMSET myhash field1 'Hello' field2 'World'"` would be a suitable value for `db.query.text`.
3030

31-
**[3]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information.
31+
**[3]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text.
32+
Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.<key>`](../../docs/attributes-registry/db.md)).
3233

3334
**[4]:** If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used.
3435
<!-- endsemconv -->

model/trace/database.yaml

+10-1
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,16 @@ groups:
77
- ref: db.query.text
88
requirement_level:
99
recommended: >
10-
SHOULD be collected by default only if there is sanitization that excludes sensitive information.
10+
Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes
11+
sensitive data, e.g. by redacting all literal values present in the query text.
12+
13+
Parameterized query text SHOULD be collected by default
14+
(the query parameter values themselves are opt-in,
15+
see [`db.query.parameter.<key>`](../../docs/attributes-registry/db.md)).
16+
note:
17+
Even though parameterized query text can potentially have sensitive data, by using a parameterized query
18+
the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit
19+
to observability of capturing the static part of the query text by default outweighs the risk.
1120
- ref: db.query.parameter
1221
requirement_level: opt_in
1322

0 commit comments

Comments
 (0)