-
Notifications
You must be signed in to change notification settings - Fork 2.7k
feat(dockerstats): add online cpu and memory failcnt metrics #31481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
3375d41
391b782
d10a3bd
ebcd8ab
700018a
50f47cd
e58ab49
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# Use this changelog template to create an entry for release notes. | ||
|
||
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' | ||
change_type: enhancement | ||
|
||
# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver) | ||
component: dockerstatsreceiver | ||
|
||
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). | ||
note: add metrics for online CPU count and memory fails count | ||
|
||
# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. | ||
issues: [31366] | ||
|
||
# (Optional) One or more lines of additional information to render under the primary note. | ||
# These lines will be padded with 2 spaces and then inserted directly into the document. | ||
# Use pipe (|) for multiline entries. | ||
subtext: | ||
|
||
# If your change doesn't affect end users or the exported elements of any package, | ||
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label. | ||
# Optional: The change log or logs in which this entry should be included. | ||
# e.g. '[user]' or '[user, api]' | ||
# Include 'user' if the change is relevant to end users. | ||
# Include 'api' if there is a change to a library API. | ||
# Default: '[user]' | ||
change_logs: [user] |
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any other documentation available from docker? It's hard to understand what this metric is needed for from this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find any documentation from Docker, only some references from the internet that says how people interpret this:
"The number of real used CPUs is given by cpu_stats.online_cpus (4 in our case)." - influxdata/telegraf#2964
"The online_cpus value represents the number of CPU core on the host machine." - https://cratedb.com/blog/analyzing-docker-container-performance-native-tools
Even the source code doesn't say much: https://github.com/moby/moby/blob/460b4aebdf81512a3ea37c2a65826b890427537d/api/types/stats.go#L50
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Digging deeper into the code, this is the "cpu_num" return value here: https://github.com/moby/moby/blob/460b4aebdf81512a3ea37c2a65826b890427537d/daemon/stats_unix.go#L321
If we go to the
man
page in the comment, we can see that the command lists cpus with a number likecpu0
,cpu1
etc. and Docker just sums number of these different values.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, it seems like just a number of cores available to the container. We have
system.cpu.logical.count
for this purpose on the host, see https://github.com/open-telemetry/semantic-conventions/blob/6c79a1d9107c1833d95177beda84947ce1902792/model/metrics/system-metrics.yaml#L90C18-L90C42. Should we call itcontainer.cpu.logical.count
? I think online here just adds confusion. cc @open-telemetry/semconv-container-approversThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would make sense to introduce it as
container.cpu.logical_count
container.cpu.logical.count
to be aligned withsystem.*
, as per our discussions at open-telemetry/semantic-conventions#282 (comment).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the name to
container.cpu.logical_count
as the suggestion said.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah just checked again the system.* and it seems that in order to be aligned with that it should becontainer.cpu.logical.count
(fromsystem.cpu.logical.count
), right? @dmitryax wdyt?@aboguszewski-sumo my previous comment was wrong, you can ignore it. The correct name we should use here is the
container.cpu.logical.count
.The
container.cpu.logical_number
should be sth different similar to thesystem.cpu.logical_number
. It is the "core ID" from which the metrics are emitted. Sorry for the confusion here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed, thanks for responding quickly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmitryax could you rereview? 🙏