-
Notifications
You must be signed in to change notification settings - Fork 2.7k
[receiver/kubeletstats] Collect network metrics from all interfaces #34287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
535a90d
to
d14d4fc
Compare
Question if this shouldn't be marked as breaking change? If you relied previously on this metric in some dashboard / query then I think it will break, since it will now return time series per interface? |
Hmm 🤔 , I think it shouldn't be an issue since the |
for i := range s.Interfaces { | ||
recordNetworkDataPoint(mb, networkMetrics.IO, &s.Interfaces[i], getNetworkIO, currentTime) | ||
recordNetworkDataPoint(mb, networkMetrics.Errors, &s.Interfaces[i], getNetworkErrors, currentTime) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If s.Interfaces
contains more than just eth0
, then this will cause the receiver to produce more datapoints. While the interface
attribute was always present, it has historically only ever had one value. Now it will have more, so the cardinality of the metric effectively increases as does the amount of datapoints it produces. This increase in data may be unexpected for new users.
I believe we need to put this behind a feature gate or a config option so that users are not surprised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thank's for the context. A feature gate sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking of it again, I realize that maybe this change cannot be a drop-in replacement.
Some users might already be happy with the default network metrics, so adding metrics for all the interfaces by-default might be problematic.
I think making this functionality configurable makes more sense (compared to going with a feature flag).
So instead of the feature flag I think introducing 2 new settings would provide better flexibility here:
add_node_interfaces_metrics: true|false
add_pod_interface_metrics: true|false
or
add_interfaces_metrics:
node: true|false
pod: true|false
The default behavior will continue to be as is today, collecting only metrics for the default network interface. By enabling the "interface_metrics" settings users will be explicitly making the receiver to include the additional metrics for all the interfaces.
Also providing the node/pod level granularity will be useful since users might want to avoid collecting all Pod's interfaces metrics when Pods are in hostNetwork: true
mode (as mentioned at #30196 (comment)).
@TylerHelmuth @povilasv let me know what you think here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just add feature flag, if people don't want additional interfaces they can write some OTTL to drop unneeded ones / keep only default one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok! My only concern is about the hostNetwork
case but eventually if a Pod is plugged to the host network it makes sense to observe "its" traffic anyways.
I have put the change behind a feature flag. Let me know you think folks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
957a0b3
to
6faabb4
Compare
Signed-off-by: ChrsMark <[email protected]> Co-authored-by: jinja2 <[email protected]>
5486580
to
17f7b65
Compare
Signed-off-by: ChrsMark <[email protected]>
17f7b65
to
fea37cb
Compare
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
/unstale feature gate added in fea37cb |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Please resolve conflicts? |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
…38737) #### Description Revives #34287 by picking it's changes. Also changed changed approach from feature gate to configuration parameters (disabled by default). In context of network metrics - default behavior of component hasn't changed, so I consider it as a non-breaking change. #### Link to tracking issue Fixes #30196 #### Testing Unit test were updated to cover new functionality #### Documentation Documentation updated to cover new opt-in functionality Original authors are included in the commit: Co-authored-by: ChrsMark <[email protected]> Co-authored-by: jinja2 <[email protected]> --------- Co-authored-by: Christos Markou <[email protected]> Co-authored-by: Jina Jain <[email protected]>
Description:
Revives #30626 by picking its changes (I have included the original author in the commit as suggested at #30626 (comment)).
Link to tracking Issue: Fixes #30196 #33993
Testing: Updated
Documentation: ~
Signed-off-by: ChrsMark [email protected]
Co-authored-by: jinja2 [email protected]