kube_pod_status_reason is 0 for all reasons #2612

dshackith · 2025-02-18T15:41:48Z

What happened:
The metric kube_pod_status_reason shows 0 for all reasons, even when some reasons should have a value of 1.

What you expected to happen:
We use Karpenter in our clusters and expect to be able to see when pods change status because of actions Karpenter takes. In particular, we expect the Evicted, NodeLost, and Shutdown reasons to show a value of 1 in clusters where consolidation is happening all the time (consolidateAfter value is 5m0s). Our Karpenter metrics show that at any given time some pod is being moved, so it should show up with a kube_pod_status_reason of Evicted with a value of 1.

How to reproduce it (as minimally and precisely as possible):
This Prometheus query: sum(kube_pod_status_reason) by (reason) shows 0 for every reason, and when charted, those values remain the same over any time interval.

Anything else we need to know?:
The kube_pod_status_phase metric does not give us the information we need (specific reasons for status), and no other metric claims to provide this.

Environment:
Running KSM v2.13 managed via Helm chart
EKS v1.32.2
Karpenter v1.2.0

dshackith · 2025-02-18T15:49:09Z

See also these issues where it was raised, but not resolved:
#2116
#1843

konstantindobroliubov · 2025-02-19T12:32:48Z

Makes sense to mention the version of kube-state-metrics that you used.
I see the same with one of the recent versions. Upgrading to the latest release to be 100% sure.

dshackith · 2025-02-19T13:34:02Z

Makes sense to mention the version of kube-state-metrics that you used.

Running KSM v2.13 managed via Helm chart

konstantindobroliubov · 2025-02-19T14:31:02Z

Running KSM v2.13 managed via Helm chart

Sorry, I'm blind. Didn't correlate the KSM followed by EKS with "kube-state-metrics".
Tried with 2.14. The same result.
Manually evicted a few Pods by draining the Node where they were placed. There's an Event about eviction. Metric kube_pod_status_reason{} always returns 0 for all Pods.

mrueg · 2025-02-19T15:28:44Z

If it's 0 for all, then https://github.com/kubernetes/kube-state-metrics/blob/main/internal/store/pod.go#L1547 the comparison here might not be correct.
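To make the suspected failure mode concrete, here is a minimal sketch. It is not the kube-state-metrics code. It assumes the metric emits one series per known reason and sets it to 1 only when the pod-level status.reason string equals that reason. The podStatus type, the knownReasons list (taken from the reasons named in this issue), and reasonValues are all hypothetical stand-ins.

```go
package main

import "fmt"

// podStatus is a simplified, hypothetical stand-in for the Kubernetes pod status.
type podStatus struct {
	Reason string // pod-level status.reason; frequently empty
}

// knownReasons is an assumed list, limited to the reasons named in this issue.
var knownReasons = []string{"Evicted", "NodeLost", "Shutdown"}

// reasonValues returns one 0/1 value per known reason,
// looking only at the pod-level reason string.
func reasonValues(s podStatus) map[string]float64 {
	out := make(map[string]float64, len(knownReasons))
	for _, r := range knownReasons {
		if s.Reason == r {
			out[r] = 1
		} else {
			out[r] = 0
		}
	}
	return out
}

func main() {
	// An empty pod-level reason yields 0 for every series.
	fmt.Println(reasonValues(podStatus{}))
	// Only an exact match on the pod-level reason yields a 1.
	fmt.Println(reasonValues(podStatus{Reason: "Evicted"}))
}
```

Under that assumption, any pod whose pod-level reason is empty, or holds a string outside the list, reports 0 for every reason, which matches the behaviour described in this issue.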

dshackith · 2025-02-19T19:32:22Z

kubectl get pods  -o json | jq -r '.items[] | select(.status.conditions[]?.type == "DisruptionTarget") | "\(.metadata.name)\t\(.status.conditions[] | select(.type == "DisruptionTarget") | .type)\t\(.status.conditions[] | select(.type == "DisruptionTarget") | .reason)\t\(.status.conditions[] | select(.type == "DisruptionTarget") | .message)"'

art-aa-service-6fd747848f-4vczd	DisruptionTarget	EvictionByEvictionAPI	Eviction API: evicting

In the spec for the pod I don't see something like pod.status.terminated.reason or pod.status.reason. I do see an array of items in pod.status.conditions which includes a .type, .reason, and .message, and I do see pod.status.containerStatuses[].state.terminated.reason.
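As an illustration of the fields mentioned above, the following sketch collects reason strings from the three places named: the pod-level reason, status.conditions[].reason, and containerStatuses[].state.terminated.reason. The types are simplified, hypothetical mirrors of those fields, and collectReasons is a made-up helper. This is not the change proposed in any pull request.

```go
package main

import "fmt"

// condition mirrors the type/reason/message fields of an entry in status.conditions.
type condition struct {
	Type, Reason, Message string
}

// containerStatus is a hypothetical flattening of
// containerStatuses[].state.terminated.reason; empty when not terminated.
type containerStatus struct {
	TerminatedReason string
}

// podStatus is a simplified stand-in for the pod status object.
type podStatus struct {
	Reason            string
	Conditions        []condition
	ContainerStatuses []containerStatus
}

// collectReasons returns every non-empty reason found in the pod status,
// in discovery order and without duplicates.
func collectReasons(s podStatus) []string {
	seen := map[string]bool{}
	var out []string
	add := func(r string) {
		if r != "" && !seen[r] {
			seen[r] = true
			out = append(out, r)
		}
	}
	add(s.Reason)
	for _, c := range s.Conditions {
		add(c.Reason)
	}
	for _, cs := range s.ContainerStatuses {
		add(cs.TerminatedReason)
	}
	return out
}

func main() {
	// Values taken from the kubectl output above; the pod-level reason is empty.
	s := podStatus{
		Conditions: []condition{{
			Type:    "DisruptionTarget",
			Reason:  "EvictionByEvictionAPI",
			Message: "Eviction API: evicting",
		}},
	}
	fmt.Println(collectReasons(s))
}
```

Note that the condition reason here is EvictionByEvictionAPI, not Evicted, so even with such a lookup some mapping between condition reasons and the metric's reason labels would still have to be decided.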

richabanker · 2025-02-20T17:53:41Z

/triage accepted
/assign @mrueg

konstantindobroliubov · 2025-03-31T13:39:18Z

It's been more than a month since this was accepted in triage. Any updates?

mrueg · 2025-03-31T13:52:58Z

I've pretty much described where setting it to 0 is coming from, feel free to take a look into this and come up with a solution: #2612 (comment)

/help

k8s-ci-robot · 2025-03-31T13:53:01Z

@mrueg:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

Why are we solving this issue?
To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
Does this issue have zero to low barrier of entry?
How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

I've pretty much described what's needed to change here, feel free to take a look into this and come up with a solution: #2612 (comment)

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

carlosmorenokm1 · 2025-04-01T03:05:21Z

/assign

dshackith added the kind/bug Categorizes issue or PR as related to a bug. label Feb 18, 2025

k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Feb 18, 2025

k8s-ci-robot assigned mrueg Feb 20, 2025

k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 20, 2025

k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Mar 31, 2025

k8s-ci-robot assigned carlosmorenokm1 Apr 1, 2025

carlosmorenokm1 linked a pull request Apr 1, 2025 that will close this issue

fix: report correct reason in kube_pod_status_reason metric #2644

Open
