Add oomkilled reason #8709

Merged
tekton-robot merged 1 commit into tektoncd:main
May 20, 2025

Conversation

@infernus01
Contributor

infernus01 commented Apr 15, 2025

Changes

This PR improves the error reporting for TaskRuns that fail due to Out of Memory (OOM) conditions. The changes modify the extractContainerFailureMessage function in pkg/pod/status.go to include the termination reason in the failure message when a container is OOMKilled.
Fixes Jira issue SRVKP-7343.
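
For readers unfamiliar with the change, the behavior described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual code in pkg/pod/status.go: the `terminatedState` type and the `failureMessage` helper are hypothetical stand-ins (the real function works on Kubernetes container status types), and the message format is taken from the updated test expectation in this PR.

```go
package main

import "fmt"

// terminatedState is a hypothetical stand-in for the fields of a
// terminated container state that matter for this sketch.
type terminatedState struct {
	ExitCode int32
	Reason   string
}

// reasonOOMKilled is the termination reason reported when a container
// is killed for exceeding its memory limit.
const reasonOOMKilled = "OOMKilled"

// failureMessage is an illustrative helper (an assumption, not Tekton's
// extractContainerFailureMessage). It builds the failure message for a
// container and appends the termination reason when it is OOMKilled.
func failureMessage(containerName string, term *terminatedState) string {
	if term == nil || term.ExitCode == 0 {
		// Container is still running or exited cleanly: nothing to report.
		return ""
	}
	msg := fmt.Sprintf("%q exited with code %d", containerName, term.ExitCode)
	if term.Reason == reasonOOMKilled {
		msg += ": " + term.Reason
	}
	return msg
}

func main() {
	fmt.Println(failureMessage("step-one", &terminatedState{ExitCode: 137, Reason: reasonOOMKilled}))
	fmt.Println(failureMessage("step-one", &terminatedState{ExitCode: 1, Reason: "Error"}))
}
```

With these assumptions, an OOM-killed step yields `"step-one" exited with code 137: OOMKilled`, while other non-zero exits keep the previous message shape.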

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Has Docs if any changes are user facing, including updates to minimum requirements e.g. Kubernetes version bumps
  • Has Tests included if any functionality added or changed
  • pre-commit Passed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including functionality, content, code)
  • Has a kind label. You can add one by adding a comment on this PR that contains /kind bug.
  • Release notes contains the string "action required" if the change requires additional action from users switching to the new release
  • Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings). See some examples of good release notes.

Release Notes

TaskRuns that fail due to Out of Memory (OOM) conditions will now show the termination reason in their failure message.

@tekton-robot added the release-note and size/XS labels Apr 15, 2025
@infernus01
Contributor Author

/cc @aThorp96

@tekton-robot
Collaborator

@infernus01: GitHub didn't allow me to request PR reviews from the following users: aThorp96.

Note that only tektoncd members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @aThorp96

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@infernus01
Contributor Author

/cc @PuneetPunamiya

@tekton-robot
Collaborator

@infernus01: GitHub didn't allow me to request PR reviews from the following users: PuneetPunamiya.

Note that only tektoncd members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @PuneetPunamiya

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@aThorp96
Contributor

aThorp96 commented May 9, 2025

/kind bug

@tekton-robot added the kind/bug label May 9, 2025
@aThorp96
Contributor

aThorp96 commented May 9, 2025

/cc @waveywaves

@tekton-robot requested a review from waveywaves May 9, 2025 19:47
@waveywaves
Member

/ok-to-test

@tekton-robot added the ok-to-test label May 12, 2025
@waveywaves
Member

/retest

@waveywaves
Member

@infernus01 a bunch of tests are failing, if you could just check that would be great

@infernus01
Contributor Author

/retest

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/pod/status.go 92.2% 92.3% 0.0

@@ -1468,7 +1468,7 @@ func TestMakeTaskRunStatus(t *testing.T) {
 }},
 },
 want: v1.TaskRunStatus{
-Status: statusFailure(v1.TaskRunReasonFailed.String(), "\"step-one\" exited with code 137"),
+Status: statusFailure(v1.TaskRunReasonFailed.String(), "\"step-one\" exited with code 137: OOMKilled"),
Contributor

Welp, this would explain why the CI was failing haha

@infernus01
Contributor Author

/retest

@waveywaves
Member

/retest

@aThorp96
Contributor

/lgtm

@tekton-robot
Collaborator

@aThorp96: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot added the approved label May 20, 2025
@vdemeester
Member

vdemeester left a comment

/lgtm

@tekton-robot added the lgtm label May 20, 2025
@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aThorp96, vdemeester, waveywaves

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [vdemeester,waveywaves]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot merged commit 3287216 into tektoncd:main May 20, 2025
20 checks passed
@waveywaves
Member

@infernus01 one thing I forgot to enforce here is the commit message standard, please remember to add proper commit messages in the next one

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • kind/bug: Categorizes issue or PR as related to a bug.
  • lgtm: Indicates that a PR is ready to be merged.
  • ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • size/XS: Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
Status: Done
5 participants