Append new pod conditions when deleting pods to indicate the reason for pod deletion #110959

Merged

Conversation

mimowo
Contributor

@mimowo mimowo commented Jul 5, 2022

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR appends a pod condition of a new, dedicated type, `DisruptionTarget`, to a pod when it is deleted.
Its reason field indicates the reason for pod termination:

  • PreemptionByKubeScheduler (Pod preempted by kube-scheduler)
  • DeletionByTaintManager (Pod deleted by taint manager due to NoExecute taint)
  • EvictionByEvictionAPI (Pod evicted by Eviction API)
  • DeletionByPodGC (an orphaned Pod deleted by PodGC)
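As a rough illustration of the behavior described above, the sketch below appends a `DisruptionTarget` condition to a list of pod conditions, or overwrites it if one is already present. It is not the code from this PR: `PodCondition` here is a simplified stand-in for the Kubernetes type, `setCondition` is a hypothetical helper, and setting `Status` to `"True"` is an assumption. Only the condition type and the four reason strings come from the description.

```go
package main

import "fmt"

// PodCondition is a simplified, hypothetical stand-in for the Kubernetes
// pod condition type; it is not the real API struct.
type PodCondition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// Condition type and reasons as named in the PR description.
const (
	DisruptionTarget = "DisruptionTarget"

	ReasonPreemptionByKubeScheduler = "PreemptionByKubeScheduler"
	ReasonDeletionByTaintManager    = "DeletionByTaintManager"
	ReasonEvictionByEvictionAPI     = "EvictionByEvictionAPI"
	ReasonDeletionByPodGC           = "DeletionByPodGC"
)

// setCondition (hypothetical helper) overwrites an existing condition of the
// same type, or appends cond when no such condition exists yet.
func setCondition(conds []PodCondition, cond PodCondition) []PodCondition {
	for i := range conds {
		if conds[i].Type == cond.Type {
			conds[i] = cond
			return conds
		}
	}
	return append(conds, cond)
}

func main() {
	conds := []PodCondition{{Type: "Ready", Status: "True"}}

	// Record why the pod is about to be deleted (Status "True" is assumed).
	conds = setCondition(conds, PodCondition{
		Type:    DisruptionTarget,
		Status:  "True",
		Reason:  ReasonDeletionByTaintManager,
		Message: "node has a NoExecute taint the pod does not tolerate",
	})

	for _, c := range conds {
		fmt.Printf("%s=%s reason=%q\n", c.Type, c.Status, c.Reason)
	}
}
```

Overwriting by type keeps at most one `DisruptionTarget` entry per pod in this sketch, so a later disruption replaces the earlier reason instead of accumulating duplicates.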

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Yes, it appends new pod conditions when deleting a pod.

Introduction of the `DisruptionTarget` pod condition type. Its `reason` field indicates the reason for pod termination:
- PreemptionByKubeScheduler (Pod preempted by kube-scheduler)
- DeletionByTaintManager (Pod deleted by taint manager due to NoExecute taint)
- EvictionByEvictionAPI (Pod evicted by Eviction API)
- DeletionByPodGC (an orphaned Pod deleted by PodGC)
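From a consumer's point of view, the release note above implies that a client can look for the `DisruptionTarget` condition to learn why a pod was terminated. The following is a minimal sketch of such a lookup, not part of this PR: `PodCondition` is a simplified stand-in for the Kubernetes type, `disruptionReason` is a hypothetical helper, and treating only `Status == "True"` as a recorded disruption is an assumption.

```go
package main

import "fmt"

// PodCondition is a simplified, hypothetical stand-in for the Kubernetes
// pod condition type; it is not the real API struct.
type PodCondition struct {
	Type   string
	Status string
	Reason string
}

// disruptionReason (hypothetical helper) returns the reason carried by a
// DisruptionTarget condition whose status is "True", and false if the pod
// has no such condition.
func disruptionReason(conds []PodCondition) (string, bool) {
	for _, c := range conds {
		if c.Type == "DisruptionTarget" && c.Status == "True" {
			return c.Reason, true
		}
	}
	return "", false
}

func main() {
	conds := []PodCondition{
		{Type: "Ready", Status: "False"},
		{Type: "DisruptionTarget", Status: "True", Reason: "EvictionByEvictionAPI"},
	}

	if reason, ok := disruptionReason(conds); ok {
		fmt.Println("pod was disrupted:", reason)
	} else {
		fmt.Println("no disruption recorded")
	}
}
```

A caller could branch on the returned reason, for example to treat a preemption or eviction differently from an application failure.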

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 5, 2022
@k8s-ci-robot
Contributor

Hi @mimowo. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. labels Jul 5, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 5, 2022
@alculquicondor
Member

/cc

pkg/apis/core/types.go (outdated, resolved)
pkg/scheduler/util/utils.go (outdated, resolved)
pkg/scheduler/util/utils.go (outdated, resolved)
pkg/scheduler/util/utils.go (outdated, resolved)
pkg/controller/util/node/controller_utils.go (outdated, resolved)
pkg/controller/util/node/controller_utils.go (outdated, resolved)
pkg/controller/podgc/gc_controller.go (outdated, resolved)
pkg/controller/util/node/controller_utils.go (outdated, resolved)
pkg/registry/core/pod/storage/eviction.go (outdated, resolved)
@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Jul 6, 2022
@mimowo mimowo force-pushed the retriable-pod-failures-pod-conditions branch 3 times, most recently from 1bd702d to ea8da83 Compare July 6, 2022 14:04
pkg/api/v1/pod/util.go (outdated, resolved)
pkg/scheduler/util/utils.go (outdated, resolved)
pkg/controller/podgc/gc_controller.go (outdated, resolved)
pkg/controller/podgc/gc_controller.go (outdated, resolved)
pkg/controller/podgc/gc_controller.go (outdated, resolved)
pkg/controller/podgc/gc_controller.go (outdated, resolved)
pkg/scheduler/util/utils.go (outdated, resolved)
@mimowo mimowo force-pushed the retriable-pod-failures-pod-conditions branch from 3d37248 to 55654ec Compare July 7, 2022 13:26
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 8, 2022
@mimowo mimowo force-pushed the retriable-pod-failures-pod-conditions branch from 5579cc0 to 49f995c Compare July 8, 2022 13:49
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 2, 2022
@mimowo
Contributor Author

mimowo commented Aug 2, 2022

/retest

@mimowo mimowo force-pushed the retriable-pod-failures-pod-conditions branch from 51327eb to 00d23f7 Compare August 2, 2022 07:56
@mimowo mimowo force-pushed the retriable-pod-failures-pod-conditions branch from 00d23f7 to 63cd724 Compare August 2, 2022 08:03
@aojea
Member

aojea commented Aug 2, 2022

> I left the hold since I couldn't tell if @aojea's comments on the tests were blocking

lgtm

…on` field indicates the reason:

- PreemptionByKubeScheduler (Pod preempted by kube-scheduler)
- DeletionByTaintManager (Pod deleted by taint manager due to NoExecute taint)
- EvictionByEvictionAPI (Pod evicted by Eviction API)
- DeletionByPodGC (an orphaned Pod deleted by PodGC)
Contributor

@soltysh soltysh left a comment

/lgtm
/hold cancel
since Antonio gave his 👍 above

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Aug 2, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, liggitt, mimowo, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@alculquicondor
Member

/priority important-soon

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Aug 2, 2022
@k8s-ci-robot k8s-ci-robot merged commit fa202f1 into kubernetes:master Aug 2, 2022
SIG Node CI/Test Board automation moved this from PRs - Needs Approver to Done Aug 2, 2022
SIG Node PR Triage automation moved this from Needs Reviewer to Done Aug 2, 2022
@mimowo mimowo deleted the retriable-pod-failures-pod-conditions branch September 30, 2022 13:54
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: API review completed, 1.25
Archived in project