kubelet: Mark ready condition as false explicitly for terminal pods #110256
Conversation
force-pushed from 3876e5e to 00c321c
/retest
force-pushed from 00c321c to ac975b4
/retest
force-pushed from ac975b4 to 44c16cd
/cc
Great find!
/kind bug
Thanks Tim! I agree with you: ideally clients should ignore terminal pods and, if the pod is terminal, ignore the ready condition in the first place. We can't guarantee that all clients will do that, though. However, at least pre-1.22 (before this regression was introduced), clients could assume that terminal pods would always report ready=false, which is no longer the case. This PR aims to revert to that behavior and ensure kubelet never reports an "invalid" status update of ready=true for a terminal pod.
Yup, this doesn't make it any worse for clients ignoring phases, but rather brings back the guarantee that terminal pods will always report a ready status of false.
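To make the intended invariant concrete, here is a minimal sketch of the override (not the actual patch; the helper name and reason string are illustrative), using the k8s.io/api types:

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// markTerminalPodNotReady forces the Ready and ContainersReady conditions
// to false once the pod phase is terminal (Succeeded or Failed), restoring
// the pre-1.22 guarantee discussed above. Hypothetical helper, not the
// actual kubelet code.
func markTerminalPodNotReady(status *v1.PodStatus) {
	if status.Phase != v1.PodSucceeded && status.Phase != v1.PodFailed {
		return
	}
	for i := range status.Conditions {
		t := status.Conditions[i].Type
		if t == v1.PodReady || t == v1.ContainersReady {
			status.Conditions[i].Status = v1.ConditionFalse
			status.Conditions[i].Reason = "PodCompleted" // illustrative reason string
		}
	}
}

func main() {
	status := v1.PodStatus{
		Phase: v1.PodFailed,
		Conditions: []v1.PodCondition{
			{Type: v1.PodReady, Status: v1.ConditionTrue},
			{Type: v1.ContainersReady, Status: v1.ConditionTrue},
		},
	}
	markTerminalPodNotReady(&status)
	fmt.Printf("%+v\n", status.Conditions) // both conditions are now False
}
```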
/approve
I'm a big fan of leaving future-me clues.
We should move directories around so that podutil OWNERs don't need API approvers.
@@ -69,6 +71,15 @@ func GenerateContainersReadyCondition(spec *v1.PodSpec, containerStatuses []v1.C
		}
	}

	// If the pod phase is failed, explicitly set the ready condition to false for containers since they may be in progress of terminating.
	if podPhase == v1.PodFailed {
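The hunk is truncated here; judging from the review below, this revision returned a not-ready condition inline, roughly like the following hypothetical reconstruction (later replaced by the shared helper):

```go
		// Hypothetical body of the truncated branch above, mirroring the
		// succeeded case; the reason string is a guess.
		return v1.PodCondition{
			Type:   v1.ContainersReady,
			Status: v1.ConditionFalse,
			Reason: "PodFailed",
		}
	}
```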
Why is this not also checking PodSucceeded? Should we leave a comment?
And why does this not call generateContainersReadyConditionForTerminalPhase(podPhase)?
Why is this not also checking PodSucceeded? Should we leave a comment?
PodSucceeded is already handled above on line 66, see here.
And why does this not call generateContainersReadyConditionForTerminalPhase(podPhase)?
I created that helper function to be used in status_manager.go so it can handle either phase. I returned the condition directly inline here since that's what's done in the succeeded case above, but I agree with you: we can switch it to use the helper function as well.
I updated based on your feedback and reused generateContainersReadyConditionForTerminalPhase for both the succeeded and failed cases, so the condition generation logic is in one place.
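Based on this thread, the shared helper presumably looks something like the following sketch (inferred from the discussion and the diff below; details such as the reason strings may differ from the merged code):

```go
package kubeletstatus // hypothetical package name, for illustration only

import v1 "k8s.io/api/core/v1"

// generateContainersReadyConditionForTerminalPhase, as sketched from this
// thread: for a terminal phase it always produces a ContainersReady
// condition with status false, with a phase-appropriate reason.
func generateContainersReadyConditionForTerminalPhase(podPhase v1.PodPhase) v1.PodCondition {
	condition := v1.PodCondition{
		Type:   v1.ContainersReady,
		Status: v1.ConditionFalse,
	}
	if podPhase == v1.PodSucceeded {
		condition.Reason = "PodCompleted" // assumed reason strings
	} else if podPhase == v1.PodFailed {
		condition.Reason = "PodFailed"
	}
	return condition
}
```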
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: bobbypage, dchen1107, mrunalp, thockin. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/hold to update based on feedback
Terminal pods may continue to report a ready condition of true because there is a delay in reconciling the ready condition of the containers from the runtime with the pod status. It should be invalid for kubelet to report a terminal phase with a true ready condition. To fix the issue, explicitly override the ready condition to false for terminal pods during status updates. Signed-off-by: David Porter <david@porter.me>
Use a watch to detect invalid pod status updates in graceful node shutdown node e2e test. By using a watch, all pod updates will be captured while the previous logic required polling the api-server which could miss some intermediate updates. Signed-off-by: David Porter <david@porter.me>
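As a sketch of the watch-based approach (assuming a standard client-go clientset; names are illustrative, not the actual e2e test code), the test can consume every status update from a watch channel and fail on the first terminal-phase event that still reports ready=true:

```go
package e2enode // illustrative package, not the real test file

import (
	"context"
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// watchForInvalidPodStatus consumes every pod update from a watch, so no
// intermediate status can be missed (unlike polling), and reports the first
// event in which a terminal pod still claims Ready=true.
func watchForInvalidPodStatus(ctx context.Context, client kubernetes.Interface, namespace string) error {
	w, err := client.CoreV1().Pods(namespace).Watch(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	defer w.Stop()
	for event := range w.ResultChan() {
		pod, ok := event.Object.(*v1.Pod)
		if !ok {
			continue
		}
		// Only terminal phases are subject to the ready=false invariant.
		if pod.Status.Phase != v1.PodFailed && pod.Status.Phase != v1.PodSucceeded {
			continue
		}
		for _, cond := range pod.Status.Conditions {
			if cond.Type == v1.PodReady && cond.Status == v1.ConditionTrue {
				return fmt.Errorf("pod %s: terminal phase %s with ready=true", pod.Name, pod.Status.Phase)
			}
		}
	}
	return nil
}
```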
force-pushed from d0cd673 to b4b338d
/unhold
Pushed an additional revision to fix suggestions here.
Assuming previous LGTM holds after minor changes: /lgtm
/retest
…10256-upstream-release-1.23 Automated cherry pick of #110256: kubelet: Mark ready condition as false explicitly for terminal pods
…10256-upstream-release-1.22 Automated cherry pick of #110256: kubelet: Mark ready condition as false explicitly for terminal pods
…10256-upstream-release-1.24 Automated cherry pick of #110256: kubelet: Mark ready condition as false explicitly for terminal pods
		Status: v1.ConditionFalse,
		Reason: PodCompleted,
	}
	return generateContainersReadyConditionForTerminalPhase(podPhase)
Going back and looking through this, the len(unknownContainers) == 0 above has me worried.
There are two phases in play - the apiserver phase, and the one the Kubelet tracks. Setting phase to terminal on the apiserver is roughly the same outcome as deleting the pod - the kubelet should inexorably converge the actual pod state to stopped. Likewise if the Kubelet observes a failure or evicts the pod, the phase should inexorably converge. However, internal to the Kubelet status loop the phase should match the actual observed state of the pod containers, and so we should never be in a terminal state here without these containers being in some state we recognize (I think).
This is something I want to have laid out in the pod lifecycle clarification KEP so we can look at all the factors and eliminate my "shoulds" from above - it might be that the correct status when succeeded + > 1 unknown container = pod ready = false here, always. But it's super subtle, and we need this sort of stuff in a clear top level doc, not in comment threads in a PR. I'll add it to the list.
Terminal pods may continue to report a ready condition of true because
there is a delay in reconciling the ready condition of the containers
from the runtime with the pod status. It should be invalid for kubelet
to report a terminal phase with a true ready condition. To fix the
issue, explicitly override the ready condition to false for terminal
pods during status updates.
Signed-off-by: David Porter <david@porter.me>
What type of PR is this?
/kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #108594
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: