Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRI changes to support in-place pod resize #111645

Merged

Conversation

vinaykul
Copy link
Contributor

@vinaykul vinaykul commented Aug 2, 2022

What type of PR is this?

/kind feature
/kind api-change

What this PR does / why we need it:

It implements CRI API changes to support In-Place Pod Vertical Scaling. It is needed to unblock the implementation of CRI support needed for the main feature to work end-to-end.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

The changes in here are extraction of CRI changes from the main PR #102884

Does this PR introduce a user-facing change? Yes

- Extends ContainerStatus CRI API to allow runtime response with container resource requests and limits that are in effect.
- UpdateContainerResources CRI API now supports both Linux and Windows.

For details, see KEPs below.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources
- [Usage]: CRI

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 2, 2022
@k8s-ci-robot
Copy link
Contributor

@vinaykul: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Aug 2, 2022
@k8s-ci-robot
Copy link
Contributor

Hi @vinaykul. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/kubelet labels Aug 2, 2022
@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 2, 2022
@vinaykul
Copy link
Contributor Author

vinaykul commented Aug 2, 2022

/sig-node
/assign @derekwaynecarr @mrunalp @mikebrow @haircommander
/ok-to-test

@k8s-ci-robot
Copy link
Contributor

@vinaykul: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/sig-node
/assign @derekwaynecarr @mrunalp @mikebrow @haircommander
/ok-to-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@haircommander
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 2, 2022
klog.V(10).InfoS("[RemoteRuntimeService] UpdateContainerResources", "containerID", containerID, "timeout", r.timeout)
ctx, cancel := getContextWithTimeout(r.timeout)
defer cancel()

if r.useV1API() {
_, err = r.runtimeClient.UpdateContainerResources(ctx, &runtimeapi.UpdateContainerResourcesRequest{
ContainerId: containerID,
Linux: resources,
Linux: resources.GetLinux(),
Windows: resources.GetWindows(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me the ContainerResources object declared below is unused. I think we should specify Linux, Windows and ContainerResources{Linux, Windows} to allow for some backwards compatibliity for containerd (whose old version expect just the Linux and Windows fields, and not yet the ContainerResources field). Am I interpreting this right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, originally I had a separate ContainerResources with Linux and windows - it is still in the KEP , but WindowsContainerResources was added to UCRR and made that unnecessary, and I removed it recently after @mikebrow spotted it Please see comment

Did I misunderstand you?

})
} else {
_, err = r.runtimeClientV1alpha2.UpdateContainerResources(ctx, &runtimeapiV1alpha2.UpdateContainerResourcesRequest{
ContainerId: containerID,
Linux: v1alpha2LinuxContainerResources(resources),
Linux: v1alpha2LinuxContainerResources(resources.GetLinux()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We updated v1alpha2 with the new ContainerResources structure, so I think we should do the same we do above. If we don't update v1alpha2, then we can leave this as it is now IMO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, adding..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no conversion function for v1alpha2LinuxContainerResources.
Is it wise to just add an unsafe pointer conversion func? I don't have a way to test/verify.
I'm not very sure about this below. Does it look right to you?

diff --git a/pkg/kubelet/cri/remote/conversion.go b/pkg/kubelet/cri/remote/conversion.go
index 35539397755..120b718cbf5 100644
--- a/pkg/kubelet/cri/remote/conversion.go
+++ b/pkg/kubelet/cri/remote/conversion.go
@@ -113,6 +113,10 @@ func v1alpha2LinuxContainerResources(from *runtimeapi.LinuxContainerResources) *
        return (*v1alpha2.LinuxContainerResources)(unsafe.Pointer(from))
 }
 
+func v1alpha2WindowsContainerResources(from *runtimeapi.WindowsContainerResources) *v1alpha2.WindowsContainerResources {
+       return (*v1alpha2.WindowsContainerResources)(unsafe.Pointer(from))
+}
+
 func v1alpha2ExecRequest(from *runtimeapi.ExecRequest) *v1alpha2.ExecRequest {
        // If this function changes, also adapt the corresponding Exec code in
        // pkg/kubelet/cri/remote/remote_runtime.go
diff --git a/pkg/kubelet/cri/remote/remote_runtime.go b/pkg/kubelet/cri/remote/remote_runtime.go
index a09879b346a..976f845e011 100644
--- a/pkg/kubelet/cri/remote/remote_runtime.go
+++ b/pkg/kubelet/cri/remote/remote_runtime.go
@@ -639,6 +639,7 @@ func (r *remoteRuntimeService) UpdateContainerResources(containerID string, reso
                _, err = r.runtimeClientV1alpha2.UpdateContainerResources(ctx, &runtimeapiV1alpha2.UpdateContainerResourcesRequest{
                        ContainerId: containerID,
                        Linux:       v1alpha2LinuxContainerResources(resources.GetLinux()),
+                       Windows:     v1alpha2WindowsContainerResources(resources.GetWindows()),
                })
        }
        if err != nil {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps @marosset can comment if this is needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that looks correct. this could also take the approach of other recent CRI changes and just drop v1alpha2 support

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG. Adding with rebase fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like containerd has had support updating Windows resources in UpdateContainerResources since the 1.6 release (which I'm pretty sure still used v1alpha2).
Because of this I think we should include the Windows changes in v1alpha2 and v1.
x-ref containerd/containerd#5778

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the changes I just linked use the v1 API.
I don't think we need to support v1alpha2 but since the changes are already in might as well leave them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove it with the main feature changes in 1.26. I've added a TODO tracking item for this.

@vinaykul
Copy link
Contributor Author

vinaykul commented Aug 2, 2022

/assign @ruiwen-zhao @Random-Liu @thockin @liggitt - please take a quick look, your sharp eyes needed if you can spare them, mine are blurry right now.

This is a drag-and-drop of the CRI changes from #102884

@@ -42,7 +42,7 @@ import (
type ActivePodsFunc func() []*v1.Pod

type runtimeService interface {
UpdateContainerResources(id string, resources *runtimeapi.LinuxContainerResources) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a blocker but rather just to confirm, changing this parameter from LinuxContainerResources to ContainerResources could be decoupled from the CRI changes right? remote_runtime would just need to call UpdateContainerResources with LinuxContainerResources directly (windows would not be supported tho).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does runtime call this? IIUC, runtime responds to it, and yes it does not have to change. The UpdateContainerResourcesRequest.Linux field will contain what it needs same as before. Windows runtime looks at UpdateContainerResourcesRequest.Windows.

Copy link
Member

@mikebrow mikebrow Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because of the way the changes went in (not changing the CRI api except for the additional status field and some additional context) the client side manager proto should only impact cri-tools implementing the manager and container runtimes doing integration by implementing a manager for test

Copy link
Member

@mikebrow mikebrow Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be good with decoupling the client side manager changes as well.. if desired..

Copy link
Member

@mikebrow mikebrow Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noting windows was there in the request before, but was not being used:

message UpdateContainerResourcesRequest {
     // ID of the container to update.
     string container_id = 1;
    // Resource configuration specific to Linux containers.
    LinuxContainerResources linux = 2;
    // Resource configuration specific to Windows containers.
    WindowsContainerResources windows = 3;
    // Unstructured key-value map holding arbitrary additional information for
    // container resources updating. This can be used for specifying experimental
    // resources to update or other options to use when updating the container.
    map<string, string> annotations = 4;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, runtime responds to it.

That I understand. I was just trying to confirm the parameters here did not need to change.

the client side manager proto should only impact cri-tools implementing the manager and container runtimes doing integration by implementing a manager for test

Gotcha. Yeah in that case this should have very little risk. I am okay with leaving this as the current PR is.

@@ -515,8 +515,10 @@ func (m *manager) updateContainerCPUSet(containerID string, cpus cpuset.CPUSet)
// this patch-like partial resources.
return m.containerRuntime.UpdateContainerResources(
containerID,
&runtimeapi.LinuxContainerResources{
CpusetCpus: cpus.String(),
&runtimeapi.ContainerResources{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to self: need to carry these param changes forward to the main/containerd/containerd/integration bucket for local testing

@vinaykul
Copy link
Contributor Author

vinaykul commented Aug 2, 2022

/assign @marosset

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 2, 2022
Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vinaykul vinaykul force-pushed the restart-free-pod-vertical-scaling-cri branch from c6d97b8 to 09fb5da Compare August 2, 2022 22:45
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 2, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mikebrow, mrunalp, vinaykul

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 2, 2022
&runtimeapi.LinuxContainerResources{
CpusetCpus: cpus.String(),
&runtimeapi.ContainerResources{
Linux: &runtimeapi.LinuxContainerResources{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a Windows field in here too for completeness?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mark, thanks for the quick response. Does Windows have an equivalent for Cpuset? Unless I missed something, I don't see an equivalent in WindowsContainerResources.
In any case, we are less than 2 hours away from code freeze and I don't want to reset CI unless I have to. If this can come in with changes for adding windows support in or before beta, lets do it at that point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't an equivalent for cpuset on Windows (today)

@marosset
Copy link
Contributor

marosset commented Aug 2, 2022

CRI changes LGTM.

Had one open question about adding in WindowsContainerResousces tot he fake runtime objects for completely but this is non-blocking.

@haircommander
Copy link
Contributor

LGTM

thanks!

@mrunalp
Copy link
Contributor

mrunalp commented Aug 2, 2022

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 2, 2022
@ruiwen-zhao
Copy link
Contributor

/lgtm

@pacoxu
Copy link
Member

pacoxu commented Aug 3, 2022

Since code freeze is in effect, we need to add the milestone if this should be merged.

@mrunalp
Copy link
Contributor

mrunalp commented Aug 3, 2022

/milestone 1.25

@k8s-ci-robot
Copy link
Contributor

@mrunalp: You must be a member of the kubernetes/milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your Milestone Maintainers Team and have them propose you as an additional delegate for this responsibility.

In response to this:

/milestone 1.25

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mrunalp
Copy link
Contributor

mrunalp commented Aug 3, 2022

@dims could you help add milestone?

@dims
Copy link
Member

dims commented Aug 3, 2022

looks like this had the /approve and /lgtm before the code freeze deadline. so adding milestone to get this in.

/milestone v1.25

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet