New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AzureFile: Volume without secretNamespace fails to mount after translating to CSI #108000
Conversation
Welcome @RomanBednar! |
Hi @RomanBednar. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
for PV, if kubernetes/pkg/volume/azure_file/azure_file.go Lines 373 to 394 in 5340ae0
What's your k8s version? seems I was wrong, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/ok-to-test
/priority important-soon |
No, it does not use namespace named "default". It uses
|
exactly, and when the translation happens at a point where pod does not exist yet - it makes sense to try get the namespace from ClaimRef I believe. Maybe I should have also mentioned that the issue occurs only when using a PV without namespace while scheduling a pod in non-default namespace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/retest |
/retest |
1 similar comment
/retest |
retest |
/retest |
1 similar comment
/retest |
@jsturtevant @CecileRobertMichon azure tests are still failing now. |
cf4a37f
to
eebf761
Compare
/retest |
@@ -129,9 +129,17 @@ func (t *azureFileCSITranslator) TranslateInTreePVToCSI(pv *v1.PersistentVolume) | |||
resourceGroup = v | |||
} | |||
} | |||
|
|||
// Secret is required when mounting a volume but pod presence cannot be assumed - we should not try to read pod now. | |||
namespace := defaultSecretNamespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove this default line? It doesn't look like there is a path for intree PV to use the "default" namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, intree does not use "default". So to keep it consistent we should return an error here as well if the namespace can not be found. See if the latest version of the patch would be ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if pv.Spec.ClaimRef.Namespace
is empty? I think it's better not return error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In CSI driver emtpy equals default
as you mentioned previously, and I think Michelle wants to prevent using default
as this might point to a secret in different namespace. So I think the only option here is to return error because we can't use any reasonable default.
Also, how possible is it to have pv.Spec.ClaimRef.Namespace
empty? It should not be possible to use/mount a PV without PVC correct? So ideally the error should never be returned, but if it ever happens at least we know it's not possible to continue.
And if we don't want to return an error, what should the namespace be set to? We can't leave it empty due to CSI driver interpreting it as default
, we can't set it to nil because that might break volumeID now (namespace was added recently #107575)), and we can't just leave it out because pv would never mount (without a secret).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, for all PV operations PV.claimRef
must be set (i.e. a PV must be bound).
Attach/detach/mount/unmount work with PVCs, i.e. the PV they get must be bound.
Provisioner creates PVs with claimRef
populated.
Deletetion works when a PV has a claimRef
set and the referenced PVC does not exist.
PVs without a claimRef
must have been created manually and are not bound. Does Kubernetes do anything witch such volumes that needs translation? Did I miss any exotic error case?
/retest |
When translating InTree pv to CSI pv we use default secret namespace when it's not found in the InTree pv. Using the default is not ideal for several reasons: 1) it can result in failed pod creation after users migrate to cluster with CSI enabled because the existing intree pvs might not have the namespace defined. In that case the "default" is used and mount fails because secret could not be found. 2) falling back to "default" namespace can result in referencing a secret from different namespace which is a security risk However, there is another object we can use to determine correct namespace which presence can be safely assumed - ClaimRef. Mounting a volume is done only through a PVC which is bound. Binding adds ClaimRef to PV and finally the volume gets mounted which is where the translation code is used.
This is the actual fix - attempt to obtain a namespace from ClaimRef. Or fail if namespace could not be found instead of using "default".
eebf761
to
507c63c
Compare
/retest |
@msau42 could you approve this PR? Thanks. |
/lgtm Thanks! |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andyzhangx, msau42, RomanBednar The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…108000-upstream-release-1.22 Automated cherry pick of #108000: azure_file: add namespace tests for InTree to CSI
…108000-upstream-release-1.23 Automated cherry pick of #108000: azure_file: add namespace tests for InTree to CSI
…108000-upstream-release-1.21 Automated cherry pick of #108000: azure_file: add namespace tests for InTree to CSI
What type of PR is this?
/kind bug
What this PR does / why we need it:
This PR fixes a bug which can impact users upon enabling CSI migration. Without the bug fix all pods that use claims which reference an existing InTree PV without namespace defined will cause existing pods to get stuck in ContainerCreating status right after enabling CSI migration.
Which issue(s) this PR fixes:
Volume without pv.spec.azureFile.secretNamespace could not be mounted due to the secret not found causing pod to fail after CSI migration is enabled.
Special notes for your reviewer:
See verification steps below.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: