🐛 Fix validation of worker topology names in Cluster resource #12069
base: main
…etes resource name The worker topology name is used to generate the name of a Kubernetes resource (MachineDeployment or MachinePool), and must therefore be a valid Kubernetes resource name.
When I authored the fix, I noticed a disagreement between the validation and the API types. I will keep this as a draft PR until we resolve the disagreement. The API types say that a worker topology Name may be up to 255 characters:
However, both MachineDeployment and MachinePool validation limit the name to 63 characters, because they check that the name is a valid label value:
My draft PR replaces this value check with a stricter one, but the length is not changed. As an aside, I noticed that our validation imposes this 63 character limit on the Cluster and MachineDeployment names:
When I introduced The MD webhook was not a factor there because we limit the name actually used for the MD here:
As we already had that validation in the webhook. Let's reduce the MaxLengths accordingly.
@@ -310,7 +310,8 @@ func MachineDeploymentTopologiesAreValidAndDefinedInClusterClass(desired *cluste
machineDeploymentClasses := mdClassNamesFromWorkerClass(clusterClass.Spec.Workers)
names := sets.Set[string]{}
for i, md := range desired.Spec.Topology.Workers.MachineDeployments {
if errs := validation.IsValidLabelValue(md.Name); len(errs) != 0 {
Note that a label value is a DNS1123Label prefixed optionally by a DNS1123Subdomain and a /.
We are asserting then that no one has used the prefix, since / is not a valid character in a CR name?
Any value in adding a ratcheting validation here to be safe?
Note that a label value is a DNS1123Label prefixed optionally by a DNS1123Subdomain and a /.
I think this is incorrect. I created https://go.dev/play/p/NJ9uywgss1N to demonstrate.
Are you thinking of a label key?
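For readers following along, here is a minimal sketch of the key/value distinction. It assumes the patterns below match the rules in the apimachinery validation package; they are restated with the standard library only, and `looksLikeLabelKey` and `looksLikeLabelValue` are hypothetical helpers, not functions from that package.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Assumption: these patterns restate the upstream validation rules so the
// sketch runs without any Kubernetes dependency.
var (
	qualifiedNameRe = regexp.MustCompile(`^([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]$`)
	labelValueRe    = regexp.MustCompile(`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`)
	subdomainRe     = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)
)

// looksLikeLabelKey (hypothetical) accepts "name" or "prefix/name", where the
// optional prefix is a DNS subdomain of at most 253 characters.
func looksLikeLabelKey(s string) bool {
	name := s
	if prefix, rest, found := strings.Cut(s, "/"); found {
		if len(prefix) == 0 || len(prefix) > 253 || !subdomainRe.MatchString(prefix) {
			return false
		}
		name = rest
	}
	return len(name) <= 63 && qualifiedNameRe.MatchString(name)
}

// looksLikeLabelValue (hypothetical) has no prefix form: "/" is never allowed.
func looksLikeLabelValue(s string) bool {
	return len(s) <= 63 && labelValueRe.MatchString(s)
}

func main() {
	for _, s := range []string{"md-0", "example.com/name", "my_value"} {
		fmt.Printf("%q key=%v value=%v\n", s, looksLikeLabelKey(s), looksLikeLabelValue(s))
	}
}
```

Under these assumptions, only the key form can carry a prefix, so a name that already passes the label value check cannot contain a /.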
If I see correctly this PR goes from
[a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')]
to
[a lowercase RFC 1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name', or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')]
Which seems okay given that the MD/MP names will have to pass through the latter validation anyway?
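To make the difference between the two messages concrete, here is a rough sketch that runs a few sample names through both patterns. The regular expressions are the ones Kubernetes prints in these messages, anchored for whole-string matching; the helper names and sample names are illustrative assumptions, not code from this PR.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumption: patterns restated from the validation messages, anchored here.
var (
	labelValueRe   = regexp.MustCompile(`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`)
	dns1123LabelRe = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)
)

// Both rules share the same 63-character limit.
func passesLabelValue(s string) bool   { return len(s) <= 63 && labelValueRe.MatchString(s) }
func passesDNS1123Label(s string) bool { return len(s) <= 63 && dns1123LabelRe.MatchString(s) }

func main() {
	// Uppercase, '_' and '.' are fine in a label value but not in an RFC 1123 label.
	for _, name := range []string{"md-0", "capiTest", "my_value", "md.0", ""} {
		fmt.Printf("%-12q labelValue=%-5v dns1123Label=%v\n",
			name, passesLabelValue(name), passesDNS1123Label(name))
	}
}
```

Note that the empty string is a valid label value but not a valid RFC 1123 label, so the stricter check also rejects an empty name.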
Hm. @dlipovetsky I think this is maybe the wrong one. I tested this
k apply -f ./test.yaml
The MachineDeployment "capiTest" is invalid: metadata.name: Invalid value: "capiTest": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
Should we use IsDNS1123Subdomain instead?
Are you thinking of a label key?
Yes I was, ignore me!
Should we use IsDNS1123Subdomain instead?
Subdomain doesn't allow underscores? Which I think the current validation does?
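A small sketch of the point about underscores, assuming the subdomain pattern from the API server message and a 253-character subdomain limit. The helper names are hypothetical; the real checks live in the apimachinery validation package.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumption: patterns restated from the Kubernetes validation messages.
var (
	labelValueRe = regexp.MustCompile(`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`)
	subdomainRe  = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)
)

func passesLabelValue(s string) bool { return len(s) <= 63 && labelValueRe.MatchString(s) }

// A subdomain may be much longer than a label value (253 vs 63 characters).
func passesSubdomain(s string) bool { return len(s) <= 253 && subdomainRe.MatchString(s) }

func main() {
	// "md_0" and "capiTest" pass today's label value check but cannot be
	// used as a resource name; "md.0" is acceptable to both.
	for _, name := range []string{"capi-test", "capiTest", "md_0", "md.0"} {
		fmt.Printf("%-12q labelValue=%-5v subdomain=%v\n",
			name, passesLabelValue(name), passesSubdomain(name))
	}
}
```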
I added more tests to reflect the above. Now we test that validation fails if
- name is longer than the longest allowed label value
- name is not a valid label value
- name is not a valid resource name
I also updated the implementation.
I think if you validated that the max length was 63 chars (could do this at the API schema level), then you don't need the IsValidLabelValue check at all. The regex for DNS subdomain is a subset (the same but don't allow _), isn't it?
I think if you validated that the max length was 63 chars (could do this at the API schema level)
That's exactly what I did, just in a separate PR: #12072
you don't need the IsValidLabelValue check at all ... The regex for DNS subdomain is a subset
That's true.
a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character
a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character
I think having both IsDNS1123Subdomain and IsValidLabelValue in code communicates the intent of the code clearly. But I agree that, in practice, IsValidLabelValue would always return true, because any invalid character would be caught by IsDNS1123Subdomain, and an invalid length would be caught by the CRD validation.
Also, if we check for max length in the CRD validation, then to test this function, we need the API server, and must use envtest.
If you feel strongly, I can remove the IsValidLabelValue call, but in that case, I would like to replace it with an ad-hoc length check in the code, so we can continue to unit test the function.
Not a strong opinion, but maybe worth a code comment to explain that we know it's redundant.
I do wonder about the performance impact 🤔
Over time, once the CEL format libraries are in our minimum supported version, I suspect we can rip all of this out and use the CEL format to validate that this is a DNS1123 subdomain and be done.
maybe worth a code comment to explain that we know it's redundant.
Comment added.
…bernetes resource names Use IsDNS1123Subdomain
…bernetes resource names Check that Name is both a valid Kubernetes resource name, and a valid label value
… Kubernetes resource name Add tests for maximum length and invalid characters in a label value
… Kubernetes resource name Test for max length should fail due to max length, not due to uppercase characters
…bernetes resource names Explain why we use IsValidLabelValue check
… Kubernetes resource name Add tests to check package
What this PR does / why we need it:
The worker topology name is used to generate the name of a Kubernetes resource (MachineDeployment or MachinePool), and must therefore be a valid Kubernetes resource name. The existing validation does not ensure this.
The first commit adds tests; without a fix, they fail, as expected.
The second commit fixes the validation.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #12068
/area clusterclass