This KEP seeks to provide a way to choose the correct behavior for how container runtimes (containerd and CRI-O) apply `SupplementalGroups` to the first container process. It describes the work needed in Kubernetes and connected projects to give customers a clear migration path, including detection and safe upgrade, if any of their workflows depend on this arguably erroneous behavior.
As described above, the way supplemental groups are attached to the first container process is complicated and not compliant with the OCI image spec.

Moreover, this behavior raises security concerns. When a cluster enforces a security policy for pods that protects the values of `RunAsGroup` and `SupplementalGroups`, the effect of the enforcement is limited: cluster users can bypass it simply by using a custom image. Such a bypass would surprise most cluster administrators, because it makes the enforcement almost useless. It also results in unexpected file access permissions, which is a security concern in some use cases. For example, it could be a severe problem with `hostPath` volumes, because UIDs/GIDs determine access to files and directories in those volumes.
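The bypass can be illustrated with a minimal Go sketch. It is not the code of any container runtime: the `imageGroups` type, the `supplementalGroups` helper, and the sample user and GIDs are hypothetical names and values chosen for illustration. It only models the idea that, under `Merge`, group memberships defined for the primary user inside the image are added to the groups specified through the API, while under `Strict` only the API-specified groups are kept.

```go
package main

import (
	"fmt"
	"sort"
)

// Policy mirrors the two policy names used in this KEP.
type Policy string

const (
	Merge  Policy = "Merge"
	Strict Policy = "Strict"
)

// imageGroups is a hypothetical stand-in for the image's /etc/group:
// it maps a user name to the GIDs that list that user as a member.
type imageGroups map[string][]int64

// supplementalGroups returns the sorted, de-duplicated supplemental GIDs
// that the first container process would get in this simplified model.
func supplementalGroups(p Policy, user string, fromAPI []int64, img imageGroups) []int64 {
	seen := map[int64]bool{}
	for _, g := range fromAPI {
		seen[g] = true
	}
	if p == Merge {
		// Groups baked into a custom image leak into the process here.
		for _, g := range img[user] {
			seen[g] = true
		}
	}
	out := make([]int64, 0, len(seen))
	for g := range seen {
		out = append(out, g)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	img := imageGroups{"alice": {50000}} // membership defined by the image author
	api := []int64{60000}                // value allowed by the security policy

	fmt.Println(supplementalGroups(Merge, "alice", api, img))  // [50000 60000]
	fmt.Println(supplementalGroups(Strict, "alice", api, img)) // [60000]
}
```

In this model, GID 50000 never appears in the Pod manifest, so a policy that only inspects `SupplementalGroups` cannot see it.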
@@ -254,7 +349,7 @@ message ContainerUser {
 }
 ```

-### User Stories
+### User Stories (Optional)

 <!--
 Detail the things that people will be able to do if this KEP is implemented.
@@ -263,6 +358,7 @@ the system. The goal here is to make this feel real for users without getting
 bogged down.
 -->

+
 #### Story 1: Deploy a Security Policy to enforce `SupplementalGroupsPolicy` field

 Assume a multi-tenant Kubernetes cluster with `hostPath` volumes in the following situation:
@@ -294,6 +390,8 @@ As described in [Summary](#summary) section, `alice` can bypass the restriction

 Please note that a security policy without `supplementalGroupsPolicy` would lead to unexpected groups for the first process in the containers.

+<!-- #### Story 2 -->
+
 ### Notes/Constraints/Caveats (Optional)

 <!--
@@ -325,6 +423,13 @@ Consider including folks who also work outside the SIG or subproject.

 ## Design Details

+<!--
+This section should contain enough information that the specifics of your
+change are understandable. This may include API specs (though not always
+required) or even code snippets. If there's any ambiguity about HOW your
+proposal will be implemented, this is the place to discuss them.
+-->
+
 ### Kubernetes API

 #### SupplementalGroupsPolicy in PodSecurityContext
- When `SupplementalGroupsPolicy=Strict`, groups of the container process must be ones specified by API: <link to test coverage(t.b.d.)>
+- When `SupplementalGroupsPolicy=Merge`, groups of the container process contain both the groups specified by API and the groups of the primary user from the image: <link to test coverage(t.b.d.)>
+- For running pods, `ContainerStatus.User` contains the correct identities of the containers: <link to test coverage(t.b.d.)>
+- CRI
+  - I will also add symmetrical integration tests to https://github.com/kubernetes-sigs/cri-tools
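One way a test could observe the groups of a container process is to read the `Groups:` line of `/proc/<pid>/status` inside the container. The sketch below is an assumption about how such a check might look, not the KEP's actual test code: `parseGroups` is a hypothetical helper and the sample status text is made up.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseGroups extracts the supplementary GIDs from the text of a
// /proc/<pid>/status file by reading its "Groups:" line.
func parseGroups(status string) ([]int64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "Groups:") {
			continue
		}
		var gids []int64
		for _, f := range strings.Fields(strings.TrimPrefix(line, "Groups:")) {
			g, err := strconv.ParseInt(f, 10, 64)
			if err != nil {
				return nil, fmt.Errorf("bad gid %q: %w", f, err)
			}
			gids = append(gids, g)
		}
		return gids, nil
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	return nil, fmt.Errorf("no Groups line found")
}

func main() {
	// Illustrative status text; a real test would read it from the container.
	sample := "Name:\tsleep\nUid:\t1000\t1000\t1000\t1000\nGroups:\t50000 60000\n"
	gids, err := parseGroups(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(gids) // [50000 60000]
}
```

A test could then compare the parsed list with the groups expected under `Strict` or `Merge`.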
We expect no non-infra related flakes in the last month as a GA graduation criteria.
 -->

-- <test>: <link to test coverage>
+- When creating a Pod with `SupplementalGroupsPolicy=Strict`, the pod runs with only the groups specified by API: <link to test coverage(t.b.d.)>
+- When creating a Pod with `SupplementalGroupsPolicy=Merge`, the pod runs with the groups specified by API and the groups from the image: <link to test coverage(t.b.d.)>
+- When a Pod is created and starts, each `ContainerStatus.User` contains the correct identities of the containers: <link to test coverage(t.b.d.)>
Below are some examples to consider, in addition to the aforementioned [maturity levels][maturity-levels].
+-->
+
+Because this KEP's core implementation (i.e. `SupplementalGroupsPolicy` handling) lies inside CRI implementations (e.g. containerd, CRI-O), the graduation criteria include the support status of the updated CRI in container runtimes.

 #### Alpha

-- Feature implemented behind a feature flag
-- Initial e2e tests completed and enabled
+- At least one of the most popular container runtimes (e.g. containerd) implements the updated CRI and has released it
+- Feature implemented behind a feature flag based on the Container Runtime
+- Unit tests and initial e2e tests completed and enabled

 #### Beta

-- Gather feedback from developers and surveys
-- Complete features A, B, C
-- Additional tests are in Testgrid and linked in KEP
+- Several popular container runtimes (e.g. containerd and CRI-O) support the updated CRI and have released it
+- Bugs reported by the community are fixed
+- Additional integration tests and e2e tests are in Testgrid and linked in KEP

 #### GA

-- N examples of real-world usage
-- N installs
-- More rigorous forms of testing, e.g., downgrade tests and scalability tests
-- Allowing time for feedback
-
-**Note:** Generally we also wait at least two releases between beta and
-GA/stable, because there's no opportunity for user feedback, or even bug reports,
-in back-to-back releases.
-
-**For non-optional features moving to GA, the graduation criteria must include
-[conformance tests].**
+- At least one container runtime that is not based on classic containers (for example, gVisor) supports the updated CRI and has released it
+- Assuming no negative user feedback based on production experience, promote after 2 releases in beta.
+- [conformance tests] are added for the `SupplementalGroupsPolicy` and `ContainerStatus.User` APIs
- Announce deprecation and support policy of the existing flag
-- Two versions passed since introducing the functionality that deprecates the flag (to address version skew)
-- Address feedback on usage/changed behavior, provided on GitHub issues
-- Deprecate the flag
--->
-
 ### Upgrade / Downgrade Strategy

 <!--
@@ -626,7 +727,7 @@ enhancement:
 CRI or CNI may require updating that component before the kubelet.
 -->

-- CRI must support this feature, especially when using `SupplementalGroupsPolicy=IgnoreGroupsInImage`.
+- CRI must support this feature, especially when using `SupplementalGroupsPolicy=Strict`.
 - kubelet must be at least the version of control-plane components.

 ## Production Readiness Review Questionnaire
@@ -687,6 +788,7 @@ well as the [existing list] of feature gates.
 Any change of default behavior may be surprising to users or break existing
 automations, so be extremely careful here.
 -->
+
 No. This only introduces new API fields in the Pod spec and CRI, which does NOT change the default behavior.

 ###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
@@ -702,11 +804,11 @@ feature.
 NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
 -->

-Yes. It can be disabled after enabled. However, users should pay attention that gids of container processes in pods with `IgnoreGroupsInImage` policy would change. It means the action might break the application in permission. We plan to provide a way for users to detect which pods are affected.
+Yes. It can be disabled after being enabled. However, users should be aware that the GIDs of container processes in pods with the `Strict` policy would change, which might break the application's permissions. We plan to provide a way for users to detect which pods are affected.

 ###### What happens if we reenable the feature if it was previously rolled back?

-Just the policy `IgnoreGroupsInImage` is reenabled. Users should pay attention that gids of containers in pods with `IgnoreGroupsInImage` policy would change. It means that the action might break the application in permission. We plan to provide a way for users to detect which pods are affected.
+Only the `Strict` policy is reenabled. Users should be aware that the GIDs of containers in pods with the `Strict` policy would change, which might break the application's permissions. We plan to provide a way for users to detect which pods are affected.

 ###### Are there any tests for feature enablement/disablement?

@@ -919,7 +1021,7 @@ Describe them, providing:
 - Estimated amount of new objects: (e.g., new Object X for every existing Pod)
 -->

-No.
+Technically, yes, because the KEP introduces new API fields in Pods. However, the size increase is negligible.

 ###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

@@ -948,6 +1050,18 @@ This through this both in small and large cases, again with respect to the

 No.

+###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+
+<!--
+Focus not just on happy cases, but primarily on more pathological cases
+(e.g. probes taking a minute instead of milliseconds, failed pods consuming resources, etc.).
+If any of the resources can be exhausted, how this is mitigated with the existing limits
+(e.g. pods per node) or new limits added by this KEP?
+
+Are there any tests that were run/should be run to understand performance characteristics better
+and validate the declared limits?
+-->
+
 ### Troubleshooting

 <!--
@@ -999,6 +1113,8 @@ Major milestones might include:
 Why should this KEP _not_ be implemented?
 -->

+N/A
+
 ## Alternatives

 <!--
@@ -1027,4 +1143,4 @@ new subproject, repos requested, or GitHub details. Listing these here allows a
 SIG to get the process for these resources started right away.