
Added changes to support alternative recommender #4131

Merged (1 commit merged on Nov 16, 2021)

Conversation

wangchen615 (Contributor)

This update adds support for a customized recommender in VPA.
The KEP is here: https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/3919-customized-recommender-vpa

resolve #3913
close #3913
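For orientation, a minimal sketch (illustrative Go, not necessarily the exact merged API) of the kind of per-VPA recommender selection the KEP describes: a VPA object names the recommender that should handle it, and the default recommender is used when nothing is specified.

    // Sketch only: illustrative types for per-VPA recommender selection as
    // described in the KEP; names are assumptions, not the merged API.
    package v1

    // VerticalPodAutoscalerRecommenderSelector points to a recommender by name.
    type VerticalPodAutoscalerRecommenderSelector struct {
        // Name of the recommender responsible for generating recommendations
        // for this VPA object.
        Name string `json:"name"`
    }

    // Excerpt of the VPA spec: when Recommenders is empty, the default
    // recommender handles the object; otherwise the named recommender does.
    type VerticalPodAutoscalerSpec struct {
        // ... existing fields (TargetRef, UpdatePolicy, ResourcePolicy) ...

        Recommenders []*VerticalPodAutoscalerRecommenderSelector `json:"recommenders,omitempty"`
    }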

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 10, 2021
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 10, 2021
@bskiba (Member)

bskiba commented Jun 10, 2021

@kgolab for visibility

@@ -38,6 +38,8 @@ const (
// events (container OOMs).
// All input to the VPA Recommender algorithm lives in this structure.
type ClusterState struct {
// Denote the recommender name
RecommenderName string
Collaborator:

As commented above I don't yet understand why we need to store this dynamically in the default recommender.

Contributor Author:

@kgolab, as I mentioned, I hope it makes the code easier to reuse in other alternative recommenders.

Collaborator:

Please split this change (adding the RecommenderName field) into a separate PR.

It will make this PR easier to review. It will also make it easier to see how we'd use the field in alternative recommenders.

Collaborator:

Please move this change (adding the RecommenderName field) to a separate PR.

It will make this PR easier to review. It will also make it easier to see how we'd use the RecommenderName field.

@kgolab (Collaborator)

kgolab commented Jun 18, 2021

Thank you for the PR.

I'd like to understand if we really need to propagate recommenderName throughout the code of the default VPA:

  • if we don't, the change could be simplified to the YAML files, types.go and (parts of) cluster_feeder.go,
  • if we do, could you please explain why (and, likely, how you plan to use the code)?

Looking at the KEP I think we're missing an update in validateVPA. Or have you already decided to support multiple recommenders (changes in cluster_feeder.go might suggest so)?

@wangchen615 (Contributor Author)

@kgolab Thanks so much for your review.

  • As I mentioned above, keeping recommenderName will make the default recommender code easier to reuse when others develop their own alternative recommenders.
    • Some functions, like loadVPAs in the reconcilers, are definitely reusable.
    • However, other recommenders may rely on completely different metric collectors, such as Prometheus, and run different logic to come up with the target, lower bound and upper bound for requests.
    • For example, I am writing a customized recommender that will reuse the ClusterStateFeederFactory but rewrite the metric client (see the sketch below this comment).
    • At the same time, we provide support for the controller-level alternative recommender, so developers also have the freedom to develop their own recommender from scratch.

Thank you for the PR.

I'd like to understand if we really need to propagate recommenderName throughout the code of the default VPA:

  • if we don't, the change could be simplified to the YAML files, types.go and (parts of) cluster_feeder.go,
  • if we do, could you please explain why (and, likely, how you plan to use the code)?

Looking at the KEP I think we're missing an update in validateVPA. Or have you already decided to support multiple recommenders (changes in cluster_feeder.go might suggest so)?

Yes, sorry for missing the validateVPA. I added it back.

And yes, we want to support multiple VPA recommenders in the future. However, we may need a new KEP on how to cache the RecommendedPodResources for different recommenders in VPAStatus and how to select one recommendation among multiple recommenders when evicting and admitting the pod. We could initiate discussions once this first step PR is merged.
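To make the reuse argument above concrete, here is a self-contained sketch of the pattern: a feeder factory shared between recommenders, with an alternative recommender plugging in its own (for example Prometheus-backed) metrics client. All names here are hypothetical stand-ins, not the actual vertical-pod-autoscaler packages.

    // Sketch with hypothetical names; the real VPA code is structured
    // differently in detail.
    package main

    import "fmt"

    // MetricsClient abstracts where container usage samples come from.
    type MetricsClient interface {
        FetchUsage(namespace, pod string) (cpuMilli, memBytes int64, err error)
    }

    // ClusterState is the shared in-memory model a recommender works on.
    type ClusterState struct {
        RecommenderName string // which recommender this state belongs to
    }

    // FeederFactory wires a metrics client to the cluster state; the default
    // recommender and alternative recommenders could both build feeders this way.
    type FeederFactory struct {
        Metrics      MetricsClient
        ClusterState *ClusterState
    }

    // Make builds a feeder from the factory's parts.
    func (f FeederFactory) Make() *Feeder {
        return &Feeder{metrics: f.Metrics, state: f.ClusterState}
    }

    // Feeder pulls metrics and folds them into the cluster state.
    type Feeder struct {
        metrics MetricsClient
        state   *ClusterState
    }

    // prometheusClient is an example alternative metrics source.
    type prometheusClient struct{ endpoint string }

    func (p prometheusClient) FetchUsage(namespace, pod string) (int64, int64, error) {
        // A real implementation would query Prometheus here.
        return 250, 128 << 20, nil
    }

    func main() {
        state := &ClusterState{RecommenderName: "my-prometheus-recommender"}
        feeder := FeederFactory{
            Metrics:      prometheusClient{endpoint: "http://prometheus:9090"},
            ClusterState: state,
        }.Make()
        cpu, mem, _ := feeder.metrics.FetchUsage("default", "hamster")
        fmt.Printf("recommender=%s cpu=%dm mem=%dB\n", feeder.state.RecommenderName, cpu, mem)
    }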

@bskiba (Member) left a comment

Some minor comments on my side. @kgolab waiting for your input on the current approach as explained by Chen

@@ -32,6 +32,9 @@ import (
"k8s.io/klog"
)

// RecommenderName denotes the current recommender name as the "default" one.
const RecommenderName = "default"
Member:

How about DefaultRecommenderName

Collaborator:

I think this went back to "RecommenderName" in the meantime, please change it back to DefaultRecommenderName
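For reference, the rename being asked for would look roughly like this (the "default" value is taken from the diff above; the comment wording is illustrative):

    // DefaultRecommenderName is the name of the recommender shipped with VPA.
    // VPA objects that do not explicitly select a recommender are handled by it.
    const DefaultRecommenderName = "default"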

@Jeffwan (Member)

Jeffwan commented Jul 1, 2021

@wangchen615 It would be great to squash your commits. This repo doesn't squash them automatically.

@wangchen615 (Contributor Author)

wangchen615 commented Aug 30, 2021

@mwielgus, @schylek, @kgolab, @jbartosik, @krzysied, this PR had no new comments for almost 2 months and all comments have been resolved. Could you help approve this?

@jbartosik (Collaborator)

@mwielgus, @schylek, @kgolab, @jbartosik, @krzysied, this PR had no new comments for almost 2 months and all comments have been resolved. Could you help approve this?

Sorry for the long delay. I'm looking at this right now

@jbartosik (Collaborator) left a comment

I'm trying to understand this and there is one thing I'm not sure about.

The idea I got during SIG was that the intent of this change is to allow A/B testing of recommenders. So I expected to be able to:

  • Set two (or more) recommenders that will expose their recommendations (so I can compare those recommendations)
  • Choose which recommendation to apply (when I see which one looks better)

Instead this looks to me like I can choose one recommender to provide a recommendation (that I will see and which will be applied).

I think I'm confused about something here.

I'll try to figure this out tomorrow. But if you can see what I'm getting wrong, I'd appreciate help.

@wangchen615 (Contributor Author)

I'm trying to understand this and there is one thing I'm not sure about.

The idea I got during SIG was that the intent of this change is to allow A/B testing of recommenders. So I expected to be able to:

  • Set two (or more) recommenders that will expose their recommendations (so I can compare those recommendations)
  • Choose which recommendation to apply (when I see which one looks better)

Instead this looks to me like I can choose one recommender to provide a recommendation (that I will see and which will be applied).

I think I'm confused about something here.

I'll try to figure this out tomorrow. But if you can see what I'm getting wrong, I'd appreciate help.

Hi @jbartosik, there are two proposals that will eventually allow A/B testing, in two steps:

  1. Support using one recommender for one VPA object, but that recommender can be a customized one. This is the initial proposal, Support Customized Recommenders for Vertical Pod Autoscalers.
  2. Support multiple recommenders for a single VPA object. The proposal slides are here.

Once this PR is merged, we will move forward to add the second KEP and create a new PR accordingly.

@jbartosik (Collaborator)

@wangchen615 Thank you for the explanation.

I'm not sure if this is the best way to get an API that allows A/B testing of recommenders.

If I understand the proposal in the presentation correctly we'd make a breaking API change (remove VerticalPodAutoscalerStatus.Recommendation field and replace it with VerticalPodAutoscalerStatus.Recommendations). This means we'll need to introduce a new API version. If we do that then v1 API will be a bit weird. It will have a repeated VerticalPodAutoscalerSpec.RecommenderNames field but only one value will matter.

I think it would be better to:

  1. Allow A/B testing without breaking API changes, or
  2. If that's not feasible then in the v1 API support only one recommender in a clearer way. Add a string field VerticalPodAutoscalerSpec.RecommenderName (instead of RecommenderNames []string)

I'll think if 1. is feasible and get back to you later this week.

Also, I want to make sure we don't make extending the API too hard for future contributors. For example, in the future we might want to allow users to take the CPU recommendation from one recommender but the memory recommendation from another. I'd like to make that possible without a breaking API change.

@wangchen615 (Contributor Author)

wangchen615 commented Sep 1, 2021

@jbartosik Changing the API from the string of VerticalPodAutoscalerStatus.Recommendation to the list of VerticalPodAutoscalerStatus.Recommendations was exactly what was requested by @bskiba and @kgolab. I believe the code now is supporting a list of VerticalPodAutoscalerStatus.Recommendations as the API but only one recommender for this PR.

In the next proposal, where we support multiple recommenders, we will not need to change the API further, only to support more than one recommender per VPA object.

Supporting CPU and memory recommendations from different recommenders is a very good new use case to add to the multiple-recommender support proposal. I can try to add some slides for that as well. However, I believe that if a recommender only updates CPU or memory, the list of recommendations (VerticalPodAutoscalerStatus.Recommendations) in the API will still work, but we would need to extend the following field to a map for the multiple-recommender case.

  // Denote the name of recommender to be in use.
  SelectedRecommender string `json:"selectedRecommender,omitempty" protobuf:"bytes,5,opt,name=selectedRecommender"`

This field will not be introduced until we merge the next KEP, so it is really the next stage of work.

@wangchen615 (Contributor Author)

wangchen615 commented Sep 6, 2021

@jbartosik , I updated the multiple recommender support slides here: https://docs.google.com/presentation/d/1TtFNsCcEnbKki9xe31E8beNMq0fjm2Ek_-jXiGWst-8/edit?usp=sharing

The AllRecommendations field is introduced in VerticalPodAutoscalerStatus so we do not need to change the existing Recommendation field but can track multiple recommenders' recommendations.

The SelectedRecommender field is introduced in VerticalPodAutoscalerSpec so users can denote which recommender's recommendation is selected. If it is not specified, the default recommender's recommendation is used or, if no default exists, the first recommender specified in the list of RecommenderNames.
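A rough sketch, in Go, of the fields described above. The names AllRecommendations and SelectedRecommender come from the proposal slides; the types, the map representation and everything else are assumptions for illustration, not merged API.

    // Sketch of the proposed follow-up API; only the field names come from the
    // proposal slides, the rest is illustrative.
    package v1

    // RecommendedPodResources stands in for the existing recommendation type.
    type RecommendedPodResources struct{}

    // Proposed spec addition: the user picks which recommendation gets applied.
    type VerticalPodAutoscalerSpecProposal struct {
        // SelectedRecommender denotes which recommender's recommendation is applied.
        // If empty, the default recommender is used or, if no default exists,
        // the first recommender listed for this VPA object.
        SelectedRecommender string `json:"selectedRecommender,omitempty"`
    }

    // Proposed status addition: keep Recommendation as-is, add per-recommender output.
    type VerticalPodAutoscalerStatusProposal struct {
        // Recommendation keeps holding the selected recommender's output (unchanged).
        Recommendation *RecommendedPodResources `json:"recommendation,omitempty"`
        // AllRecommendations tracks the latest output of every configured
        // recommender (shown as a map keyed by recommender name; an assumption).
        AllRecommendations map[string]*RecommendedPodResources `json:"allRecommendations,omitempty"`
    }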

@joelsmith , could you also help review?

@kgolab (Collaborator)

kgolab commented Sep 7, 2021

@jbartosik Changing the API from the string of VerticalPodAutoscalerStatus.Recommendation to the list of VerticalPodAutoscalerStatus.Recommendations was exactly what was requested by @bskiba and @kgolab. I believe the code now is supporting a list of VerticalPodAutoscalerStatus.Recommendations as the API but only one recommender for this PR.

Just double-checking if we're not mixing up two things - IIRC we asked for introducing RecommenderNames (string[]) instead of RecommenderName (string) to support multiple recommenders in the future.

What @jbartosik asked is that we should align this change with what happens to the Recommendations field.

When discussing the KEP (#3914) we excluded the discussion about multiple recommenders / statuses for the time being, but with the new proposal you showed (https://docs.google.com/presentation/d/1TtFNsCcEnbKki9xe31E8beNMq0fjm2Ek_-jXiGWst-8/edit?usp=sharing), it's becoming clearer which directions are possible.

It would be great if this PR was aligned with the direction chosen for multiple recommenders; that's why we're trying to address the future proposal (slides) at the same time as the PR, to reduce uncertainty and make sure we don't end up with a half-way API (so either move all the changes to v2 or make them all in v1, compatible with the current API).

Finally: I'm sorry for the too-long delay before my comments - it's been a mix of very busy times for me, but this should be over now. Thank you for your patience.

@wangchen615 (Contributor Author)

@kgolab , yes, that's correct.

Just double-checking if we're not mixing up two things - IIRC we asked for introducing RecommenderNames (string[]) instead of RecommenderName (string) to support multiple recommenders in the future.

Yes, so in the new proposal we introduce the VerticalPodAutoscalerStatus.AllRecommendations field and the VerticalPodAutoscalerSpec.SelectedRecommender field. VerticalPodAutoscalerStatus.Recommendation will stay the same and will only store the recommendation from the VerticalPodAutoscalerSpec.SelectedRecommender. Please refer to the slides. In this way, we can keep this PR aligned with the future changes.

It would be great if this PR was aligned with the direction chosen for multiple recommenders; that's why we're trying to address the future proposal (slides) at the same time as the PR, to reduce uncertainty and make sure we don't end up with a half-way API (so either move all the changes to v2 or make them all in v1, compatible with the current API).

@wangchen615 (Contributor Author)

@jbartosik, I made the changes according to the suggestions in PR #4329.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 20, 2021
@jbartosik (Collaborator) left a comment

Please add a test that checks that:

  1. default recommender acts when no recommender is specified
  2. default recommender acts when it's explicitly configured
  3. default recommender doesn't act when a different recommender is configured

Also, our test dashboards are currently broken, so please run the e2e tests (the recommender and full suites should be enough). Please use a K8s version older than 1.23 (the dashboards are red because of problems in 1.23; only the full and recommender tests are broken, so there is no need to run other test suites).

EDIT: link to the dashboard
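Purely as a sketch of the three cases listed above: a table-driven unit test against a hypothetical vpaHandledBy helper (not the actual recommender or e2e test code).

    // Sketch only: hypothetical helper plus a table test for the three cases.
    package recommender

    import "testing"

    const defaultRecommenderName = "default"

    // vpaHandledBy reports whether the recommender named self should act on a
    // VPA object that lists the given recommender names (empty means "default").
    func vpaHandledBy(vpaRecommenders []string, self string) bool {
        if len(vpaRecommenders) == 0 {
            return self == defaultRecommenderName
        }
        for _, name := range vpaRecommenders {
            if name == self {
                return true
            }
        }
        return false
    }

    func TestDefaultRecommenderSelection(t *testing.T) {
        cases := []struct {
            name         string
            recommenders []string
            want         bool
        }{
            {"acts when no recommender is specified", nil, true},
            {"acts when explicitly configured", []string{"default"}, true},
            {"does not act when a different recommender is configured", []string{"custom"}, false},
        }
        for _, tc := range cases {
            if got := vpaHandledBy(tc.recommenders, defaultRecommenderName); got != tc.want {
                t.Errorf("%s: got %v, want %v", tc.name, got, tc.want)
            }
        }
    }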

@jbartosik (Collaborator) left a comment

Some more comments


@wangchen615 (Contributor Author)

@jbartosik Just a clarification question about your comment "Please move this change (adding RecommenderName field) to a separate PR. It will make this PR easier to review. It will also make it easier to see how we'd use the RecommenderName field."

Do you mean that for this PR we do not pass the RecommenderName through the ClusterState struct, and only check whether DefaultRecommenderName exists or whether VPA objects choose other recommender names?

And in a new PR, we add RecommenderName to the ClusterState struct so the cluster.go code in the default recommender can be reused by others who develop their own recommenders?

@jbartosik (Collaborator)

@jbartosik Just a clarification question about your comment "Please move this change (adding RecommenderName field) to a separate PR. It will make this PR easier to review. It will also make it easier to see how we'd use the RecommenderName field."

Do you mean that for this PR we do not pass the RecommenderName through the ClusterState struct, and only check whether DefaultRecommenderName exists or whether VPA objects choose other recommender names?

And in a new PR, we add RecommenderName to the ClusterState struct so the cluster.go code in the default recommender can be reused by others who develop their own recommenders?

I think in this PR we should only have the changes that add support for alternative recommenders to the default recommender, and in the follow-up PR the changes that allow reusing the code in alternative recommenders.

So yes to both questions :)

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 24, 2021
@wangchen615 (Contributor Author)

Please add a test that checks that:

  1. default recommender acts when no recommender is specified
  2. default recommender acts when it's explicitly configured
  3. default recommender doesn't act when a different recommender is configured

Also, our test dashboards are currently broken, so please run the e2e tests (the recommender and full suites should be enough). Please use a K8s version older than 1.23 (the dashboards are red because of problems in 1.23; only the full and recommender tests are broken, so there is no need to run other test suites).

EDIT: link to the dashboard

Please see ff85923 for all e2e tests and unit tests added.

@jbartosik (Collaborator)

@wangchen615

Thank you.

I think I'm missing something important about the problem with e2e tests.

I tried running e2e tests locally:

  • I checked out 1c7fa88 locally,
  • I created a cluster (on GKE, K8s 1.20.10-gke.301, no VPA from GKE so I could set up mine)
  • I deployed VPA in the cluster ./vertical-pod-autoscaler/hack/deploy-for-e2e.sh full-vpa
  • I ran e2e tests (in k8s.io/autoscaler) ./vertical-pod-autoscaler/hack/run-e2e.sh full-vpa

And this worked for me. All tests ran as expected. Some failed [1], but the failures are things like pods not updating fast enough [2]. This doesn't sound like the problem you described (I think this is me using too small a cluster).

I'd like to help. Please share:

  • Errors you see
  • How you run the test.

[1]

$ grep \"msg\" 2021-10-25.alt-recommender-e2e.out 
{"msg":"Test Suite starting","total":3,"completed":0,"skipped":0,"failed":0}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1beta2] OOMing pods under VPA have memory requests growing with OOMs","total":3,"completed":1,"skipped":5,"failed":0}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1beta2] Pods under VPA have memory requests growing with usage","total":3,"completed":2,"skipped":8,"failed":0}
{"msg":"FAILED [sig-autoscaling] [VPA] [full-vpa] [v1beta2] Pods under VPA have cpu requests growing with usage","total":3,"completed":2,"skipped":22,"failed":1,"failures":["[sig-autoscaling] [VPA] [full-vpa] [v1beta2] Pods under VPA have cpu requests growing with usage"]}
{"msg":"Test Suite completed","total":3,"completed":2,"skipped":38,"failed":1,"failures":["[sig-autoscaling] [VPA] [full-vpa] [v1beta2] Pods under VPA have cpu requests growing with usage"]}
{"msg":"Test Suite starting","total":5,"completed":0,"skipped":0,"failed":0}
{"msg":"FAILED [sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA with default recommender explicitly configured have cpu requests growing with usage","total":5,"completed":0,"skipped":6,"failed":1,"failures":["[sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA with default recommender explicitly configured have cpu requests growing with usage"]}
{"msg":"FAILED [sig-autoscaling] [VPA] [full-vpa] [v1] OOMing pods under VPA have memory requests growing with OOMs","total":5,"completed":0,"skipped":13,"failed":2,"failures":["[sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA with default recommender explicitly configured have cpu requests growing with usage","[sig-autoscaling] [VPA] [full-vpa] [v1] OOMing pods under VPA have memory requests growing with OOMs"]}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA have memory requests growing with usage","total":5,"completed":1,"skipped":18,"failed":2,"failures":["[sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA with default recommender explicitly configured have cpu requests growing with usage","[sig-autoscaling] [VPA] [full-vpa] [v1] OOMing pods under VPA have memory requests growing with OOMs"]}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA with non-recognized recommender explicitly configured deployment not updated by non-recognized recommender","total":5,"completed":2,"skipped":19,"failed":2,"failures":["[sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA with default recommender explicitly configured have cpu requests growing with usage","[sig-autoscaling] [VPA] [full-vpa] [v1] OOMing pods under VPA have memory requests growing with OOMs"]}

[2]

• Failure [959.135 seconds]
[sig-autoscaling] [VPA] [full-vpa] [v1beta2] Pods under VPA
/usr/local/google/home/jbartosik/go/src/k8s.io/autoscaler/vertical-pod-autoscaler/e2e/v1beta2/common.go:72
  have cpu requests growing with usage [It]
  /usr/local/google/home/jbartosik/go/src/k8s.io/autoscaler/vertical-pod-autoscaler/e2e/v1beta2/full_vpa.go:94

  Oct 25 22:39:54.558: Unexpected error:
      <*errors.errorString | 0xc0005476a0>: {
          s: "error waiting for cpu request in range of ({{500 -3} {<nil>} 500m DecimalSI},{{1000 -3} {<nil>}  DecimalSI}) for pods: {TypeMeta:{Kind: APIVersion:} LabelSelector:name=hamster FieldSelector: Watch:false AllowWatchBookmarks:false ResourceVersion: ResourceVersionMatch: TimeoutSeconds:<nil> Limit:0 Continue:}",
      }
      error waiting for cpu request in range of ({{500 -3} {<nil>} 500m DecimalSI},{{1000 -3} {<nil>}  DecimalSI}) for pods: {TypeMeta:{Kind: APIVersion:} LabelSelector:name=hamster FieldSelector: Watch:false AllowWatchBookmarks:false ResourceVersion: ResourceVersionMatch: TimeoutSeconds:<nil> Limit:0 Continue:}
  occurred
[sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA with default recommender explicitly configured
/usr/local/google/home/jbartosik/go/src/k8s.io/autoscaler/vertical-pod-autoscaler/e2e/v1/common.go:72
  have cpu requests growing with usage [It]
  /usr/local/google/home/jbartosik/go/src/k8s.io/autoscaler/vertical-pod-autoscaler/e2e/v1/full_vpa.go:172

  Oct 25 22:55:58.782: Unexpected error:
      <*errors.errorString | 0xc0007799a0>: {
          s: "error waiting for cpu request in range of ({{500 -3} {<nil>} 500m DecimalSI},{{1000 -3} {<nil>}  DecimalSI}) for pods: {TypeMeta:{Kind: APIVersion:} LabelSelector:name=hamster FieldSelector: Watch:false AllowWatchBookmarks:false ResourceVersion: ResourceVersionMatch: TimeoutSeconds:<nil> Limit:0 Continue:}",
      }
      error waiting for cpu request in range of ({{500 -3} {<nil>} 500m DecimalSI},{{1000 -3} {<nil>}  DecimalSI}) for pods: {TypeMeta:{Kind: APIVersion:} LabelSelector:name=hamster FieldSelector: Watch:false AllowWatchBookmarks:false ResourceVersion: ResourceVersionMatch: TimeoutSeconds:<nil> Limit:0 Continue:}
  occurred
[sig-autoscaling] [VPA] [full-vpa] [v1] OOMing pods under VPA
/usr/local/google/home/jbartosik/go/src/k8s.io/autoscaler/vertical-pod-autoscaler/e2e/v1/common.go:72
  have memory requests growing with OOMs [It]
  /usr/local/google/home/jbartosik/go/src/k8s.io/autoscaler/vertical-pod-autoscaler/e2e/v1/full_vpa.go:283

  Oct 25 23:04:47.713: Unexpected error:
      <*errors.errorString | 0xc000cb8480>: {
          s: "error waiting for memory request in range of ({{1468006400 0} {<nil>}  BinarySI},{{10485760000 0} {<nil>}  BinarySI}) for pods: {TypeMeta:{Kind: APIVersion:} LabelSelector:name=hamster FieldSelector:status.phase!=Failed,status.phase!=Succeeded Watch:false AllowWatchBookmarks:false ResourceVersion: ResourceVersionMatch: TimeoutSeconds:<nil> Limit:0 Continue:}",
      }
      error waiting for memory request in range of ({{1468006400 0} {<nil>}  BinarySI},{{10485760000 0} {<nil>}  BinarySI}) for pods: {TypeMeta:{Kind: APIVersion:} LabelSelector:name=hamster FieldSelector:status.phase!=Failed,status.phase!=Succeeded Watch:false AllowWatchBookmarks:false ResourceVersion: ResourceVersionMatch: TimeoutSeconds:<nil> Limit:0 Continue:}
  occurred

  /usr/local/google/home/jbartosik/go/src/k8s.io/autoscaler/vertical-pod-autoscaler/e2e/v1/full_vpa.go:291
------------------------------
{"msg":"FAILED [sig-autoscaling] [VPA] [full-vpa] [v1] OOMing pods under VPA have memory requests growing with OOMs","total":5,"completed":0,"skipped":13,"failed":2,"failures":["[sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA with default recommender explicitly configured have cpu requests growing with usage","[sig-autoscaling] [VPA] [full-vpa] [v1] OOMing pods under VPA have memory requests growing with OOMs"]}

@jbartosik (Collaborator)

@wangchen615

To speed things up, there is one guess I have. If you're trying to add the "Pods under VPA with default recommender explicitly configured have cpu requests growing with usage" and "Pods under VPA with non-recognized recommender explicitly configured deployment not updated by non-recognized recommender" tests for the v1beta2 API, then I don't think they will work. This PR only extends the v1 API.

If this is the problem then:

  • I think we shouldn't change v1beta2 API but I'll check
  • So please let me know so I know that I should check,
  • If I'm correct about this then we can submit the PR with tests only for the v1 API
    • So for now please don't add those tests for v1beta2 API.

After I check

  • If I'm correct about us not needing to change v1beta2 API then I'll review new changes and this PR should be ready to submit soon.
  • If I'm wrong then we'll need to change v1beta2 API too.

If this is not the problem you're facing then please give me error messages you see and how you get them and I'll do my best to help.

Thank you.

@wangchen615 (Contributor Author)

wangchen615 commented Nov 1, 2021

@jbartosik, thanks so much for your advice. I did not add any tests to v1beta2. That's why I was confused.

Your comment on the cluster size helped a lot; it seems the cluster I was using had nodes that were too small to test VPA, and all tests eventually failed. Please see the full log here.

According to your comment, I created another GKE cluster (1.20.10-gke.301) with one large node (e2-standard-16), and then all tests passed.

grep \"msg\" alt_recommender_e2e.log
{"msg":"Test Suite starting","total":3,"completed":0,"skipped":0,"failed":0}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1beta2] Pods under VPA have memory requests growing with usage","total":3,"completed":1,"skipped":4,"failed":0}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1beta2] Pods under VPA have cpu requests growing with usage","total":3,"completed":2,"skipped":23,"failed":0}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1beta2] OOMing pods under VPA have memory requests growing with OOMs","total":3,"completed":3,"skipped":35,"failed":0}
{"msg":"Test Suite completed","total":3,"completed":3,"skipped":38,"failed":0}
{"msg":"Test Suite starting","total":5,"completed":0,"skipped":0,"failed":0}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA have cpu requests growing with usage","total":5,"completed":1,"skipped":13,"failed":0}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA with default recommender explicitly configured have cpu requests growing with usage","total":5,"completed":2,"skipped":26,"failed":0}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA with non-recognized recommender explicitly configured deployment not updated by non-recognized recommender","total":5,"completed":3,"skipped":27,"failed":0}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1] OOMing pods under VPA have memory requests growing with OOMs","total":5,"completed":4,"skipped":28,"failed":0}
{"msg":"PASSED [sig-autoscaling] [VPA] [full-vpa] [v1] Pods under VPA have memory requests growing with usage","total":5,"completed":5,"skipped":36,"failed":0}
{"msg":"Test Suite completed","total":5,"completed":5,"skipped":39,"failed":0}

More details of full log message can also be seen here.

I think we are ready to merge. Please take a final look and see if there are some last changes we want to make. Thanks.

@wangchen615

To speed things up, there is one guess I have. If you're trying to add the "Pods under VPA with default recommender explicitly configured have cpu requests growing with usage" and "Pods under VPA with non-recognized recommender explicitly configured deployment not updated by non-recognized recommender" tests for the v1beta2 API, then I don't think they will work. This PR only extends the v1 API.

If this is the problem then:

  • I think we shouldn't change v1beta2 API but I'll check

  • So please let me know so I know that I should check,

  • If I'm correct about this then we can submit the PR with tests only for the v1 API

    • So for now please don't add those tests for v1beta2 API.

After I check

  • If I'm correct about us not needing to change v1beta2 API then I'll review new changes and this PR should be ready to submit soon.
  • If I'm wrong then we'll need to change v1beta2 API too.

If this is not the problem you're facing then please give me error messages you see and how you get them and I'll do my best to help.

Thank you.

@jbartosik jbartosik mentioned this pull request Nov 10, 2021
@jbartosik (Collaborator) left a comment

Looks good to me. Please squash changes

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 15, 2021
@linux-foundation-easycla

linux-foundation-easycla bot commented Nov 15, 2021

CLA Signed

The committers are authorized under a signed CLA.

@wangchen615 (Contributor Author)

Looks good to me. Please squash changes

@jbartosik, thanks for the review. Yes, I did the rebase and squash.
The EasyCLA issue was due to this issue: kubernetes/org#3068
It should be merged automatically once you /lgtm and /approve.

@jbartosik (Collaborator) left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 16, 2021
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jbartosik, wangchen615

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/vertical-pod-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Need support for customized recommender in Vertical Pod Autoscaler (VPA)
6 participants