Commit b5a054e

sushrk, dependabot[bot], xdu31, haouc, and orsenthil authored
Merge branch eni-cleanup into master
* Call DisassociateTrunkInterface before deleting branch ENI (aws#372) * Call DisassociateTrunkInterface before deleting branch ENI * feat: Centralize leaked ENI cleanup (aws#374) * feat: centralized eni cleanup * Merge master into eni-cleanup (aws#385) * fix: paginate DescribeNetworkInterfaces with deep filters (aws#375) * fix: paginate DescribeNetworkInterfaces with deep filters * update metrics and address review comments * minor updates to address comments * Bump github.com/aws/aws-sdk-go from 1.49.13 to 1.50.29 (aws#380) Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.49.13 to 1.50.29. - [Release notes](https://github.com/aws/aws-sdk-go/releases) - [Commits](aws/aws-sdk-go@v1.49.13...v1.50.29) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump k8s.io/client-go from 0.29.1 to 0.29.2 (aws#377) Bumps [k8s.io/client-go](https://github.com/kubernetes/client-go) from 0.29.1 to 0.29.2. - [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md) - [Commits](kubernetes/client-go@v0.29.1...v0.29.2) --- updated-dependencies: - dependency-name: k8s.io/client-go dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump github.com/prometheus/common from 0.46.0 to 0.49.0 (aws#378) Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.46.0 to 0.49.0. 
- [Release notes](https://github.com/prometheus/common/releases) - [Commits](prometheus/common@v0.46.0...v0.49.0) --- updated-dependencies: - dependency-name: github.com/prometheus/common dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Repo controlled build go version (aws#381) * update golang version (aws#383) --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Jason Du <[email protected]> * fix:update cluster tag name in CNINode (aws#386) * fix:add node OS label in CNINode, retry get CNINode with backoff * update protobuf to 1.33.0 (aws#387) * add CNINode integration tests (aws#391) * use DescribeNetworkInterfaces with deep filters * add integration test to validate ec2 permissions * remove DisassociateAllBranchENIs as it is not useful (aws#400) * remove DisassociateAllBranchENIs as it is not useful * skip deletion success log for NotFound ENI * fix govulncheck * Merge master branch into eni-cleanup (aws#416) * fix: paginate DescribeNetworkInterfaces with deep filters (aws#375) * fix: paginate DescribeNetworkInterfaces with deep filters * update metrics and address review comments * minor updates to address comments * Bump github.com/aws/aws-sdk-go from 1.49.13 to 1.50.29 (aws#380) Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.49.13 to 1.50.29. - [Release notes](https://github.com/aws/aws-sdk-go/releases) - [Commits](aws/aws-sdk-go@v1.49.13...v1.50.29) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump k8s.io/client-go from 0.29.1 to 0.29.2 (aws#377) Bumps [k8s.io/client-go](https://github.com/kubernetes/client-go) from 0.29.1 to 0.29.2. - [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md) - [Commits](kubernetes/client-go@v0.29.1...v0.29.2) --- updated-dependencies: - dependency-name: k8s.io/client-go dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump github.com/prometheus/common from 0.46.0 to 0.49.0 (aws#378) Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.46.0 to 0.49.0. - [Release notes](https://github.com/prometheus/common/releases) - [Commits](prometheus/common@v0.46.0...v0.49.0) --- updated-dependencies: - dependency-name: github.com/prometheus/common dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Repo controlled build go version (aws#381) * update golang version (aws#383) * update protobuf to 1.33.0 (aws#387) * pin envtest version due to an upstream bug (aws#390) * Bump k8s.io/client-go from 0.29.2 to 0.29.3 (aws#392) Bumps [k8s.io/client-go](https://github.com/kubernetes/client-go) from 0.29.2 to 0.29.3. - [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md) - [Commits](kubernetes/client-go@v0.29.2...v0.29.3) --- updated-dependencies: - dependency-name: k8s.io/client-go dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump github.com/aws/amazon-vpc-cni-k8s from 1.16.0 to 1.17.1 (aws#393) Bumps [github.com/aws/amazon-vpc-cni-k8s](https://github.com/aws/amazon-vpc-cni-k8s) from 1.16.0 to 1.17.1. - [Release notes](https://github.com/aws/amazon-vpc-cni-k8s/releases) - [Changelog](https://github.com/aws/amazon-vpc-cni-k8s/blob/master/CHANGELOG.md) - [Commits](aws/amazon-vpc-cni-k8s@v1.16.0...v1.17.1) --- updated-dependencies: - dependency-name: github.com/aws/amazon-vpc-cni-k8s dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump github.com/prometheus/common from 0.49.0 to 0.51.1 (aws#395) Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.49.0 to 0.51.1. - [Release notes](https://github.com/prometheus/common/releases) - [Commits](prometheus/common@v0.49.0...v0.51.1) --- updated-dependencies: - dependency-name: github.com/prometheus/common dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump github.com/aws/aws-sdk-go from 1.50.29 to 1.51.12 (aws#397) Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.50.29 to 1.51.12. - [Release notes](https://github.com/aws/aws-sdk-go/releases) - [Commits](aws/aws-sdk-go@v1.50.29...v1.51.12) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add github action to run gosec static analysis (aws#398) * add github action to run gosec static analysis * install gosec * update golang and dependency to fix CVE (aws#401) * revert pagination and call DescribeNetworkInterfaces with vpcID or subnetID filter * Revert "fix: paginate DescribeNetworkInterfaces with deep filters (aws#375)" This reverts commit b5699de. * call DescribeNetworkInterfaces with vpcID or subnetID filter * update EC2 supported instance types (aws#402) * remove global exclusion for G108,G114 and add nosec in code (aws#404) * Update controller_auth_proxy_patch.yaml (aws#405) Update the reference from gcr.io to registry.k8s.io > kube-rbac-proxy is moving to registry.k8s.io/kubebuilder/kube-rbac-proxy (from gcr.io/kubebuilder/kube-rbac-proxy) because GCR is being sunset. We need to update these references. * Fix log which causes panic (aws#407) * Fix log which causes panic * Consistent key name * consistent naming * run go mod tidy --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Jason Du <[email protected]> Co-authored-by: Hao Zhou <[email protected]> Co-authored-by: Senthil Kumaran <[email protected]> Co-authored-by: Garvin Pang <[email protected]> --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Jason Du <[email protected]> Co-authored-by: Hao Zhou <[email protected]> Co-authored-by: Senthil Kumaran <[email protected]> Co-authored-by: Garvin Pang <[email protected]>
1 parent c80fd41 commit b5a054e

49 files changed, with 2097 additions and 610 deletions

PROJECT

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ resources:
   version: v1beta1
 - api:
     crdVersion: v1
+  controller: true
   domain: k8s.aws
   group: vpcresources
   kind: CNINode

apis/vpcresources/v1alpha1/cninode_types.go

Lines changed: 2 additions & 0 deletions
@@ -35,6 +35,8 @@ type Feature struct {
 // CNINodeSpec defines the desired state of CNINode
 type CNINodeSpec struct {
 	Features []Feature `json:"features,omitempty"`
+	// Additional tag key/value added to all network interfaces provisioned by the vpc-resource-controller and VPC-CNI
+	Tags map[string]string `json:"tags,omitempty"`
 }

 // CNINodeStatus defines the managed VPC resources.
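The new `Tags` field uses `omitempty`, so a CNINode without tags should serialize without a `tags` key. The following is a minimal sketch of that behavior using only the standard library; `cniNodeSpec`, `encodeSpec`, and the example tag key are hypothetical stand-ins for illustration and are not part of the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cniNodeSpec is a trimmed, hypothetical stand-in for CNINodeSpec that keeps
// only the new Tags field and the JSON tag shown in the diff.
type cniNodeSpec struct {
	Tags map[string]string `json:"tags,omitempty"`
}

// encodeSpec marshals the spec and returns the JSON text.
func encodeSpec(spec cniNodeSpec) string {
	out, err := json.Marshal(spec)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	// A nil (or empty) map is dropped entirely because of omitempty.
	fmt.Println(encodeSpec(cniNodeSpec{}))
	// A populated map is emitted under the "tags" key.
	fmt.Println(encodeSpec(cniNodeSpec{Tags: map[string]string{"example-key": "example-value"}}))
}
```

Because an unset map is nil, code that writes into `Spec.Tags` has to allocate the map first, which is what the reconciler's `len(...) != 0` branch guards against.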

apis/vpcresources/v1alpha1/zz_generated.deepcopy.go

Lines changed: 7 additions & 0 deletions
Some generated files are not rendered by default.

config/crd/bases/vpcresources.k8s.aws_cninodes.yaml

Lines changed: 6 additions & 0 deletions
@@ -56,6 +56,12 @@ spec:
                       type: string
                   type: object
                 type: array
+              tags:
+                additionalProperties:
+                  type: string
+                description: Additional tag key/value added to all network interfaces
+                  provisioned by the vpc-resource-controller and VPC-CNI
+                type: object
             type: object
           status:
             description: CNINodeStatus defines the managed VPC resources.

config/rbac/role.yaml

Lines changed: 2 additions & 0 deletions
@@ -61,6 +61,8 @@ rules:
   - create
   - get
   - list
+  - patch
+  - update
   - watch
 - apiGroups:
   - vpcresources.k8s.aws

controllers/core/node_controller.go

Lines changed: 2 additions & 10 deletions
@@ -20,6 +20,7 @@ import (

 	"github.com/aws/amazon-vpc-resource-controller-k8s/apis/vpcresources/v1alpha1"
 	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/condition"
+	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/config"
 	rcHealthz "github.com/aws/amazon-vpc-resource-controller-k8s/pkg/healthz"
 	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/k8s"
 	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/node/manager"
@@ -36,15 +37,6 @@ import (
 	"sigs.k8s.io/controller-runtime/pkg/healthz"
 )

-// MaxNodeConcurrentReconciles is the number of go routines that can invoke
-// Reconcile in parallel. Since Node Reconciler, performs local operation
-// on cache only a single go routine should be sufficient. Using more than
-// one routines to help high rate churn and larger nodes groups restarting
-// when the controller has to be restarted for various reasons.
-const (
-	MaxNodeConcurrentReconciles = 10
-)
-
 // NodeReconciler reconciles a Node object
 type NodeReconciler struct {
 	client.Client
@@ -117,7 +109,7 @@ func (r *NodeReconciler) SetupWithManager(mgr ctrl.Manager, healthzHandler *rcHe

 	return ctrl.NewControllerManagedBy(mgr).
 		For(&corev1.Node{}).
-		WithOptions(controller.Options{MaxConcurrentReconciles: MaxNodeConcurrentReconciles}).
+		WithOptions(controller.Options{MaxConcurrentReconciles: config.MaxNodeConcurrentReconciles}).
 		Owns(&v1alpha1.CNINode{}).
 		Complete(r)
 }
Lines changed: 252 additions & 0 deletions
@@ -0,0 +1,252 @@
+// Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License"). You may
+// not use this file except in compliance with the License. A copy of the
+// License is located at
+//
+// http://aws.amazon.com/apache2.0/
+//
+// or in the "license" file accompanying this file. This file is distributed
+// on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
+// express or implied. See the License for the specific language governing
+// permissions and limitations under the License.
+
+package crds
+
+import (
+	"context"
+	"time"
+
+	"github.com/aws/amazon-vpc-resource-controller-k8s/apis/vpcresources/v1alpha1"
+	ec2API "github.com/aws/amazon-vpc-resource-controller-k8s/pkg/aws/ec2/api"
+	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/aws/ec2/api/cleanup"
+	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/config"
+	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/k8s"
+	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/utils"
+	"github.com/go-logr/logr"
+	"github.com/prometheus/client_golang/prometheus"
+	v1 "k8s.io/api/core/v1"
+	"k8s.io/apimachinery/pkg/api/errors"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+	"k8s.io/apimachinery/pkg/runtime"
+	"k8s.io/apimachinery/pkg/types"
+	"k8s.io/apimachinery/pkg/util/wait"
+	"k8s.io/client-go/util/retry"
+	ctrl "sigs.k8s.io/controller-runtime"
+	"sigs.k8s.io/controller-runtime/pkg/client"
+	"sigs.k8s.io/controller-runtime/pkg/controller"
+	"sigs.k8s.io/controller-runtime/pkg/metrics"
+)
+
+var (
+	prometheusRegistered     = false
+	recreateCNINodeCallCount = prometheus.NewCounter(
+		prometheus.CounterOpts{
+			Name: "recreate_cniNode_call_count",
+			Help: "The number of requests made by controller to recreate CNINode when node exists",
+		},
+	)
+	recreateCNINodeErrCount = prometheus.NewCounter(
+		prometheus.CounterOpts{
+			Name: "recreate_cniNode_err_count",
+			Help: "The number of requests that failed when controller tried to recreate the CNINode",
+		},
+	)
+)
+
+func prometheusRegister() {
+	prometheusRegistered = true
+
+	metrics.Registry.MustRegister(
+		recreateCNINodeCallCount,
+		recreateCNINodeErrCount)
+
+	prometheusRegistered = true
+}
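The `prometheusRegistered` flag exists so that the counters are registered only once, since `MustRegister` panics on duplicate registration. A common alternative for one-time initialization is `sync.Once`; the sketch below illustrates that pattern and is not what the commit does. The names `registerOnce`, `registerCalls`, and `registerMetrics` are hypothetical, and the actual registry call is replaced by a counter so the example runs with the standard library alone.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	registerOnce  sync.Once
	registerCalls int // counts how often the registration body actually ran
)

// registerMetrics runs the registration body at most once, no matter how
// many times (or from how many goroutines) it is called.
func registerMetrics() {
	registerOnce.Do(func() {
		// Assumption: a real controller would register its collectors here.
		registerCalls++
	})
}

func main() {
	for i := 0; i < 3; i++ {
		registerMetrics()
	}
	fmt.Println("registration body ran", registerCalls, "time(s)")
}
```

Unlike a plain boolean, `sync.Once` is safe under concurrent callers, which may or may not matter depending on how the controller's setup code is invoked.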
+
+// CNINodeReconciler reconciles a CNINode object
+type CNINodeReconciler struct {
+	client.Client
+	Scheme           *runtime.Scheme
+	Context          context.Context
+	Log              logr.Logger
+	EC2Wrapper       ec2API.EC2Wrapper
+	K8sAPI           k8s.K8sWrapper
+	ClusterName      string
+	VPCID            string
+	FinalizerManager k8s.FinalizerManager
+}
+
+//+kubebuilder:rbac:groups=vpcresources.k8s.aws,resources=cninodes,verbs=get;list;watch;create;update;patch;
+
+// Reconcile handles CNINode create/update/delete events
+// Reconciler will add the finalizer and cluster name tag if it does not exist and finalize on CNINode on deletion to clean up leaked resource on node
+func (r *CNINodeReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
+	cniNode := &v1alpha1.CNINode{}
+	if err := r.Client.Get(ctx, req.NamespacedName, cniNode); err != nil {
+		if errors.IsNotFound(err) {
+			r.Log.Info("CNINode is deleted", "CNINode", req.NamespacedName)
+		}
+		// Ignore not found error
+		return ctrl.Result{}, client.IgnoreNotFound(err)
+	}
+
+	nodeFound := true
+	node := &v1.Node{}
+	if err := r.Client.Get(ctx, req.NamespacedName, node); err != nil {
+		if errors.IsNotFound(err) {
+			nodeFound = false
+		} else {
+			r.Log.Error(err, "failed to get the node object in CNINode reconciliation, will retry")
+			// Requeue request so it can be retried
+			return ctrl.Result{}, err
+		}
+	}
+
+	if cniNode.GetDeletionTimestamp().IsZero() {
+		shouldPatch := false
+		cniNodeCopy := cniNode.DeepCopy()
+		// Add cluster name tag if it does not exist
+		val, ok := cniNode.Spec.Tags[config.CNINodeClusterNameKey]
+		if !ok || val != r.ClusterName {
+			if len(cniNodeCopy.Spec.Tags) != 0 {
+				cniNodeCopy.Spec.Tags[config.CNINodeClusterNameKey] = r.ClusterName
+			} else {
+				cniNodeCopy.Spec.Tags = map[string]string{
+					config.CNINodeClusterNameKey: r.ClusterName,
+				}
+			}
+			shouldPatch = true
+		}
+		// if node exists, get & add OS label if it does not exist on CNINode
+		if nodeFound {
+			nodeLabelOS := node.ObjectMeta.Labels[config.NodeLabelOS]
+			val, ok = cniNode.ObjectMeta.Labels[config.NodeLabelOS]
+			if !ok || val != nodeLabelOS {
+				if len(cniNodeCopy.ObjectMeta.Labels) != 0 {
+					cniNodeCopy.ObjectMeta.Labels[config.NodeLabelOS] = nodeLabelOS
+				} else {
+					cniNodeCopy.ObjectMeta.Labels = map[string]string{
+						config.NodeLabelOS: nodeLabelOS,
+					}
+				}
+				shouldPatch = true
+			}
+		}
+
+		if shouldPatch {
+			r.Log.Info("patching CNINode to add required fields Tags and Labels", "cninode", cniNode.Name)
+			return ctrl.Result{}, r.Client.Patch(ctx, cniNodeCopy, client.MergeFromWithOptions(cniNode, client.MergeFromWithOptimisticLock{}))
+		}
+
+		// Add finalizer if it does not exist
+		if err := r.FinalizerManager.AddFinalizers(ctx, cniNode, config.NodeTerminationFinalizer); err != nil {
+			r.Log.Error(err, "failed to add finalizer on CNINode, will retry", "cniNode", cniNode.Name, "finalizer", config.NodeTerminationFinalizer)
+			return ctrl.Result{}, err
+		}
+		return ctrl.Result{}, nil
+
+	} else { // CNINode is marked for deletion
+		if !nodeFound {
+			// node is also deleted, proceed with running the cleanup routine and remove the finalizer
+
+			// run cleanup for Linux nodes only
+			if val, ok := cniNode.ObjectMeta.Labels[config.NodeLabelOS]; ok && val == config.OSLinux {
+				r.Log.Info("running the finalizer routine on cniNode", "cniNode", cniNode.Name)
+				cleaner := &cleanup.NodeTerminationCleaner{
+					NodeName: cniNode.Name,
+				}
+				cleaner.ENICleaner = &cleanup.ENICleaner{
+					EC2Wrapper: r.EC2Wrapper,
+					Manager:    cleaner,
+					VPCID:      r.VPCID,
+					Log:        ctrl.Log.WithName("eniCleaner").WithName("node"),
+				}
+
+				if err := cleaner.DeleteLeakedResources(); err != nil {
+					r.Log.Error(err, "failed to cleanup resources during node termination, request will be requeued")
+					// Return err if failed to delete leaked ENIs on node so it can be retried
+					return ctrl.Result{}, err
+				}
+			}
+
+			if err := r.FinalizerManager.RemoveFinalizers(ctx, cniNode, config.NodeTerminationFinalizer); err != nil {
+				r.Log.Error(err, "failed to remove finalizer on CNINode, will retry", "cniNode", cniNode.Name, "finalizer", config.NodeTerminationFinalizer)
+				return ctrl.Result{}, err
+			}
+			return ctrl.Result{}, nil
+		} else {
+			// node exists, do not run the cleanup routine(periodic cleanup routine will anyway delete leaked ENIs), remove the finalizer
+			// to proceed with object deletion, and recreate similar object
+
+			// Create a copy without deletion timestamp for creation
+			newCNINode := &v1alpha1.CNINode{
+				ObjectMeta: metav1.ObjectMeta{
+					Name:            cniNode.Name,
+					Namespace:       "",
+					OwnerReferences: cniNode.OwnerReferences,
+					// TODO: should we include finalizers at object creation or let controller patch it on Create/Update event?
+					Finalizers: cniNode.Finalizers,
+				},
+				Spec: cniNode.Spec,
+			}
+
+			if err := r.FinalizerManager.RemoveFinalizers(ctx, cniNode, config.NodeTerminationFinalizer); err != nil {
+				r.Log.Error(err, "failed to remove finalizer on CNINode on node deletion, will retry")
+				return ctrl.Result{}, err
+			}
+			// wait till CNINode is deleted before recreation as the new object will be created with same name to avoid "object already exists" error
+			if err := r.waitTillCNINodeDeleted(k8s.NamespacedName(newCNINode)); err != nil {
+				// raise event if CNINode could not be deleted after removing the finalizer
+				r.K8sAPI.BroadcastEvent(cniNode, utils.CNINodeDeleteFailed, "CNINode deletion failed and object could not be recreated by the vpc-resource-controller, will retry",
+					v1.EventTypeWarning)
+				// requeue here to check if CNINode deletion is successful and retry CNINode deletion if node exists
+				return ctrl.Result{}, err
+			}
+
+			r.Log.Info("creating CNINode after it has been deleted as node still exists", "cniNode", newCNINode.Name)
+			recreateCNINodeCallCount.Inc()
+			if err := r.createCNINodeFromObj(ctx, newCNINode); err != nil {
+				recreateCNINodeErrCount.Inc()
+				// raise event on node publish warning that CNINode is deleted and could not be recreated by controller
+				utils.SendNodeEventWithNodeName(r.K8sAPI, node.Name, utils.CNINodeCreateFailed,
+					"CNINode was deleted and failed to be recreated by the vpc-resource-controller", v1.EventTypeWarning, r.Log)
+				// return nil as deleted and we cannot recreate the object now
+				return ctrl.Result{}, nil
+			}
+			r.Log.Info("successfully recreated CNINode", "cniNode", newCNINode.Name)
+		}
+	}
+	return ctrl.Result{}, nil
+}
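The branching in `Reconcile` depends on three inputs: whether the CNINode has a deletion timestamp, whether the matching Node still exists, and the OS label on the CNINode. The sketch below restates that decision as a pure function to make the four outcomes easier to see. The `action` type, its constants, and `decide` are hypothetical names for illustration, and the value `"linux"` for `config.OSLinux` is an assumption that the diff does not show.

```go
package main

import "fmt"

// action names one of the outcomes of the reconcile branches (hypothetical).
type action string

const (
	actionPatchOrAddFinalizer action = "patch-tags-labels-or-add-finalizer"
	actionCleanupAndFinalize  action = "cleanup-enis-then-remove-finalizer"
	actionFinalizeOnly        action = "remove-finalizer"
	actionRecreate            action = "remove-finalizer-then-recreate"
)

// decide mirrors the if/else structure of Reconcile without touching any API.
func decide(beingDeleted, nodeFound bool, osLabel string) action {
	switch {
	case !beingDeleted:
		// Live object: ensure cluster tag, OS label, and finalizer.
		return actionPatchOrAddFinalizer
	case nodeFound:
		// CNINode deleted while the Node still exists: recreate it.
		return actionRecreate
	case osLabel == "linux": // assumption: stands in for config.OSLinux
		// Node is gone and was Linux: clean up leaked ENIs first.
		return actionCleanupAndFinalize
	default:
		// Node is gone and not labeled Linux: just release the object.
		return actionFinalizeOnly
	}
}

func main() {
	fmt.Println(decide(false, true, "linux"))
	fmt.Println(decide(true, true, "linux"))
	fmt.Println(decide(true, false, "linux"))
	fmt.Println(decide(true, false, "windows"))
}
```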
+
+// SetupWithManager sets up the controller with the Manager.
+func (r *CNINodeReconciler) SetupWithManager(mgr ctrl.Manager) error {
+	if !prometheusRegistered {
+		prometheusRegister()
+	}
+	return ctrl.NewControllerManagedBy(mgr).
+		For(&v1alpha1.CNINode{}).
+		WithOptions(controller.Options{MaxConcurrentReconciles: config.MaxNodeConcurrentReconciles}).
+		Complete(r)
+}
+
+// waitTillCNINodeDeleted waits for CNINode to be deleted with timeout and returns error
+func (r *CNINodeReconciler) waitTillCNINodeDeleted(nameSpacedCNINode types.NamespacedName) error {
+	oldCNINode := &v1alpha1.CNINode{}
+
+	return wait.PollUntilContextTimeout(context.TODO(), 500*time.Millisecond, time.Second*3, true, func(ctx context.Context) (bool, error) {
+		if err := r.Client.Get(ctx, nameSpacedCNINode, oldCNINode); err != nil && errors.IsNotFound(err) {
+			return true, nil
+		}
+		return false, nil
+	})
+}
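`waitTillCNINodeDeleted` polls every 500 ms for up to 3 s, checking immediately on the first iteration, until the object can no longer be fetched. As a rough illustration of that poll-with-timeout shape, here is a standard-library sketch; `pollUntilGone` and its `exists` callback are hypothetical and only approximate what `wait.PollUntilContextTimeout` does.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pollUntilGone calls exists immediately and then once per interval until it
// reports false (success) or the timeout elapses (context error).
func pollUntilGone(interval, timeout time.Duration, exists func() bool) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		if !exists() {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			// Interval elapsed; loop around and check again.
		}
	}
}

func main() {
	// Simulate an object that disappears on the third check.
	remaining := 3
	err := pollUntilGone(10*time.Millisecond, time.Second, func() bool {
		remaining--
		return remaining > 0
	})
	fmt.Println("deleted within timeout:", err == nil)
}
```

In the commit, any `Get` error other than NotFound is treated as "still present" and polling continues, so only the timeout surfaces as an error to the caller.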
+
+// createCNINodeFromObj will create CNINode with backoff and returns error if CNINode is not recreated
+func (r *CNINodeReconciler) createCNINodeFromObj(ctx context.Context, newCNINode client.Object) error {
+	return retry.OnError(retry.DefaultBackoff, func(error) bool { return true },
+		func() error {
+			return r.Client.Create(ctx, newCNINode)
+		})
+}
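`createCNINodeFromObj` retries the create call on every error, using the backoff schedule from `retry.DefaultBackoff`. The sketch below shows the general retry-with-exponential-backoff idea using only the standard library. `retryWithBackoff`, `errTransient`, and the step, delay, and factor values are hypothetical and are not claimed to match the client-go defaults.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a hypothetical error used to simulate a failed create call.
var errTransient = errors.New("transient create failure")

// retryWithBackoff calls fn up to steps times (at least once), sleeping
// between attempts and multiplying the delay by factor after each failure.
// It returns nil on the first success, otherwise the last error seen.
func retryWithBackoff(steps int, initial time.Duration, factor float64, fn func() error) error {
	if steps < 1 {
		steps = 1
	}
	var err error
	delay := initial
	for attempt := 0; attempt < steps; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt < steps-1 {
			time.Sleep(delay)
			delay = time.Duration(float64(delay) * factor)
		}
	}
	return err
}

func main() {
	attempts := 0
	err := retryWithBackoff(4, time.Millisecond, 2.0, func() error {
		attempts++
		if attempts < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println("attempts:", attempts, "err:", err)
}
```

Retrying on every error, as the commit does, keeps the helper simple; a stricter predicate could skip retries for errors that will not resolve on their own.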
