Merged master into release-1.19 with master versions for all conflicts #3256

Merged 63 commits, Apr 15, 2025
Commits
310c784
Update to Changelog, config and scripts. (#3095) (#3107)
jaydeokar Nov 7, 2024
f080418
Update NP strict mode doc (#3125)
Pavani-Panakanti Nov 27, 2024
64748b4
adding email to send log bundle (#3134)
yash97 Dec 2, 2024
5daa885
Fix issues handling unmanaged ENIs with IPv6 only (#3122)
gavinbunney Dec 3, 2024
a9c972d
Bump go.uber.org/zap from 1.26.0 to 1.27.0
dependabot[bot] Dec 1, 2024
53f925d
Bump github.com/stretchr/testify from 1.9.0 to 1.10.0
dependabot[bot] Dec 1, 2024
5acb6f3
Bump github.com/onsi/gomega from 1.35.1 to 1.36.0
dependabot[bot] Dec 1, 2024
1b631d2
Bump github.com/prometheus/common from 0.60.0 to 0.60.1
dependabot[bot] Dec 1, 2024
04f0646
Update changelog from release-1.19 branch to master branch. (#3136)
orsenthil Dec 4, 2024
f617e68
Bump github.com/onsi/ginkgo/v2 from 2.20.1 to 2.22.0
dependabot[bot] Dec 4, 2024
2aa2944
Bump golang.org/x/sys from 0.26.0 to 0.27.0 in /test/agent
dependabot[bot] Dec 4, 2024
d64b8b4
Bump golang.org/x/sys from 0.27.0 to 0.28.0 in /test/agent
dependabot[bot] Dec 4, 2024
fb6d231
Fix KOps Integration Test (#3140)
dshehbaj Dec 7, 2024
2aea0fd
run make generate-limits to update the max pods file (#3141)
tzneal Dec 8, 2024
8dd2a5a
Update AWS VPC CNI to SDK V2 Update - master branch (#3070)
orsenthil Dec 9, 2024
5bcc561
Handle EKS Service for the Beta Endpoint. (#3143)
orsenthil Dec 10, 2024
ec4f86d
Adding multus v4.1.4 manifest (#3154)
jaydeokar Dec 13, 2024
2a63452
scripts integration: capture exit codes from both tests (#3149)
dshehbaj Dec 20, 2024
4ee9789
fix(test): add volume mount for docker-func-test target (#3160)
omerap12 Dec 25, 2024
cc14878
cni-metrics-helper metrics: do type assertion before type casting (#3…
dshehbaj Jan 2, 2025
235fa2a
Bump helm.sh/helm/v3 from 3.15.2 to 3.16.4
dependabot[bot] Jan 1, 2025
c88cb2c
Bump github.com/aws/aws-sdk-go-v2/service/autoscaling
dependabot[bot] Jan 1, 2025
66bd42b
Bump github.com/aws/aws-sdk-go-v2/service/iam from 1.38.1 to 1.38.3
dependabot[bot] Jan 1, 2025
57c1b38
Update Changelog and Version for CNI 1.19.2 (#3171)
orsenthil Jan 4, 2025
6f477a3
Bump github.com/aws/aws-sdk-go-v2/feature/ec2/imds (#3166)
dependabot[bot] Jan 4, 2025
f4b0a78
Add CNINode to cache filter (#3164)
dims Jan 5, 2025
94c4a15
fix: remove null creationTimestamp from CRD metadata (#3163)
omerap12 Jan 6, 2025
acc76bb
Fix issue with primary ENI ip lookup when an ENI has both IPv4 and IP…
orsenthil Jan 18, 2025
71eea69
Use awshttp client instead of smithy httpclient. (#3193)
orsenthil Feb 8, 2025
5b69f3e
retryOnConflict shouldnt' retry on NotFound (#3192)
haouc Feb 9, 2025
5eefbeb
Update awsutils.go (#3191)
git4example Feb 10, 2025
825978c
Bump github.com/aws/aws-sdk-go-v2/service/cloudwatch
dependabot[bot] Feb 1, 2025
48fb004
Bump github.com/aws/aws-sdk-go-v2/service/autoscaling
dependabot[bot] Feb 1, 2025
09742d7
Bump github.com/prometheus/common from 0.60.1 to 0.62.0
dependabot[bot] Feb 1, 2025
740c712
Bump golang.org/x/sys from 0.28.0 to 0.29.0 in /test/agent
dependabot[bot] Feb 1, 2025
3bf80b7
Bump golang.org/x/sys from 0.29.0 to 0.30.0 in /test/agent (#3198)
dependabot[bot] Feb 10, 2025
9f81995
Bump github.com/aws/aws-sdk-go-v2/service/cloudwatch (#3199)
dependabot[bot] Feb 11, 2025
0dc2b6b
Bump github.com/aws/aws-sdk-go-v2/service/autoscaling
dependabot[bot] Feb 11, 2025
e91a876
Bump github.com/samber/lo from 1.39.0 to 1.49.1 (#3184)
dependabot[bot] Feb 11, 2025
dce8a9c
Bump github.com/aws/aws-sdk-go-v2/service/eks from 1.52.1 to 1.58.0 (…
dependabot[bot] Feb 11, 2025
7e3950f
Add grpc call to fetch networkpolicymode from NP (#3202)
Pavani-Panakanti Feb 18, 2025
92a09cf
Changes to attach probes at pod start
Pavani-Panakanti Feb 3, 2025
f4f7a8f
minor error change
Pavani-Panakanti Feb 6, 2025
225fe1d
do not ret error on grpc dial
Pavani-Panakanti Feb 17, 2025
df21645
add dial with context
Pavani-Panakanti Feb 18, 2025
af32e99
update mocked grpc wrapper and unit tests
haouc Feb 18, 2025
28c99c9
improvement: add podmonitor for vpc metric collection (#3061)
aburan28 Feb 20, 2025
0988cdd
Fix print the error message in string instead of bytes. (#3208)
orsenthil Feb 21, 2025
be15077
update np standard mode doc (#3211)
Pavani-Panakanti Feb 21, 2025
0dccb22
config multus: add v4.1.4-eksbuild.3 (#3217)
dshehbaj Feb 28, 2025
7b288b8
update helm chart to ensure that created eniconfig name is always a s…
cheeseandcereal Mar 8, 2025
2532faf
Bump github.com/containerd/containerd from 1.7.23 to 1.7.27
dependabot[bot] Mar 18, 2025
ef32333
adding eni owner tag if cluster name is present (#3228)
yash97 Mar 20, 2025
0e8092b
only cache CNINode when SGP is in use (#3242)
oliviassss Mar 22, 2025
419eac5
Remove dependency on apiserver for IPAMD startup (#3243)
oliviassss Apr 1, 2025
852608e
Bump github.com/onsi/gomega from 1.36.0 to 1.36.2
dependabot[bot] Mar 19, 2025
de312d0
Bump golang.org/x/sys from 0.30.0 to 0.31.0 in /test/agent
dependabot[bot] Apr 1, 2025
d084d48
Skip configuring NP related if network_policy_enforcing_mode is not s…
Pavani-Panakanti Apr 14, 2025
ef246fc
Merged master into release-1.19 with master versions for all conflicts
oliviassss Apr 14, 2025
47106cf
remove unneeded metricsBindPort from charts (#3257)
oliviassss Apr 15, 2025
ee03c25
Merge branch 'master' into release-1.19
oliviassss Apr 15, 2025
4166a0f
bump up go version (#3259)
oliviassss Apr 15, 2025
5e682f1
Merge branch 'master' into release-1.19
oliviassss Apr 15, 2025
2 changes: 1 addition & 1 deletion .go-version
@@ -1 +1 @@
-1.22.12
+1.24.2
5 changes: 5 additions & 0 deletions README.md
@@ -743,6 +743,11 @@ Default: `standard`

Network Policy agent now supports two modes for Network Policy enforcement - Strict and Standard. By default, the Amazon VPC CNI plugin for Kubernetes configures network policies for pods in parallel with the pod provisioning. In the `standard` mode, until all of the policies are configured for the new pod, containers in the new pod will start with a default allow policy. A default allow policy means that all ingress and egress traffic is allowed to and from the new pods. However, in the `strict` mode, a new pod will start with a default deny policy and all Egress and Ingress connections will be blocked till Network Policies are configured. In Strict Mode, you must have a network policy defined for every pod in your cluster. Host Networking pods are exempted from this requirement.

In standard mode, return traffic is always allowed for any packets that were initially sent under the default allow policy. However, once network policies are applied, the next outgoing packet will be evaluated against the active policies, and it will be allowed or denied accordingly.

If you remove the Network Policy Agent container from the aws-node DaemonSet, you must also ensure that the NETWORK_POLICY_ENFORCING_MODE environment variable is not set.
Setting this value while the NP agent is absent can lead to failures during pod creation.
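
As a rough illustration of the two enforcement modes described in this README section, the Go sketch below models what happens to a new pod's traffic before and after its policies are configured. The `Verdict` type, the `verdictFor` function, and its parameters are hypothetical names chosen for this example; they are not taken from the Network Policy agent.

```go
package main

import "fmt"

// Verdict is a hypothetical type for this sketch; it is not part of the
// VPC CNI or Network Policy agent code.
type Verdict string

const (
	Allow    Verdict = "allow"
	Deny     Verdict = "deny"
	Evaluate Verdict = "evaluate-against-policies"
)

// verdictFor models the README text: until policies are configured, standard
// mode starts with default allow and strict mode starts with default deny.
// Once policies are configured, traffic is evaluated against them.
func verdictFor(mode string, policiesConfigured bool) Verdict {
	if policiesConfigured {
		return Evaluate
	}
	if mode == "strict" {
		return Deny
	}
	return Allow // "standard" is the documented default
}

func main() {
	for _, mode := range []string{"standard", "strict"} {
		fmt.Printf("%s, before policies are configured: %s\n", mode, verdictFor(mode, false))
		fmt.Printf("%s, after policies are configured: %s\n", mode, verdictFor(mode, true))
	}
}
```

The sketch deliberately leaves out the host-networking exemption and the return-traffic behavior, since the README does not spell out how those are implemented.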

### VPC CNI Feature Matrix


2 changes: 1 addition & 1 deletion charts/aws-vpc-cni/README.md
@@ -69,7 +69,7 @@ The following table lists the configurable parameters for this chart and their d
| `originalMatchLabels` | Use the original daemonset matchLabels | `false` |
| `nameOverride` | Override the name of the chart | `aws-node` |
| `nodeAgent.enabled` | If the Node Agent container should be created | `true` |
-| `nodeAgent.image.tag` | Image tag for Node Agent | `v1.1.6` |
+| `nodeAgent.image.tag` | Image tag for Node Agent | `v1.1.5` |
| `nodeAgent.image.domain`| ECR repository domain | `amazonaws.com` |
| `nodeAgent.image.region`| ECR repository region to use. Should match your cluster | `us-west-2` |
| `nodeAgent.image.endpoint` | ECR repository endpoint to use. | `ecr` |
8 changes: 8 additions & 0 deletions charts/aws-vpc-cni/templates/daemonset.yaml
@@ -81,8 +81,13 @@ spec:
timeoutSeconds: {{ .Values.readinessProbeTimeoutSeconds }}
env:
{{- range $key, $value := .Values.env }}
{{- $skipKey := and (eq $key "NETWORK_POLICY_ENFORCING_MODE") (not $.Values.nodeAgent.enabled) }}
{{- if not $skipKey }}
- name: {{ $key }}
value: {{ $value | quote }}
{{- else }}
# Skipping NETWORK_POLICY_ENFORCING_MODE because nodeAgent is disabled
{{- end }}
{{- end }}
{{- with .Values.extraEnv }}
{{- toYaml .| nindent 12 }}
@@ -128,6 +133,9 @@ spec:
- name: aws-eks-nodeagent
image: {{ include "aws-vpc-cni.nodeAgentImage" . }}
imagePullPolicy: {{ .Values.nodeAgent.image.pullPolicy }}
ports:
- containerPort: {{ .Values.nodeAgent.metricsBindAddr}}
name: agentmetrics
env:
- name: MY_NODE_NAME
valueFrom:
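
The `$skipKey` conditional added to the `env` loop in this template can be tried outside Helm. The sketch below renders a cut-down version of that loop with Go's standard `text/template` package. The data shape (`.Env`, `.NodeAgentEnabled`), the `values` struct, and the helpers `renderEnv` and `hasEnv` are assumptions made for this example, and `printf "%q"` stands in for Helm's `quote` function, which plain `text/template` does not provide.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// envTmpl mirrors the conditional from the chart template. The field names
// are hypothetical; the real chart reads .Values.env and .Values.nodeAgent.enabled.
const envTmpl = `{{- range $key, $value := .Env }}
{{- $skipKey := and (eq $key "NETWORK_POLICY_ENFORCING_MODE") (not $.NodeAgentEnabled) }}
{{- if not $skipKey }}
- name: {{ $key }}
  value: {{ printf "%q" $value }}
{{- end }}
{{- end }}
`

// values is a hypothetical stand-in for the relevant chart values.
type values struct {
	Env              map[string]string
	NodeAgentEnabled bool
}

// renderEnv executes the template and returns the rendered env entries.
func renderEnv(v values) (string, error) {
	t, err := template.New("env").Parse(envTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, v); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// hasEnv reports whether the rendered output contains an entry for key.
func hasEnv(rendered, key string) bool {
	return strings.Contains(rendered, "- name: "+key+"\n")
}

func main() {
	env := map[string]string{
		"NETWORK_POLICY_ENFORCING_MODE": "standard",
		"WARM_ENI_TARGET":               "1",
	}
	for _, enabled := range []bool{true, false} {
		out, err := renderEnv(values{Env: env, NodeAgentEnabled: enabled})
		if err != nil {
			fmt.Println("render failed:", err)
			return
		}
		fmt.Printf("nodeAgent enabled=%v, mode variable rendered=%v\n",
			enabled, hasEnv(out, "NETWORK_POLICY_ENFORCING_MODE"))
	}
}
```

Inside `range`, `$` still refers to the root data, which is why the template reads `$.NodeAgentEnabled` rather than `.NodeAgentEnabled`; the chart uses `$.Values.nodeAgent.enabled` for the same reason.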
2 changes: 1 addition & 1 deletion charts/aws-vpc-cni/templates/eniconfig.yaml
@@ -3,7 +3,7 @@
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
-name: {{ $key }}
+name: "{{ $key }}"
spec:
{{- if $value.securityGroups }}
securityGroups:
40 changes: 40 additions & 0 deletions charts/aws-vpc-cni/templates/podmonitor.yaml
@@ -0,0 +1,40 @@
{{- if .Values.podMonitor.create }}
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
name: {{ include "aws-vpc-cni.fullname" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- with .Values.podMonitor.labels }}
{{- toYaml . | nindent 4 }}
{{- end }}
{{- with .Values.podMonitor.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
jobLabel: {{ include "aws-vpc-cni.fullname" . }}
namespaceSelector:
matchNames:
- {{ .Release.Namespace }}
podMetricsEndpoints:
- interval: {{ .Values.podMonitor.interval }}
path: /metrics
port: metrics
{{- with .Values.podMonitor.relabelings }}
relabelings:
{{- toYaml . | nindent 6 }}
{{- end }}
{{- if .Values.nodeAgent.enabled }}
- interval: {{ .Values.podMonitor.interval }}
path: /metrics
port: agentmetrics
{{- with .Values.podMonitor.relabelings }}
relabelings:
{{- toYaml . | nindent 6 }}
{{- end }}
{{- end }}
selector:
matchLabels:
k8s-app: aws-node
{{- end }}
16 changes: 15 additions & 1 deletion charts/aws-vpc-cni/values.yaml
@@ -27,7 +27,7 @@ init:
nodeAgent:
enabled: true
image:
-tag: v1.1.6
+tag: v1.1.5
domain: amazonaws.com
region: us-west-2
endpoint: ecr
@@ -231,3 +231,17 @@ eniConfig:
# id: subnet-789
# securityGroups:
# - sg-789

podMonitor:
# Create Prometheus podMonitor
create: false
# Annotations to add to the Prometheus podMonitor
annotations: {}
# Labels to add to the Prometheus podMonitor
labels: {}
# The interval to scrape metrics.
interval: 30s
# The timeout before a metrics scrape fails.
scrapeTimeout: 30s
# relabelings to apply to the podMonitor
relabelings: []
86 changes: 76 additions & 10 deletions cmd/aws-k8s-agent/main.go
@@ -16,6 +16,7 @@ package main

import (
"os"
"time"

"github.com/aws/amazon-vpc-cni-k8s/pkg/ipamd"
"github.com/aws/amazon-vpc-cni-k8s/pkg/k8sapi"
@@ -24,6 +25,7 @@ import (
"github.com/aws/amazon-vpc-cni-k8s/pkg/version"
"github.com/aws/amazon-vpc-cni-k8s/utils"
metrics "github.com/aws/amazon-vpc-cni-k8s/utils/prometheusmetrics"
"k8s.io/client-go/kubernetes"
)

const (
@@ -36,44 +38,108 @@ const (

// Environment variable to disable the IPAMD introspection endpoint on 61679
envDisableIntrospection = "DISABLE_INTROSPECTION"

restCfgTimeout = 5 * time.Second
pollInterval = 5 * time.Second
pollTimeout = 30 * time.Second
)

func main() {
os.Exit(_main())
}

// startBackgroundAPIServerCheck checks API connectivity in the background
func startBackgroundAPIServerCheck(ipamContext *ipamd.IPAMContext) {
go func() {
log := logger.Get()
log.Info("Starting background API server connectivity check...")

// Create a new client for API server check
restCfg, err := k8sapi.GetRestConfig()
if err != nil {
log.Errorf("Failed to get REST config for background API check: %v", err)
return
}
restCfg.Timeout = restCfgTimeout
clientSet, err := kubernetes.NewForConfig(restCfg)
if err != nil {
log.Errorf("Failed to create k8s client for background API check: %v", err)
return
}

// Keep checking until connection is established
for {
version, err := clientSet.Discovery().ServerVersion()
if err == nil {
log.Infof("API server connectivity established in background! Cluster Version is: %s", version.GitVersion)

// Update IPAM context with new API server connectivity
ipamContext.SetAPIServerConnectivity(true)

// Exit the goroutine after successful connection
log.Info("Background API server check completed successfully")
return
}

log.Debugf("Still waiting for API server connectivity in background: %v", err)
time.Sleep(pollInterval)
}
}()
}

func _main() int {
// Do not add anything before initializing logger
log := logger.Get()

log.Infof("Starting L-IPAMD %s ...", version.Version)
version.RegisterMetric()

enabledPodEni := ipamd.EnablePodENI()
enabledCustomNetwork := ipamd.UseCustomNetworkCfg()
enabledPodAnnotation := ipamd.EnablePodIPAnnotation()
withApiServer := false
// Check API Server Connectivity
-if err := k8sapi.CheckAPIServerConnectivity(); err != nil {
-log.Errorf("Failed to check API server connectivity: %s", err)
-return 1
if enabledPodEni || enabledCustomNetwork || enabledPodAnnotation {
log.Info("SGP, custom networking or pod annotation feature is in use, waiting for API server connectivity to start IPAMD")
if err := k8sapi.CheckAPIServerConnectivity(); err != nil {
log.Errorf("Failed to check API server connectivity: %s", err)
return 1
} else {
log.Info("API server connectivity established.")
withApiServer = true
}
} else {
log.Infof("Waiting to connect API server for upto %s...", pollTimeout)
// Try a quick check first
if err := k8sapi.CheckAPIServerConnectivityWithTimeout(pollInterval, pollTimeout); err != nil {
log.Warn("Proceeding without API server connectivity, will run background API server connectivity check")
withApiServer = false
} else {
log.Info("API server connectivity established.")
withApiServer = true
}
}

// Create Kubernetes client for API server requests
k8sClient, err := k8sapi.CreateKubeClient(appName)
if err != nil {
log.Errorf("Failed to create kube client: %s", err)
return 1
}

// Create EventRecorder for use by IPAMD
-if err := eventrecorder.Init(k8sClient); err != nil {
+if err := eventrecorder.Init(k8sClient, withApiServer); err != nil {
log.Errorf("Failed to create event recorder: %s", err)
-return 1
+log.Warn("Skipping event recorder initialization")
}

-ipamContext, err := ipamd.New(k8sClient)
+ipamContext, err := ipamd.New(k8sClient, withApiServer)
if err != nil {
log.Errorf("Initialization failure: %v", err)
return 1
}

// If not connected to API server yet, start background checks
if !withApiServer {
startBackgroundAPIServerCheck(ipamContext)
}

// Pool manager
go ipamContext.StartNodeIPPoolManager()
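
The background check added in this file follows a common pattern: a goroutine retries a probe at a fixed interval and flips a flag once the probe succeeds. Below is a minimal, self-contained sketch of that pattern. The `connectivity` type, `watchInBackground`, and `failNTimes` are hypothetical names for this example; the real code probes the API server via `clientSet.Discovery().ServerVersion()` and calls `ipamContext.SetAPIServerConnectivity(true)`, neither of which is reproduced here.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// connectivity is a hypothetical stand-in for the flag kept on the IPAM context.
type connectivity struct {
	ready atomic.Bool
}

// watchInBackground retries check every interval until it succeeds, then marks
// c as ready and exits. The returned channel is closed when the goroutine ends;
// the code in this PR does not expose such a channel, it is added here so the
// example can wait for completion.
func watchInBackground(check func() error, interval time.Duration, c *connectivity) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		for {
			if err := check(); err == nil {
				c.ready.Store(true)
				return
			}
			time.Sleep(interval)
		}
	}()
	return done
}

// failNTimes returns a probe that fails n times and then succeeds,
// simulating an API server that becomes reachable later.
func failNTimes(n int) func() error {
	remaining := n
	return func() error {
		if remaining > 0 {
			remaining--
			return errors.New("api server unreachable")
		}
		return nil
	}
}

func main() {
	c := &connectivity{}
	done := watchInBackground(failNTimes(2), 10*time.Millisecond, c)
	<-done
	fmt.Println("ready after background check:", c.ready.Load())
}
```

Using an atomic flag keeps reads from other goroutines safe without a mutex. Whether `SetAPIServerConnectivity` is implemented that way is not visible in this diff.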

3 changes: 2 additions & 1 deletion cmd/cni-metrics-helper/metrics/metrics.go
@@ -17,6 +17,7 @@ package metrics
import (
"bytes"
"context"
"errors"
"fmt"

"github.com/aws/aws-sdk-go-v2/aws"
@@ -358,7 +359,7 @@ func producePrometheusMetrics(t metricsTarget, families map[string]*dto.MetricFa
if len(prometheusCNIMetrics) == 0 {
errorMsg := "Skipping since prometheus mapping is missing"
t.getLogger().Infof(errorMsg)
-return fmt.Errorf(errorMsg)
+return errors.New(errorMsg)
}
for key, family := range families {
convertMetrics := convertDef[key]
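
The switch from `fmt.Errorf(errorMsg)` to `errors.New(errorMsg)` avoids passing a non-constant string as a format argument: any `%` in the message would be parsed as a formatting verb, and recent versions of `go vet` report such calls. That motivation is an inference, since the PR does not state it. The sketch below shows the two safe ways to build an error from a preformatted message; `skipError` and `skipErrorf` are hypothetical helper names.

```go
package main

import (
	"errors"
	"fmt"
)

// skipError builds an error from a preformatted message without treating
// the message as a format string.
func skipError(msg string) error {
	return errors.New(msg)
}

// skipErrorf is the equivalent when fmt is preferred: the message is passed
// as an argument to a constant format string.
func skipErrorf(msg string) error {
	return fmt.Errorf("%s", msg)
}

func main() {
	msg := "Skipping: 100% of prometheus mappings are missing"
	fmt.Println(skipError(msg))
	fmt.Println(skipErrorf(msg))
}
```

Both helpers return the message unchanged, including the literal percent sign.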
21 changes: 14 additions & 7 deletions cmd/routed-eni-cni-plugin/cni.go
@@ -282,15 +282,18 @@ func add(args *skel.CmdArgs, cniTypes typeswrapper.CNITYPES, grpcClient grpcwrap
result.Interfaces = append(result.Interfaces, dummyInterface)

// Set up a connection to the network policy agent
-// Cx might have removed np container if they are not using network policies
-// If we are not able to connect to np agent we do not return return error here. If NP agent grpc is not up
-// and listening, NP agent will be in crash loop and we will catch the issue there
+// NP container might have been removed if network policies are not being used
+// If NETWORK_POLICY_ENFORCING_MODE is not set, we will not configure anything related to NP
+if r.NetworkPolicyMode == "" {
+log.Infof("NETWORK_POLICY_ENFORCING_MODE is not set")
+return cniTypes.PrintResult(result, conf.CNIVersion)
+}
ctx, cancel := context.WithTimeout(context.Background(), npAgentConnTimeout*time.Second) // Set timeout
defer cancel()
npConn, err := grpcClient.DialContext(ctx, npAgentAddress, grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithBlock())
if err != nil {
-log.Infof("Failed to connect to network policy agent: %v. Network Policy agent might not be running", err)
-return cniTypes.PrintResult(result, conf.CNIVersion)
+log.Errorf("Failed to connect to network policy agent: %v", err)
+return errors.New("add cmd: failed to setup network policy")
}
defer npConn.Close()

@@ -451,13 +454,17 @@ func del(args *skel.CmdArgs, cniTypes typeswrapper.CNITYPES, grpcClient grpcwrap
log.Warnf("Container %s did not have a valid IP %s", args.ContainerID, r.IPv4Addr)
}

if r.NetworkPolicyMode == "" {
log.Infof("NETWORK_POLICY_ENFORCING_MODE is not set")
return nil
}
// Set up a connection to the network policy agent
ctx, cancel := context.WithTimeout(context.Background(), npAgentConnTimeout*time.Second) // Set timeout
defer cancel()
npConn, err := grpcClient.DialContext(ctx, npAgentAddress, grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithBlock())
if err != nil {
-log.Infof("Failed to connect to network policy agent: %v. Network Policy agent might not be running", err)
-return nil
+log.Errorf("Failed to connect to network policy agent: %v. Network Policy agent might not be running", err)
+return errors.Wrap(err, "del cmd: failed to connect to network policy agent")
}
defer npConn.Close()
//Make a GRPC call for network policy agent
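
Taken together, the `add` and `del` changes in this file replace "ignore a failed dial" with "skip network policy setup when no mode is set, otherwise treat a failed dial as an error". The sketch below isolates that control flow without a gRPC dependency. The `dialFunc` type, `setupNetworkPolicy`, the two sample dialers, and the `agentAddr` value are all hypothetical; the real code uses `grpcClient.DialContext` with `npAgentAddress`, whose value is not shown in this diff.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// dialFunc is a hypothetical abstraction over the gRPC dial used by the plugin.
type dialFunc func(ctx context.Context, addr string) error

// agentAddr is a placeholder address assumed for this sketch.
const agentAddr = "127.0.0.1:50052"

// setupNetworkPolicy skips all network policy work when mode is empty.
// Otherwise it dials the agent with a timeout and reports a failed dial
// as an error instead of silently continuing.
func setupNetworkPolicy(mode string, dial dialFunc, timeout time.Duration) error {
	if mode == "" {
		return nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if err := dial(ctx, agentAddr); err != nil {
		return fmt.Errorf("failed to connect to network policy agent: %w", err)
	}
	return nil
}

// unreachableDial simulates an agent that is not running.
func unreachableDial(ctx context.Context, addr string) error {
	return errors.New("connection refused")
}

// reachableDial simulates a healthy agent.
func reachableDial(ctx context.Context, addr string) error {
	return nil
}

func main() {
	fmt.Println("mode unset, agent down:", setupNetworkPolicy("", unreachableDial, time.Second))
	fmt.Println("mode strict, agent down:", setupNetworkPolicy("strict", unreachableDial, time.Second))
	fmt.Println("mode strict, agent up:", setupNetworkPolicy("strict", reachableDial, time.Second))
}
```

This also shows why the README change asks operators to unset `NETWORK_POLICY_ENFORCING_MODE` when the agent container is removed: with a mode set and no agent listening, the dial fails and pod setup returns an error.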