Adding support in CNI for managing multiple network interface card on the instance #3232

Open: wants to merge 16 commits into base: multi-nic-support

@jaydeokar jaydeokar commented Mar 12, 2025

What type of PR is this?
feature

Which issue does this PR fix?:

N/A

What does this PR do / Why do we need it?:
Amazon VPC CNI currently manages only network card 0 on every instance, including instances that support multiple network cards. Because pod interfaces are connected only to NIC 0, pods are limited to the bandwidth of that one card. With this change, CNI manages all the available network cards.
A pod can request access to these network cards via an annotation. For an annotated pod, CNI creates as many interfaces in the pod namespace as there are network cards available for use on the instance. The pod can then use these interfaces for egress traffic that has specific bandwidth requirements.

The change log is large, so the major changes are described below.

CNI

Add flow

  • CNI reads annotations on the pod using the capabilities option. If the pod carries the annotation k8s.amazonaws.com/nicConfig: multi-nic-attachment, CNI requests multiple IPs, allocated from the network cards available on the instance.
  • IPAMD now returns a list of IPs (the IPAddress field in the gRPC response). The number of IPs returned equals the number of interfaces CNI creates inside the pod.
  • The host veth name for multi-NIC interfaces is generated from <pod-namespace>.<pod-name>.<index>. The first interface of the pod keeps the original naming convention, <pod-namespace>.<pod-name>.
  • The container veth name for multi-NIC interfaces uses the prefix mNicIf, e.g. mNicIf1, mNicIf2, ...
  • IPAMD now returns the route table number instead of just the device number. The route table number is calculated as (network card index * max ENIs per NIC) + device number + 1. Note that route table number 1 (device 0, network card 0) is the main route table for CNI.
  • CNI now stores the route table number of the interface in the PciID field of the container interface struct. This value is used to clean up pod networking when IPAMD is down (del with prev result).
  • CNI loops over the returned IP addresses and attaches each one to the pod network.
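The route table formula and the naming rules above can be sketched in Go. This is an illustrative sketch only, not the code in this PR: the function names, the choice of index 0 for the first interface, and the value of 4 for max ENIs per NIC are assumptions made for the example. The host-side function returns only the string the description says the host veth name is generated from; how that string is turned into the final device name is not covered here.

```go
package main

import "fmt"

// routeTableNumber applies the formula from the description:
// (network card index * max ENIs per NIC) + device number + 1.
// The function name and parameter names are hypothetical.
func routeTableNumber(networkCard, maxENIsPerNIC, deviceNumber int) int {
	return networkCard*maxENIsPerNIC + deviceNumber + 1
}

// hostVethKey returns the string the host veth name is generated from.
// Assumption: index 0 denotes the first interface of the pod, which keeps
// the original <pod-namespace>.<pod-name> form.
func hostVethKey(namespace, podName string, index int) string {
	if index == 0 {
		return namespace + "." + podName
	}
	return fmt.Sprintf("%s.%s.%d", namespace, podName, index)
}

// containerIfName builds the container-side name for an additional
// multi-NIC interface using the mNicIf prefix (mNicIf1, mNicIf2, ...).
func containerIfName(index int) string {
	return fmt.Sprintf("mNicIf%d", index)
}

func main() {
	const maxENIsPerNIC = 4 // hypothetical value, varies by instance type

	// Device 0 on network card 0 maps to route table 1, the main table.
	fmt.Println(routeTableNumber(0, maxENIsPerNIC, 0))
	// Device 0 on network card 1.
	fmt.Println(routeTableNumber(1, maxENIsPerNIC, 0))

	fmt.Println(hostVethKey("default", "web", 0))
	fmt.Println(hostVethKey("default", "web", 1))
	fmt.Println(containerIfName(1))
}
```

Adding 1 to the result keeps device 0 on network card 0 at route table 1, which matches the note above that table 1 is the main route table for CNI.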

Delete flow

  • CNI relies on IPAMD to return the IPs and interfaces to delete.
  • If IPAMD is down, CNI uses the prev result to tear down the pod and host networking.

IPAMD

Node Init Flow

  1. IPAMD now initializes all the network cards on the instance and creates a datastore for each network card
  2. For IPv4 and IPv6, it initializes the host network for the primary interface
    a. For IPv6, it adds an additional rule in the NAT table to exclude link-local traffic from SNAT. This is required for DHCPv6 addresses
  3. IPAMD now includes checks for UpdatingCIDRRules for IPv6 when the VPC CIDR or the security group associated with the ENIs changes (these checks already existed for IPv4)
  4. On node init, IPAMD initializes all the available datastores and, if a datastore is low, creates interfaces for each network card. A new function for IPv6 ENIs creates one ENI with an IPv6 address for each network card and then assigns a prefix of size /80. If ENI creation fails (due to missing permissions), IPAMD does not fail

Testing done on this change:
Yes. Ran all the test suites on a single-card instance and ran manual tests on a multi-card instance

Will this PR introduce any new dependencies?:
No

Will this break upgrades or downgrades? Has updating a running cluster been tested?:
Upgrades should be fine. A downgrade requires deleting the pods that use the multi-nic annotation before downgrading; otherwise the pod IPs and host networking setup can leak

Does this change require updates to the CNI daemonset config files to work?:
No

Does this PR introduce any user-facing change?:
Yes, customers will now see interfaces attached to NIC > 0 on supported instances


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jaydeokar jaydeokar force-pushed the multi-nic-ipamd-changes branch 4 times, most recently from 1762e22 to 6762628 Compare March 27, 2025 22:35
@jaydeokar jaydeokar changed the base branch from master to multi-nic-support March 27, 2025 22:36
@jaydeokar jaydeokar marked this pull request as ready for review March 27, 2025 22:36
@jaydeokar jaydeokar requested a review from a team as a code owner March 27, 2025 22:36
@jaydeokar jaydeokar changed the title Multi nic ipamd changes Adding support in CNI for managing multiple network interface card on the instance Mar 31, 2025
@jaydeokar jaydeokar force-pushed the multi-nic-ipamd-changes branch 2 times, most recently from 02bd18c to 5d6d399 Compare March 31, 2025 17:58
oliviassss and others added 3 commits March 31, 2025 17:17
* remove apiserver dependency for ipamd startup

* fix format issue in UT

* wait apiserver connectivty for pod annotate feature

* return maxPods value directly when parsing the local file
Bumps [github.com/onsi/gomega](https://github.com/onsi/gomega) from 1.36.0 to 1.36.2.
- [Release notes](https://github.com/onsi/gomega/releases)
- [Changelog](https://github.com/onsi/gomega/blob/master/CHANGELOG.md)
- [Commits](onsi/gomega@v1.36.0...v1.36.2)

---
updated-dependencies:
- dependency-name: github.com/onsi/gomega
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.30.0 to 0.31.0.
- [Commits](golang/sys@v0.30.0...v0.31.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
@jaydeokar jaydeokar force-pushed the multi-nic-ipamd-changes branch 2 times, most recently from 4e58726 to f8539d9 Compare April 11, 2025 04:52
Pavani-Panakanti and others added 6 commits April 14, 2025 13:44
…et (aws#3254)

* Skip configuring network policies if network_policy_enforcing_mode is not set

* make format and update chart

* fix vuln checks

* fix metrics agent and readme
Bumps [k8s.io/cli-runtime](https://github.com/kubernetes/cli-runtime) from 0.31.3 to 0.32.3.
- [Commits](kubernetes/cli-runtime@v0.31.3...v0.32.3)

---
updated-dependencies:
- dependency-name: k8s.io/cli-runtime
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jay Deokar <[email protected]>
@jaydeokar jaydeokar force-pushed the multi-nic-ipamd-changes branch 3 times, most recently from cb2d534 to 39339ad Compare April 26, 2025 02:23
@jaydeokar jaydeokar force-pushed the multi-nic-ipamd-changes branch from 39339ad to 4cc953e Compare April 29, 2025 15:25
@jaydeokar jaydeokar force-pushed the multi-nic-ipamd-changes branch from 4cc953e to 19af501 Compare April 29, 2025 16:08