Commit 5cee770

Ensure full functionality of AntreaProxy with proxyAll enabled when kube-proxy presents (#6308)
To ensure full functionality of AntreaProxy, except for handling ClusterIP from Nodes, even when kube-proxy in iptables mode is present, the following key changes are implemented when proxyAll is enabled:

The jump rules for the chains managed by Antrea, `ANTREA-PREROUTING` and `ANTREA-OUTPUT` in the nat table, are installed by inserting instead of appending, to bypass the chain `KUBE-SERVICES`, which performs Service DNAT and is managed by kube-proxy. Antrea ensures that its jump rules take precedence over those managed by kube-proxy.

The iptables rules of nat table chain `ANTREA-PREROUTING` look like the following; the rules in chain `ANTREA-OUTPUT` are similar.

```
-A ANTREA-PREROUTING -m comment --comment "Antrea: DNAT external to NodePort packets" -m set --match-set ANTREA-NODEPORT-IP dst,dst -j DNAT --to-destination 169.254.0.252
```

This rule DNATs NodePort traffic, bypassing chain `KUBE-SERVICES`.

The iptables rules of raw table chains `ANTREA-PREROUTING` / `ANTREA-OUTPUT` look like the following:

```
1. -A ANTREA-PREROUTING -m comment --comment "Antrea: do not track incoming encapsulation packets" -m udp -p udp --dport 6081 -m addrtype --dst-type LOCAL -j NOTRACK
2. -A ANTREA-PREROUTING -m comment --comment "Antrea: drop Pod multicast traffic forwarded via underlay network" -m set --match-set CLUSTER-NODE-IP src -d 224.0.0.0/4 -j DROP
3. -A ANTREA-PREROUTING -m comment --comment "Antrea: do not track request packets destined to external IPs" -m set --match-set ANTREA-EXTERNAL-IP dst -j NOTRACK
4. -A ANTREA-PREROUTING -m comment --comment "Antrea: do not track reply packets sourced from external IPs" -m set --match-set ANTREA-EXTERNAL-IP src -j NOTRACK
5. -A ANTREA-OUTPUT -m comment --comment "Antrea: do not track request packets destined to external IPs" -m set --match-set ANTREA-EXTERNAL-IP dst -j NOTRACK
```

- Rules 1-2 are not new rules.
- Rule 3 bypasses conntrack for packets from external sources destined to externalIPs, which also results in bypassing the chains managed by AntreaProxy and kube-proxy in the nat table.
- Rule 4 bypasses conntrack for packets sourced from externalIPs, which also results in bypassing the chains managed by AntreaProxy and kube-proxy in the nat table.
- Rule 5 bypasses conntrack for locally originated packets destined to externalIPs, which also results in bypassing the chains managed by AntreaProxy and kube-proxy in the nat table.

The following are the benchmark results of a LoadBalancer Service configured with DSR mode. The results of TCP_STREAM and TCP_RR (single TCP connection) are almost the same as before. TCP_CRR (multiple TCP connections) performs better than before, likely in part because conntrack is skipped for LoadBalancer Services.

```
Test        v2.0 proxyAll  Dev proxyAll  Delta
TCP_STREAM  4933.97        4918.35       -0.32%
TCP_RR      8095.49        8032.4        -0.78%
TCP_CRR     1645.66        1888.93       +14.79%
```

Signed-off-by: Hongliang Liu <[email protected]>
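The effect of inserting rather than appending the jump rules can be illustrated with a toy model. The sketch below is not Antrea's implementation: the `chain` type and the helpers `appendRule`, `insertRule`, and `firstDNAT` are hypothetical names introduced only for illustration, and the model assumes that whichever Service chain is reached first performs the DNAT for a matching packet.

```go
package main

import "fmt"

// chain is a toy model of a built-in iptables chain (e.g. nat PREROUTING):
// an ordered list of jump targets, evaluated top to bottom.
type chain []string

// appendRule mimics `iptables -A`: the new jump rule goes to the end.
func (c chain) appendRule(target string) chain {
	return append(append(chain{}, c...), target)
}

// insertRule mimics `iptables -I` without a position: the new jump rule
// goes to the head of the chain.
func (c chain) insertRule(target string) chain {
	return append(chain{target}, c...)
}

// firstDNAT returns the first Service-handling chain a packet would reach.
// Assumption of this sketch: the first such chain DNATs the packet, so
// later chains never see the original destination.
func firstDNAT(c chain) string {
	for _, target := range c {
		if target == "KUBE-SERVICES" || target == "ANTREA-PREROUTING" {
			return target
		}
	}
	return ""
}

func main() {
	// kube-proxy has already installed its jump rule.
	base := chain{"KUBE-SERVICES"}

	appended := base.appendRule("ANTREA-PREROUTING")
	inserted := base.insertRule("ANTREA-PREROUTING")

	fmt.Println(appended, "->", firstDNAT(appended))
	fmt.Println(inserted, "->", firstDNAT(inserted))
}
```

With the appended rule, the kube-proxy chain is reached first; with the inserted rule, the Antrea chain is reached first, which is the precedence the commit message describes.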
1 parent 42a0aaa commit 5cee770

14 files changed, +706 -387 lines changed

.github/workflows/kind.yml

Lines changed: 1 addition & 0 deletions

@@ -170,6 +170,7 @@ jobs:
   --coverage \
   --encap-mode encap \
   --proxy-all \
+  --no-kube-proxy \
   --feature-gates LoadBalancerModeDSR=true \
   --load-balancer-mode dsr \
   --node-ipam

ci/kind/test-e2e-kind.sh

Lines changed: 8 additions & 2 deletions

@@ -28,6 +28,7 @@ _usage="Usage: $0 [--encap-mode <mode>] [--ip-family <v4|v6|dual>] [--coverage]
         --feature-gates               A comma-separated list of key=value pairs that describe feature gates, e.g. AntreaProxy=true,Egress=false.
         --run                         Run only tests matching the regexp.
         --proxy-all                   Enables Antrea proxy with all Service support.
+        --no-kube-proxy               Don't deploy kube-proxy.
         --load-balancer-mode          LoadBalancer mode.
         --node-ipam                   Enables Antrea NodeIPAM.
         --multicast                   Enables Multicast.
@@ -72,6 +73,7 @@ mode=""
 ipfamily="v4"
 feature_gates=""
 proxy_all=false
+no_kube_proxy=false
 load_balancer_mode=""
 node_ipam=false
 multicast=false
@@ -106,6 +108,10 @@ case $key in
     proxy_all=true
     shift
     ;;
+    --no-kube-proxy)
+    no_kube_proxy=true
+    shift
+    ;;
     --load-balancer-mode)
     load_balancer_mode="$2"
     shift 2
@@ -299,7 +305,7 @@ function setup_cluster {
     echoerr "invalid value for --ip-family \"$ipfamily\", expected \"v4\" or \"v6\""
     exit 1
   fi
-  if $proxy_all; then
+  if $no_kube_proxy; then
     args="$args --no-kube-proxy"
   fi
   if $node_ipam; then
@@ -353,7 +359,7 @@ function run_test {
     cat $CH_OPERATOR_YML | docker exec -i kind-control-plane dd of=/root/clickhouse-operator-install-bundle.yml
   fi

-  if $proxy_all; then
+  if $no_kube_proxy; then
     apiserver=$(docker exec -i kind-control-plane kubectl get endpoints kubernetes --no-headers | awk '{print $2}')
     if $coverage; then
       docker exec -i kind-control-plane sed -i.bak -E "s/^[[:space:]]*[#]?kubeAPIServerOverride[[:space:]]*:[[:space:]]*[a-z\"]+[[:space:]]*$/ kubeAPIServerOverride: \"$apiserver\"/" /root/antrea-coverage.yml /root/antrea-ipsec-coverage.yml

docs/antrea-proxy.md

Lines changed: 16 additions & 7 deletions

@@ -42,13 +42,22 @@ the introduction of `proxyAll`, Antrea relied on userspace kube-proxy, which is
 no longer actively maintained by the K8s community and is slower than other
 kube-proxy backends.

-Note that on Linux, even when `proxyAll` is enabled, kube-proxy will usually
-take priority and will keep handling NodePort Service traffic (unless the source
-is a Pod, which is pretty unusual as Pods typically access Services by
-ClusterIP). This is because kube-proxy rules typically come before the rules
-installed by AntreaProxy to redirect traffic to OVS. When kube-proxy is not
-deployed or is removed from the cluster, AntreaProxy will then handle all
-Service traffic.
+Note that on Linux, before Antrea v2.1, when `proxyAll` is enabled, kube-proxy
+will usually take priority over AntreaProxy and will keep handling all kinds of
+Service traffic (unless the source is a Pod, which is pretty unusual as Pods
+typically access Services by ClusterIP). This is because kube-proxy rules typically
+come before the rules installed by AntreaProxy to redirect traffic to OVS. When
+kube-proxy is not deployed or is removed from the cluster, AntreaProxy will then
+handle all Service traffic.
+
+Starting with Antrea v2.1, when `proxyAll` is enabled, AntreaProxy will handle
+Service traffic destined to NodePort, LoadBalancerIP and ExternalIP, even if
+kube-proxy is present. This benefits users who want to take advantage of
+AntreaProxy's advanced features, such as Direct Server Return (DSR) mode, but
+lack control over kube-proxy's installation. This is accomplished by
+prioritizing the rules installed by AntreaProxy over those installed by
+kube-proxy, thus it works only with kube-proxy iptables mode. Support for other
+kube-proxy modes may be added in the future.

 ### Removing kube-proxy


pkg/agent/proxy/proxier.go

Lines changed: 17 additions & 62 deletions

@@ -32,7 +32,6 @@ import (
 	"k8s.io/apimachinery/pkg/runtime"
 	"k8s.io/apimachinery/pkg/selection"
 	apimachinerytypes "k8s.io/apimachinery/pkg/types"
-	"k8s.io/apimachinery/pkg/util/sets"
 	coreinformers "k8s.io/client-go/informers/core/v1"
 	discoveryinformers "k8s.io/client-go/informers/discovery/v1"
 	clientset "k8s.io/client-go/kubernetes"
@@ -121,13 +120,6 @@ type proxier struct {
 	serviceHealthServer healthcheck.ServiceHealthServer
 	numLocalEndpoints map[apimachinerytypes.NamespacedName]int

-	// serviceIPRouteReferences tracks the references of Service IP routes. The key is the Service IP and the value is
-	// the set of ServiceInfo strings. Because a Service could have multiple ports and each port will generate a
-	// ServicePort (which is the unit of the processing), a Service IP route may be required by several ServicePorts.
-	// With the references, we install a route exactly once as long as it's used by any ServicePorts and uninstall it
-	// exactly once when it's no longer used by any ServicePorts.
-	// It applies to ClusterIP and LoadBalancerIP.
-	serviceIPRouteReferences map[string]sets.Set[string]
 	// syncedOnce returns true if the proxier has synced rules at least once.
 	syncedOnce bool
 	syncedOnceMutex sync.RWMutex
@@ -569,10 +561,10 @@ func (p *proxier) installNodePortService(localGroupID, clusterGroupID binding.Gr
 		IsNested: false, // Unsupported for NodePort
 		IsDSR: false, // Unsupported because external traffic has been DNAT'd in host network before it's forwarded to OVS.
 	}); err != nil {
-		return fmt.Errorf("failed to install NodePort load balancing flows: %w", err)
+		return fmt.Errorf("failed to install NodePort load balancing OVS flows: %w", err)
 	}
-	if err := p.routeClient.AddNodePort(p.nodePortAddresses, svcPort, protocol); err != nil {
-		return fmt.Errorf("failed to install NodePort traffic redirecting rules: %w", err)
+	if err := p.routeClient.AddNodePortConfigs(p.nodePortAddresses, svcPort, protocol); err != nil {
+		return fmt.Errorf("failed to install NodePort traffic redirecting routing configurations: %w", err)
 	}
 	return nil
 }
@@ -588,8 +580,8 @@ func (p *proxier) uninstallNodePortService(svcPort uint16, protocol binding.Prot
 	if err := p.ofClient.UninstallServiceFlows(svcIP, svcPort, protocol); err != nil {
 		return fmt.Errorf("failed to remove NodePort load balancing flows: %w", err)
 	}
-	if err := p.routeClient.DeleteNodePort(p.nodePortAddresses, svcPort, protocol); err != nil {
-		return fmt.Errorf("failed to remove NodePort traffic redirecting rules: %w", err)
+	if err := p.routeClient.DeleteNodePortConfigs(p.nodePortAddresses, svcPort, protocol); err != nil {
+		return fmt.Errorf("failed to remove NodePort traffic redirecting routing configurations: %w", err)
 	}
 	return nil
 }
@@ -618,10 +610,10 @@ func (p *proxier) installExternalIPService(svcInfoStr string,
 			IsNested: false, // Unsupported for ExternalIP
 			IsDSR: features.DefaultFeatureGate.Enabled(features.LoadBalancerModeDSR) && loadBalancerMode == agentconfig.LoadBalancerModeDSR,
 		}); err != nil {
-			return fmt.Errorf("failed to install ExternalIP load balancing flows: %w", err)
+			return fmt.Errorf("failed to install ExternalIP load balancing OVS flows: %w", err)
 		}
-		if err := p.addRouteForServiceIP(svcInfoStr, ip, p.routeClient.AddExternalIPRoute); err != nil {
-			return fmt.Errorf("failed to install ExternalIP traffic redirecting routes: %w", err)
+		if err := p.routeClient.AddExternalIPConfigs(svcInfoStr, ip); err != nil {
+			return fmt.Errorf("failed to install ExternalIP load balancing routing configurations: %w", err)
 		}
 	}
 	return nil
@@ -631,10 +623,10 @@ func (p *proxier) uninstallExternalIPService(svcInfoStr string, externalIPString
 	for _, externalIP := range externalIPStrings {
 		ip := net.ParseIP(externalIP)
 		if err := p.ofClient.UninstallServiceFlows(ip, svcPort, protocol); err != nil {
-			return fmt.Errorf("failed to remove ExternalIP load balancing flows: %w", err)
+			return fmt.Errorf("failed to remove ExternalIP load balancing OVS flows: %w", err)
 		}
-		if err := p.deleteRouteForServiceIP(svcInfoStr, ip, p.routeClient.DeleteExternalIPRoute); err != nil {
-			return fmt.Errorf("failed to remove ExternalIP traffic redirecting routes: %w", err)
+		if err := p.routeClient.DeleteExternalIPConfigs(svcInfoStr, ip); err != nil {
+			return fmt.Errorf("failed to remove ExternalIP traffic redirecting routing configurations: %w", err)
 		}
 	}
 	return nil
@@ -665,71 +657,35 @@ func (p *proxier) installLoadBalancerService(svcInfoStr string,
 				IsNested: false, // Unsupported for LoadBalancerIP
 				IsDSR: features.DefaultFeatureGate.Enabled(features.LoadBalancerModeDSR) && loadBalancerMode == agentconfig.LoadBalancerModeDSR,
 			}); err != nil {
-				return fmt.Errorf("failed to install LoadBalancer load balancing flows: %w", err)
+				return fmt.Errorf("failed to install LoadBalancerIP load balancing OVS flows: %w", err)
 			}
 			if p.proxyAll {
-				if err := p.addRouteForServiceIP(svcInfoStr, ip, p.routeClient.AddExternalIPRoute); err != nil {
-					return fmt.Errorf("failed to install LoadBalancer traffic redirecting routes: %w", err)
+				if err := p.routeClient.AddExternalIPConfigs(svcInfoStr, ip); err != nil {
+					return fmt.Errorf("failed to install LoadBalancerIP traffic redirecting routing configurations: %w", err)
 				}
 			}
 		}
 	}
 	return nil
 }

-func (p *proxier) addRouteForServiceIP(svcInfoStr string, ip net.IP, addRouteFn func(net.IP) error) error {
-	ipStr := ip.String()
-	references, exists := p.serviceIPRouteReferences[ipStr]
-	// If the IP was not referenced by any Service port, install a route for it.
-	// Otherwise, just reference it.
-	if !exists {
-		if err := addRouteFn(ip); err != nil {
-			return err
-		}
-		references = sets.New[string](svcInfoStr)
-		p.serviceIPRouteReferences[ipStr] = references
-	} else {
-		references.Insert(svcInfoStr)
-	}
-	return nil
-}
-
 func (p *proxier) uninstallLoadBalancerService(svcInfoStr string, loadBalancerIPStrings []string, svcPort uint16, protocol binding.Protocol) error {
 	for _, ingress := range loadBalancerIPStrings {
 		if ingress != "" {
 			ip := net.ParseIP(ingress)
 			if err := p.ofClient.UninstallServiceFlows(ip, svcPort, protocol); err != nil {
-				return fmt.Errorf("failed to remove LoadBalancer load balancing flows: %w", err)
+				return fmt.Errorf("failed to remove LoadBalancerIP load balancing OVS flows: %w", err)
 			}
 			if p.proxyAll {
-				if err := p.deleteRouteForServiceIP(svcInfoStr, ip, p.routeClient.DeleteExternalIPRoute); err != nil {
-					return fmt.Errorf("failed to remove LoadBalancer traffic redirecting routes: %w", err)
+				if err := p.routeClient.DeleteExternalIPConfigs(svcInfoStr, ip); err != nil {
+					return fmt.Errorf("failed to remove LoadBalancerIP traffic redirecting routing configurations: %w", err)
 				}
 			}
 		}
 	}
 	return nil
 }

-func (p *proxier) deleteRouteForServiceIP(svcInfoStr string, ip net.IP, deleteRouteFn func(net.IP) error) error {
-	ipStr := ip.String()
-	references, exists := p.serviceIPRouteReferences[ipStr]
-	// If the IP was not referenced by this Service port, skip it.
-	if exists && references.Has(svcInfoStr) {
-		// Delete the IP only if this Service port is the last one referencing it.
-		// Otherwise, just dereference it.
-		if references.Len() == 1 {
-			if err := deleteRouteFn(ip); err != nil {
-				return err
-			}
-			delete(p.serviceIPRouteReferences, ipStr)
-		} else {
-			references.Delete(svcInfoStr)
-		}
-	}
-	return nil
-}
-
 func (p *proxier) installServices() {
 	for svcPortName, svcPort := range p.serviceMap {
 		svcInfo := svcPort.(*types.ServiceInfo)
@@ -1454,7 +1410,6 @@ func newProxier(
 		endpointsInstalledMap: types.EndpointsMap{},
 		endpointsMap: types.EndpointsMap{},
 		endpointReferenceCounter: map[string]int{},
-		serviceIPRouteReferences: map[string]sets.Set[string]{},
 		nodeLabels: map[string]string{},
 		serviceStringMap: map[string]k8sproxy.ServicePortName{},
 		groupCounter: groupCounter,
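
The reference-tracking logic removed from the proxier above (`addRouteForServiceIP` / `deleteRouteForServiceIP`) can be read as a small, standalone pattern: install a per-IP resource on the first reference and remove it on the last. The sketch below restates that pattern with only the standard library. It is an illustration, not the code from this commit: the type `routeRefs` and its methods are hypothetical names, the methods return booleans instead of invoking route callbacks, and this excerpt does not show where the equivalent bookkeeping lives after the change.

```go
package main

import "fmt"

// routeRefs records, for each Service IP, the set of Service ports that
// currently need a route for that IP. (Hypothetical type for illustration.)
type routeRefs struct {
	refs map[string]map[string]struct{}
}

func newRouteRefs() *routeRefs {
	return &routeRefs{refs: map[string]map[string]struct{}{}}
}

// add registers svc as a user of ip. It returns true only for the first
// reference, i.e. when the caller should actually install the route.
func (r *routeRefs) add(ip, svc string) bool {
	users, exists := r.refs[ip]
	if !exists {
		r.refs[ip] = map[string]struct{}{svc: {}}
		return true
	}
	users[svc] = struct{}{}
	return false
}

// remove drops svc's reference to ip. It returns true only when svc was the
// last user, i.e. when the caller should actually uninstall the route.
// Unknown IPs or Service ports are ignored.
func (r *routeRefs) remove(ip, svc string) bool {
	users, exists := r.refs[ip]
	if !exists {
		return false
	}
	if _, used := users[svc]; !used {
		return false
	}
	if len(users) == 1 {
		delete(r.refs, ip)
		return true
	}
	delete(users, svc)
	return false
}

func main() {
	r := newRouteRefs()
	// Two ports of one Service share the same LoadBalancer IP.
	fmt.Println("install route:", r.add("192.0.2.10", "default/web:http"))
	fmt.Println("install route:", r.add("192.0.2.10", "default/web:https"))
	fmt.Println("remove route:", r.remove("192.0.2.10", "default/web:http"))
	fmt.Println("remove route:", r.remove("192.0.2.10", "default/web:https"))
}
```

Keying the inner set by Service port string rather than keeping a bare counter makes repeated adds or removes for the same port idempotent, which matches the behavior of the removed helpers.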
