Commit 3b95205

[patch]: Introduce sysctl param arp_evict_no_carrier (sonic-net#293)
This PR backports the following commit to kernel v5.10: torvalds/linux@fcdb44d. That commit introduces a new sysctl parameter, `arp_evict_nocarrier`.

Signed-off-by: Longxiang Lyu <[email protected]>
Signed-off-by: Longxiang Lyu <[email protected]>
1 parent 443253f commit 3b95205

2 files changed: +220 -0 lines changed
Lines changed: 217 additions & 0 deletions
@@ -0,0 +1,217 @@
From fcdb44d08a95003c3d040aecdee286156ec6f34e Mon Sep 17 00:00:00 2001
From: James Prestwood <[email protected]>
Date: Mon, 1 Nov 2021 10:36:28 -0700
Subject: [PATCH] net: arp: introduce arp_evict_nocarrier sysctl parameter

This change introduces a new sysctl parameter, arp_evict_nocarrier.
When set (default) the ARP cache will be cleared on a NOCARRIER event.
This new option has been defaulted to '1' which maintains existing
behavior.

Clearing the ARP cache on NOCARRIER is relatively new, introduced by:

commit 859bd2ef1fc1110a8031b967ee656c53a6260a76
Author: David Ahern <[email protected]>
Date: Thu Oct 11 20:33:49 2018 -0700

    net: Evict neighbor entries on carrier down

The reason for this change is to prevent the ARP cache from being
cleared when a wireless device roams. Specifically for wireless roams
the ARP cache should not be cleared because the underlying network has not
changed. Clearing the ARP cache in this case can introduce significant
delays sending out packets after a roam.

A user reported such a situation here:

https://lore.kernel.org/linux-wireless/CACsRnHWa47zpx3D1oDq9JYnZWniS8yBwW1h0WAVZ6vrbwL_S0w@mail.gmail.com/

After some investigation it was found that the kernel was holding onto
packets until ARP finished, which resulted in this 1 second delay. It
was also found that the first ARP who-has was never responded to,
which is actually what causes the delay. This change is more or less
working around this behavior, but again, there is no reason to clear
the cache on a roam anyway.

As for the unanswered who-has, we know the packet made it OTA since
it was seen while monitoring. Why it never received a response is
unknown. In any case, since this is a problem on the AP side of things
all that can be done is to work around it until it is solved.

Some background on testing/reproducing the packet delay:

Hardware:
 - 2 access points configured for Fast BSS Transition (though I don't
   see why regular reassociation wouldn't have the same behavior)
 - Wireless station running IWD as supplicant
 - A device on network able to respond to pings (I used one of the APs)

Procedure:
 - Connect to first AP
 - Ping once to establish an ARP entry
 - Start a tcpdump
 - Roam to second AP
 - Wait for operstate UP event, and note the timestamp
 - Start pinging

Results:

Below is the tcpdump output after UP. The interface was recorded going
UP at 10:42:01.432875.

10:42:01.461871 ARP, Request who-has 192.168.254.1 tell 192.168.254.71, length 28
10:42:02.497976 ARP, Request who-has 192.168.254.1 tell 192.168.254.71, length 28
10:42:02.507162 ARP, Reply 192.168.254.1 is-at ac:86:74:55:b0:20, length 46
10:42:02.507185 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 1, length 64
10:42:02.507205 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 2, length 64
10:42:02.507212 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 3, length 64
10:42:02.507219 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 4, length 64
10:42:02.507225 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 5, length 64
10:42:02.507232 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 6, length 64
10:42:02.515373 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 1, length 64
10:42:02.521399 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 2, length 64
10:42:02.521612 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 3, length 64
10:42:02.521941 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 4, length 64
10:42:02.522419 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 5, length 64
10:42:02.523085 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 6, length 64

You can see the first ARP who-has went out very quickly after UP, but
was never responded to. Nearly a second later the kernel retries and
gets a response. Only then do the ping packets go out. If an ARP entry
is manually added prior to UP (after the cache is cleared) it is seen
that the first ping is never responded to, so it's not only an issue with
ARP but with data packets in general.

As mentioned earlier, the wireless interface was also monitored to verify
that the ping/ARP packet made it OTA, which was observed to be true.

Signed-off-by: James Prestwood <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
---
 Documentation/networking/ip-sysctl.rst |  9 +++++++++
 include/linux/inetdevice.h             |  2 ++
 include/uapi/linux/ip.h                |  1 +
 include/uapi/linux/sysctl.h            |  1 +
 net/ipv4/arp.c                         | 11 ++++++++++-
 net/ipv4/devinet.c                     |  4 ++++
 6 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 16b8bf72feaf..18fde4ed7a5e 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1611,6 +1611,15 @@ arp_accept - BOOLEAN
 	gratuitous arp frame, the arp table will be updated regardless
 	if this setting is on or off.
 
+arp_evict_nocarrier - BOOLEAN
+	Clears the ARP cache on NOCARRIER events. This option is important for
+	wireless devices where the ARP cache should not be cleared when roaming
+	between access points on the same network. In most cases this should
+	remain as the default (1).
+
+	- 1 - (default): Clear the ARP cache on NOCARRIER events
+	- 0 - Do not clear ARP cache on NOCARRIER events
+
 mcast_solicit - INTEGER
 	The maximum number of multicast probes in INCOMPLETE state,
 	when the associated hardware address is unknown. Defaults
diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
index a038feb63f23..518b484a7f07 100644
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -133,6 +133,8 @@ static inline void ipv4_devconf_setall(struct in_device *in_dev)
 #define IN_DEV_ARP_ANNOUNCE(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_ANNOUNCE)
 #define IN_DEV_ARP_IGNORE(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_IGNORE)
 #define IN_DEV_ARP_NOTIFY(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_NOTIFY)
+#define IN_DEV_ARP_EVICT_NOCARRIER(in_dev) IN_DEV_ANDCONF((in_dev), \
+							  ARP_EVICT_NOCARRIER)
 
 struct in_ifaddr {
 	struct hlist_node	hash;
diff --git a/include/uapi/linux/ip.h b/include/uapi/linux/ip.h
index e42d13b55cf3..e00bbb9c47bb 100644
--- a/include/uapi/linux/ip.h
+++ b/include/uapi/linux/ip.h
@@ -169,6 +169,7 @@ enum
 	IPV4_DEVCONF_DROP_UNICAST_IN_L2_MULTICAST,
 	IPV4_DEVCONF_DROP_GRATUITOUS_ARP,
 	IPV4_DEVCONF_BC_FORWARDING,
+	IPV4_DEVCONF_ARP_EVICT_NOCARRIER,
 	__IPV4_DEVCONF_MAX
 };
 
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 1e05d3caa712..6a3b194c50fe 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -482,6 +482,7 @@ enum
 	NET_IPV4_CONF_PROMOTE_SECONDARIES=20,
 	NET_IPV4_CONF_ARP_ACCEPT=21,
 	NET_IPV4_CONF_ARP_NOTIFY=22,
+	NET_IPV4_CONF_ARP_EVICT_NOCARRIER=23,
 };
 
 /* /proc/sys/net/ipv4/netfilter */
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 922dd73e5740..857a144b1ea9 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -1247,6 +1247,8 @@ static int arp_netdev_event(struct notifier_block *this, unsigned long event,
 {
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 	struct netdev_notifier_change_info *change_info;
+	struct in_device *in_dev;
+	bool evict_nocarrier;
 
 	switch (event) {
 	case NETDEV_CHANGEADDR:
@@ -1257,7 +1259,14 @@ static int arp_netdev_event(struct notifier_block *this, unsigned long event,
 		change_info = ptr;
 		if (change_info->flags_changed & IFF_NOARP)
 			neigh_changeaddr(&arp_tbl, dev);
-		if (!netif_carrier_ok(dev))
+
+		in_dev = __in_dev_get_rtnl(dev);
+		if (!in_dev)
+			evict_nocarrier = true;
+		else
+			evict_nocarrier = IN_DEV_ARP_EVICT_NOCARRIER(in_dev);
+
+		if (evict_nocarrier && !netif_carrier_ok(dev))
 			neigh_carrier_down(&arp_tbl, dev);
 		break;
 	default:
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index f4468980b675..ec73a0d52d3e 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -75,6 +75,7 @@ static struct ipv4_devconf ipv4_devconf = {
 		[IPV4_DEVCONF_SHARED_MEDIA - 1] = 1,
 		[IPV4_DEVCONF_IGMPV2_UNSOLICITED_REPORT_INTERVAL - 1] = 10000 /*ms*/,
 		[IPV4_DEVCONF_IGMPV3_UNSOLICITED_REPORT_INTERVAL - 1] = 1000 /*ms*/,
+		[IPV4_DEVCONF_ARP_EVICT_NOCARRIER - 1] = 1,
 	},
 };
 
@@ -87,6 +88,7 @@ static struct ipv4_devconf ipv4_devconf_dflt = {
 		[IPV4_DEVCONF_ACCEPT_SOURCE_ROUTE - 1] = 1,
 		[IPV4_DEVCONF_IGMPV2_UNSOLICITED_REPORT_INTERVAL - 1] = 10000 /*ms*/,
 		[IPV4_DEVCONF_IGMPV3_UNSOLICITED_REPORT_INTERVAL - 1] = 1000 /*ms*/,
+		[IPV4_DEVCONF_ARP_EVICT_NOCARRIER - 1] = 1,
 	},
 };
 
@@ -2532,6 +2534,8 @@ static struct devinet_sysctl_table {
 		DEVINET_SYSCTL_RW_ENTRY(ARP_IGNORE, "arp_ignore"),
 		DEVINET_SYSCTL_RW_ENTRY(ARP_ACCEPT, "arp_accept"),
 		DEVINET_SYSCTL_RW_ENTRY(ARP_NOTIFY, "arp_notify"),
+		DEVINET_SYSCTL_RW_ENTRY(ARP_EVICT_NOCARRIER,
+					"arp_evict_nocarrier"),
 		DEVINET_SYSCTL_RW_ENTRY(PROXY_ARP_PVLAN, "proxy_arp_pvlan"),
 		DEVINET_SYSCTL_RW_ENTRY(FORCE_IGMP_VERSION,
 					"force_igmp_version"),
--
2.17.1

patch/series

Lines changed: 3 additions & 0 deletions
@@ -36,6 +36,9 @@ kernel-compat-always-include-linux-compat.h-from-net-compat.patch
 # Backport from 5.15
 0001-x86-platform-Increase-maximum-GPIO-number-for-X86_64.patch
 
+# Backport from 5.16
+0001-net-arp-introduce-arp_evict_nocarrier-sysctl-paramet.patch
+
 # Backport from 5.19
 0001-net-ipv6-Introduce-accept_unsolicited_na-knob-to-imp.patch
 0002-net-ipv6-Expand-and-rename-accept_unsolicited_na-to-.patch
