Skip to content

Commit 23c3103

Browse files
Jozsef Kadlecsikgregkh
authored andcommitted
netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test
[ Upstream commit 28628fa ] Linkui Xiao reported that there's a race condition when ipset swap and destroy is called, which can lead to crash in add/del/test element operations. Swap then destroy are usual operations to replace a set with another one in a production system. The issue can in some cases be reproduced with the script: ipset create hash_ip1 hash:net family inet hashsize 1024 maxelem 1048576 ipset add hash_ip1 172.20.0.0/16 ipset add hash_ip1 192.168.0.0/16 iptables -A INPUT -m set --match-set hash_ip1 src -j ACCEPT while [ 1 ] do # ... Ongoing traffic... ipset create hash_ip2 hash:net family inet hashsize 1024 maxelem 1048576 ipset add hash_ip2 172.20.0.0/16 ipset swap hash_ip1 hash_ip2 ipset destroy hash_ip2 sleep 0.05 done In the race case the possible order of the operations are CPU0 CPU1 ip_set_test ipset swap hash_ip1 hash_ip2 ipset destroy hash_ip2 hash_net_kadt Swap replaces hash_ip1 with hash_ip2 and then destroy removes hash_ip2 which is the original hash_ip1. ip_set_test was called on hash_ip1 and because destroy removed it, hash_net_kadt crashes. The fix is to force ip_set_swap() to wait for all readers to finish accessing the old set pointers by calling synchronize_rcu(). The first version of the patch was written by Linkui Xiao <[email protected]>. v2: synchronize_rcu() is moved into ip_set_swap() in order not to burden ip_set_destroy() unnecessarily when all sets are destroyed. v3: Florian Westphal pointed out that all netfilter hooks run with rcu_read_lock() held and em_ipset.c wraps the entire ip_set_test() in rcu read lock/unlock pair. So there's no need to extend the rcu read locked area in ipset itself. Closes: https://lore.kernel.org/all/[email protected]/ Reported by: Linkui Xiao <[email protected]> Signed-off-by: Jozsef Kadlecsik <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
1 parent 22a59e5 commit 23c3103

File tree

1 file changed

+7
-7
lines changed

1 file changed

+7
-7
lines changed

net/netfilter/ipset/ip_set_core.c

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -61,6 +61,8 @@ MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_IPSET);
6161
ip_set_dereference((inst)->ip_set_list)[id]
6262
#define ip_set_ref_netlink(inst,id) \
6363
rcu_dereference_raw((inst)->ip_set_list)[id]
64+
#define ip_set_dereference_nfnl(p) \
65+
rcu_dereference_check(p, lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))
6466

6567
/* The set types are implemented in modules and registered set types
6668
* can be found in ip_set_type_list. Adding/deleting types is
@@ -708,15 +710,10 @@ __ip_set_put_netlink(struct ip_set *set)
708710
static struct ip_set *
709711
ip_set_rcu_get(struct net *net, ip_set_id_t index)
710712
{
711-
struct ip_set *set;
712713
struct ip_set_net *inst = ip_set_pernet(net);
713714

714-
rcu_read_lock();
715-
/* ip_set_list itself needs to be protected */
716-
set = rcu_dereference(inst->ip_set_list)[index];
717-
rcu_read_unlock();
718-
719-
return set;
715+
/* ip_set_list and the set pointer need to be protected */
716+
return ip_set_dereference_nfnl(inst->ip_set_list)[index];
720717
}
721718

722719
static inline void
@@ -1397,6 +1394,9 @@ static int ip_set_swap(struct sk_buff *skb, const struct nfnl_info *info,
13971394
ip_set(inst, to_id) = from;
13981395
write_unlock_bh(&ip_set_ref_lock);
13991396

1397+
/* Make sure all readers of the old set pointers are completed. */
1398+
synchronize_rcu();
1399+
14001400
return 0;
14011401
}
14021402

0 commit comments

Comments
 (0)