Skip to content

Commit 2f9bc00

Browse files
authored
[FRR]Adding fix for memory leak seen with BGP community (#18606)
Porting PR #18272 to 202311. Why I did it Porting fix for FRRouting/frr#15459 Adding patch for PR FRRouting/frr#15466 Work item tracking Microsoft ADO (number only): How I did it Ported the fix FRRouting/frr#15466 to 8.5.1 How to verify it Running the test_stress_route and ensure no memory leak
1 parent baf6f62 commit 2f9bc00

File tree

2 files changed

+304
-0
lines changed

2 files changed

+304
-0
lines changed
Lines changed: 303 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,303 @@
1+
From 2b8e4e4b93a78e5884e2e5b97050b4ea3843e2e0 Mon Sep 17 00:00:00 2001
2+
From: Donald Sharp <[email protected]>
3+
Date: Thu, 7 Mar 2024 22:18:18 -0500
4+
Subject: [PATCH 1/2] bgpd: Combined patch to clean up filter leaks
5+
6+
Customer has this valgrind trace:
7+
8+
Direct leak of 2829120 byte(s) in 70728 object(s) allocated from:
9+
0 in community_new ../bgpd/bgp_community.c:39
10+
1 in community_uniq_sort ../bgpd/bgp_community.c:170
11+
2 in route_set_community ../bgpd/bgp_routemap.c:2342
12+
3 in route_map_apply_ext ../lib/routemap.c:2673
13+
4 in subgroup_announce_check ../bgpd/bgp_route.c:2367
14+
5 in subgroup_process_announce_selected ../bgpd/bgp_route.c:2914
15+
6 in group_announce_route_walkcb ../bgpd/bgp_updgrp_adv.c:199
16+
7 in hash_walk ../lib/hash.c:285
17+
8 in update_group_af_walk ../bgpd/bgp_updgrp.c:2061
18+
9 in group_announce_route ../bgpd/bgp_updgrp_adv.c:1059
19+
10 in bgp_process_main_one ../bgpd/bgp_route.c:3221
20+
11 in bgp_process_wq ../bgpd/bgp_route.c:3221
21+
12 in work_queue_run ../lib/workqueue.c:282
22+
23+
The above leak detected by valgrind was from a screenshot so I copied it
24+
by hand. Any mistakes in line numbers are purely from my transcription.
25+
Additionally this is against a slightly modified 8.5.1 version of FRR.
26+
Code inspection of 8.5.1 -vs- latest master shows the same problem
27+
exists. Code should be able to be followed from there to here.
28+
29+
What is happening:
30+
31+
There is a route-map being applied that modifes the outgoing community
32+
to a peer. This is saved in the attr copy created in
33+
subgroup_process_announce_selected. This community pointer is not
34+
interned. So the community->refcount is still 0. Normally when
35+
a prefix is announced, the attr and the prefix are placed on a
36+
adjency out structure where the attribute is interned. This will
37+
cause the community to be saved in the community hash list as well.
38+
In a non-normal operation when the decision to send is aborted after
39+
the route-map application, the attribute is just dropped and the
40+
pointer to the community is just dropped too, leading to situations
41+
where the memory is leaked. The usage of bgp suppress-fib would
42+
would be a case where the community is caused to be leaked.
43+
Additionally the previous commit where an unsuppress-map is used
44+
to modify the outgoing attribute but since unsuppress-map was
45+
not considered part of outgoing policy the attribute would be dropped as
46+
well. This pointer drop also extends to any dynamically allocated
47+
memory saved by the attribute pointer that was not interned yet as well.
48+
49+
So let's modify the return case where the decision is made to
50+
not send the prefix to the peer to always just flush the attribute
51+
to ensure memory is not leaked.
52+
53+
Fixes: #15459
54+
Signed-off-by: Donald Sharp <[email protected]>
55+
---
56+
bgpd/bgp_conditional_adv.c | 5 +++--
57+
bgpd/bgp_route.c | 19 +++++++++++------
58+
bgpd/bgp_updgrp.h | 2 +-
59+
bgpd/bgp_updgrp_adv.c | 53 ++++++++++++++++++++++++++--------------------
60+
4 files changed, 47 insertions(+), 32 deletions(-)
61+
62+
diff --git a/bgpd/bgp_conditional_adv.c b/bgpd/bgp_conditional_adv.c
63+
index 4ad00ed121..89ea85ff46 100644
64+
--- a/bgpd/bgp_conditional_adv.c
65+
+++ b/bgpd/bgp_conditional_adv.c
66+
@@ -134,8 +134,9 @@ static void bgp_conditional_adv_routes(struct peer *peer, afi_t afi,
67+
if (update_type == UPDATE_TYPE_ADVERTISE &&
68+
subgroup_announce_check(dest, pi, subgrp, dest_p,
69+
&attr, &advmap_attr)) {
70+
- bgp_adj_out_set_subgroup(dest, subgrp, &attr,
71+
- pi);
72+
+ if (!bgp_adj_out_set_subgroup(dest, subgrp,
73+
+ &attr, pi))
74+
+ bgp_attr_flush(&attr);
75+
} else {
76+
/* If default originate is enabled for
77+
* the peer, do not send explicit
78+
diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c
79+
index a7a5c9849a..157bfa8a2b 100644
80+
--- a/bgpd/bgp_route.c
81+
+++ b/bgpd/bgp_route.c
82+
@@ -2917,16 +2917,23 @@ void subgroup_process_announce_selected(struct update_subgroup *subgrp,
83+
* in FIB, then it is advertised
84+
*/
85+
if (advertise) {
86+
- if (!bgp_check_withdrawal(bgp, dest))
87+
- bgp_adj_out_set_subgroup(
88+
- dest, subgrp, &attr, selected);
89+
- else
90+
+ if (!bgp_check_withdrawal(bgp, dest)) {
91+
+ if (!bgp_adj_out_set_subgroup(
92+
+ dest, subgrp, &attr,
93+
+ selected))
94+
+ bgp_attr_flush(&attr);
95+
+ } else {
96+
bgp_adj_out_unset_subgroup(
97+
dest, subgrp, 1, addpath_tx_id);
98+
- }
99+
- } else
100+
+ bgp_attr_flush(&attr);
101+
+ }
102+
+ } else
103+
+ bgp_attr_flush(&attr);
104+
+ } else {
105+
bgp_adj_out_unset_subgroup(dest, subgrp, 1,
106+
addpath_tx_id);
107+
+ bgp_attr_flush(&attr);
108+
+ }
109+
}
110+
111+
/* If selected is NULL we must withdraw the path using addpath_tx_id */
112+
diff --git a/bgpd/bgp_updgrp.h b/bgpd/bgp_updgrp.h
113+
index e27c1e7b67..b7b6aa07e9 100644
114+
--- a/bgpd/bgp_updgrp.h
115+
+++ b/bgpd/bgp_updgrp.h
116+
@@ -458,7 +458,7 @@ extern struct bgp_adj_out *bgp_adj_out_alloc(struct update_subgroup *subgrp,
117+
extern void bgp_adj_out_remove_subgroup(struct bgp_dest *dest,
118+
struct bgp_adj_out *adj,
119+
struct update_subgroup *subgrp);
120+
-extern void bgp_adj_out_set_subgroup(struct bgp_dest *dest,
121+
+extern bool bgp_adj_out_set_subgroup(struct bgp_dest *dest,
122+
struct update_subgroup *subgrp,
123+
struct attr *attr,
124+
struct bgp_path_info *path);
125+
diff --git a/bgpd/bgp_updgrp_adv.c b/bgpd/bgp_updgrp_adv.c
126+
index af8ef751da..301a8b267e 100644
127+
--- a/bgpd/bgp_updgrp_adv.c
128+
+++ b/bgpd/bgp_updgrp_adv.c
129+
@@ -454,7 +454,7 @@ bgp_advertise_clean_subgroup(struct update_subgroup *subgrp,
130+
return next;
131+
}
132+
133+
-void bgp_adj_out_set_subgroup(struct bgp_dest *dest,
134+
+bool bgp_adj_out_set_subgroup(struct bgp_dest *dest,
135+
struct update_subgroup *subgrp, struct attr *attr,
136+
struct bgp_path_info *path)
137+
{
138+
@@ -474,7 +474,7 @@ void bgp_adj_out_set_subgroup(struct bgp_dest *dest,
139+
bgp = SUBGRP_INST(subgrp);
140+
141+
if (DISABLE_BGP_ANNOUNCE)
142+
- return;
143+
+ return false;
144+
145+
/* Look for adjacency information. */
146+
adj = adj_lookup(
147+
@@ -490,7 +490,7 @@ void bgp_adj_out_set_subgroup(struct bgp_dest *dest,
148+
bgp_addpath_id_for_peer(peer, afi, safi,
149+
&path->tx_addpath));
150+
if (!adj)
151+
- return;
152+
+ return false;
153+
154+
subgrp->pscount++;
155+
}
156+
@@ -529,7 +529,7 @@ void bgp_adj_out_set_subgroup(struct bgp_dest *dest,
157+
* will never be able to coalesce the 3rd peer down
158+
*/
159+
subgrp->version = MAX(subgrp->version, dest->version);
160+
- return;
161+
+ return false;
162+
}
163+
164+
if (adj->adv)
165+
@@ -576,6 +576,8 @@ void bgp_adj_out_set_subgroup(struct bgp_dest *dest,
166+
bgp_adv_fifo_add_tail(&subgrp->sync->update, adv);
167+
168+
subgrp->version = MAX(subgrp->version, dest->version);
169+
+
170+
+ return true;
171+
}
172+
173+
/* The only time 'withdraw' will be false is if we are sending
174+
@@ -668,7 +670,7 @@ void subgroup_announce_table(struct update_subgroup *subgrp,
175+
{
176+
struct bgp_dest *dest;
177+
struct bgp_path_info *ri;
178+
- struct attr attr;
179+
+ struct attr attr = {0};
180+
struct peer *peer;
181+
afi_t afi;
182+
safi_t safi;
183+
@@ -715,19 +717,24 @@ void subgroup_announce_table(struct update_subgroup *subgrp,
184+
&attr, NULL)) {
185+
/* Check if route can be advertised */
186+
if (advertise) {
187+
- if (!bgp_check_withdrawal(bgp, dest))
188+
- bgp_adj_out_set_subgroup(
189+
- dest, subgrp, &attr,
190+
- ri);
191+
- else
192+
+ if (!bgp_check_withdrawal(bgp, dest)) {
193+
+ if (!bgp_adj_out_set_subgroup(
194+
+ dest, subgrp, &attr,
195+
+ ri))
196+
+ bgp_attr_flush(&attr);
197+
+ } else {
198+
+ bgp_attr_flush(&attr);
199+
bgp_adj_out_unset_subgroup(
200+
dest, subgrp, 1,
201+
bgp_addpath_id_for_peer(
202+
peer, afi,
203+
safi_rib,
204+
&ri->tx_addpath));
205+
- }
206+
+ }
207+
+ } else
208+
+ bgp_attr_flush(&attr);
209+
} else {
210+
+ bgp_attr_flush(&attr);
211+
/* If default originate is enabled for
212+
* the peer, do not send explicit
213+
* withdraw. This will prevent deletion
214+
@@ -947,18 +954,18 @@ void subgroup_default_originate(struct update_subgroup *subgrp, int withdraw)
215+
if (dest) {
216+
for (pi = bgp_dest_get_bgp_path_info(dest); pi;
217+
pi = pi->next) {
218+
- if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED))
219+
- if (subgroup_announce_check(
220+
- dest, pi, subgrp,
221+
- bgp_dest_get_prefix(dest),
222+
- &attr, NULL)) {
223+
- struct attr *default_attr =
224+
- bgp_attr_intern(&attr);
225+
-
226+
- bgp_adj_out_set_subgroup(
227+
- dest, subgrp,
228+
- default_attr, pi);
229+
- }
230+
+ if (!CHECK_FLAG(pi->flags, BGP_PATH_SELECTED))
231+
+ continue;
232+
+
233+
+ if (subgroup_announce_check(
234+
+ dest, pi, subgrp,
235+
+ bgp_dest_get_prefix(dest), &attr,
236+
+ NULL)) {
237+
+ if (!bgp_adj_out_set_subgroup(
238+
+ dest, subgrp, &attr, pi))
239+
+ bgp_attr_flush(&attr);
240+
+ } else
241+
+ bgp_attr_flush(&attr);
242+
}
243+
bgp_dest_unlock_node(dest);
244+
}
245+
--
246+
2.14.1
247+
248+
249+
From 761907075520aa3fae70a8d18fa717a24d3cbd65 Mon Sep 17 00:00:00 2001
250+
From: Donald Sharp <[email protected]>
251+
Date: Wed, 13 Mar 2024 10:26:58 -0400
252+
Subject: [PATCH 2/2] bgpd: Ensure that the correct aspath is free'd
253+
254+
Currently in subgroup_default_originate the attr.aspath
255+
is set in bgp_attr_default_set, which hashs the aspath
256+
and creates a refcount for it. If this is a withdraw
257+
the subgroup_announce_check and bgp_adj_out_set_subgroup
258+
is called which will intern the attribute. This will
259+
cause the the attr.aspath to be set to a new value
260+
finally at the bottom of the function it intentionally
261+
uninterns the aspath which is not the one that was
262+
created for this function. This reduces the other
263+
aspath's refcount by 1 and if a clear bgp * is issued
264+
fast enough the aspath for that will be removed
265+
and the system will crash.
266+
267+
Signed-off-by: Donald Sharp <[email protected]>
268+
---
269+
bgpd/bgp_updgrp_adv.c | 4 +++-
270+
1 file changed, 3 insertions(+), 1 deletion(-)
271+
272+
diff --git a/bgpd/bgp_updgrp_adv.c b/bgpd/bgp_updgrp_adv.c
273+
index 301a8b267e..75e377f9a1 100644
274+
--- a/bgpd/bgp_updgrp_adv.c
275+
+++ b/bgpd/bgp_updgrp_adv.c
276+
@@ -817,6 +817,7 @@ void subgroup_default_originate(struct update_subgroup *subgrp, int withdraw)
277+
struct bgp *bgp;
278+
struct attr attr;
279+
struct attr *new_attr = &attr;
280+
+ struct aspath *aspath;
281+
struct prefix p;
282+
struct peer *from;
283+
struct bgp_dest *dest;
284+
@@ -854,6 +855,7 @@ void subgroup_default_originate(struct update_subgroup *subgrp, int withdraw)
285+
/* make coverity happy */
286+
assert(attr.aspath);
287+
288+
+ aspath = attr.aspath;
289+
attr.med = 0;
290+
attr.flag |= ATTR_FLAG_BIT(BGP_ATTR_MULTI_EXIT_DISC);
291+
292+
@@ -1009,7 +1011,7 @@ void subgroup_default_originate(struct update_subgroup *subgrp, int withdraw)
293+
}
294+
}
295+
296+
- aspath_unintern(&attr.aspath);
297+
+ aspath_unintern(&aspath);
298+
}
299+
300+
/*
301+
--
302+
2.14.1
303+

src/sonic-frr/patch/series

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,3 +33,4 @@ cross-compile-changes.patch
3333
0034-fpm-Use-vrf_id-for-vrf-not-tabled_id.patch
3434
0035-fpm-ignore-route-from-default-table.patch
3535
0036-Add-support-of-bgp-l3vni-evpn.patch
36+
0037-bgp-community-memory-leak-fix.patch

0 commit comments

Comments
 (0)