[202411][FRR] fix FRR mgmtd losing configuration issue #22183

Merged: 1 commit, Apr 3, 2025

Conversation

yxieca
Contributor

@yxieca yxieca commented Mar 30, 2025

Why I did it

(this is cherry-pick of #22182)

mgmtd configuration management has a bug: any session can clear outstanding configuration when it is destroyed.

When a long-lived session is receiving configuration changes and a short-lived session that never received any configuration closes, the outstanding configuration is lost, because nothing guards the configuration clearing that runs during session close.

Work item tracking
  • Microsoft ADO (number only): 31872199

How I did it

This change tracks whether a session has received any configuration, and whether that configuration has been applied or cleared.

Outstanding configuration must be applied or cleared before the session closes; an assertion enforces this.

When the outstanding session structure is cleaned up, configuration is cleared only if the closing session has outstanding configuration.
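The guard described above can be illustrated with a small, self-contained C model. This is only a sketch of one reading of the description, not the actual mgmtd patch: `struct candidate_ds`, `struct session`, the `config_pending` flag, and every function name below are hypothetical and do not correspond to FRR identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the configuration shared by all sessions. */
struct candidate_ds {
	int pending_changes; /* edits received but not yet applied or cleared */
};

/* Hypothetical per-session state; the flag name is illustrative only. */
struct session {
	struct candidate_ds *ds;
	bool config_pending; /* set once this session receives configuration */
};

/* A session receives one configuration change. */
static void session_edit(struct session *s)
{
	s->ds->pending_changes++;
	s->config_pending = true;
}

/* Apply the outstanding configuration (modeled as emptying the counter). */
static void session_apply(struct session *s)
{
	s->ds->pending_changes = 0;
	s->config_pending = false;
}

/* Discard the outstanding configuration. */
static void session_clear(struct session *s)
{
	s->ds->pending_changes = 0;
	s->config_pending = false;
}

/* Close a session: clear only configuration this session left behind. */
static void session_close(struct session *s)
{
	if (s->config_pending)
		session_clear(s);
	/* At this point the session's configuration is applied or cleared. */
	assert(!s->config_pending);
	s->ds = NULL;
}

/* Returns the pending change count observed after an idle session closes. */
int demo_pending_after_idle_close(void)
{
	struct candidate_ds ds = { 0 };
	struct session long_lived = { &ds, false };
	struct session short_lived = { &ds, false };
	int pending;

	/* The long-lived session takes two configuration changes. */
	session_edit(&long_lived);
	session_edit(&long_lived);

	/* The short-lived session never took configuration, then closes. */
	session_close(&short_lived);
	pending = ds.pending_changes;

	/* The long-lived session applies its configuration and closes. */
	session_apply(&long_lived);
	session_close(&long_lived);
	return pending;
}
```

In this model `demo_pending_after_idle_close()` returns 2: closing the idle session leaves the long-lived session's two changes in place. Removing the `if (s->config_pending)` check from `session_close` would make it return 0, which mirrors the configuration loss described under "Why I did it".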

How to verify it

Run the config reload test on a platform with a slow CPU and excessive debug output turned on. Without the fix, the issue reproduced in more than 80% of runs; with the fix, 100% of runs passed. The fix also adds an assertion so that a configuration leak cannot go undetected.

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@yxieca yxieca requested a review from lguohan as a code owner March 30, 2025 18:03
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

Azure Pipelines successfully started running 1 pipeline(s).

@yxieca yxieca force-pushed the frr-mgmtd-202411 branch from 075280a to fdb484a Compare March 31, 2025 00:48
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

Azure Pipelines successfully started running 1 pipeline(s).

mgmtd configuration management has an issue where any session
can clean up outstanding configuration upon destruction.

When a long-lived session is taking configuration changes, and
another short-lived session which never took any configuration
closes, the outstanding configuration would be lost because
the configuration clearing doesn't have protection during session
closing.

This change keeps track if a session has received any configuration,
and if the configuration has been applied or cleared.

The outstanding configuration should be applied or cleared before
session closure (assertion).

When clearing the outstanding session structure, only attempt to
clear configuration when the closing session has outstanding
configurations.

Signed-off-by: Ying Xie <[email protected]>
@yxieca yxieca force-pushed the frr-mgmtd-202411 branch from fdb484a to 38cac9c Compare April 1, 2025 17:45
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

Azure Pipelines successfully started running 1 pipeline(s).

@kperumalbfn kperumalbfn merged commit 6921735 into sonic-net:202411 Apr 3, 2025
19 checks passed
@yxieca yxieca deleted the frr-mgmtd-202411 branch April 3, 2025 16:21
DavidZagury pushed a commit to DavidZagury/sonic-buildimage that referenced this pull request Apr 28, 2025
…-net#1052)

Code sync sonic-net/sonic-buildimage:202411 => 202412

```
*   903637b (HEAD -> code-sync-202412, origin/code-sync-202412) r12f 250424:0714 - Merge remote-tracking branch 'base/202411' into code-sync-202412
|\  
| * 367199f (base/202411) mssonicbld 250423:1901 - [submodule] Update submodule sonic-swss to the latest HEAD automatically (sonic-net#22412)
| * 0da8b10 mssonicbld 250422:1901 - [submodule] Update submodule sonic-sairedis to the latest HEAD automatically (sonic-net#22397)
| * 4700d35 mssonicbld 250418:1901 - [submodule] Update submodule sonic-swss to the latest HEAD automatically (sonic-net#22364)
| * 688b708 mssonicbld 250418:1901 - [CI]Add a pipeline to generates daily, successful virtual SONiC images (sonic-net#22368)
| * c70a09c mssonicbld 250416:1601 - [submodule] Update submodule sonic-utilities to the latest HEAD automatically (sonic-net#22329)
| * 1711475 mssonicbld 250416:0101 - [ci] Stop building slave docker for jessie and march (sonic-net#22331)
| * 8f40740 Saikrishna Arcot 250414:1755 - [202411] Update to Linux 6.1.123 (sonic-net#21736)
| * 2fd97fe mssonicbld 250411:1902 - [chassis-packet] Allow Fallback Route to get programmed on Downstream LC (sonic-net#21833)
| * f2a76d5 Volodymyr Samotiy 250410:1845 - [202411][Mellanox] Integrate HW-MGMT Version 7.0040.2207 (sonic-net#22202)
| * d9fd131 mssonicbld 250408:0920 - [submodule] Update submodule sonic-gnmi to the latest HEAD automatically (sonic-net#22258)
| * bbd6e3c mssonicbld 250407:2200 - extend frr reconnect bmp retry interval from 1-2s into 10-15s. (sonic-net#22251)
| * 3807de0 Aravind-Subbaroyan 250406:1803 - Update cisco-8000.ini (sonic-net#22247)
| * bd3a419 Feng-msft 250407:1038 - Split frr_bmp feature switch for turn on FRR side bmp tunneling via Liquid (sonic-net#22243)
| * b754560 mssonicbld 250404:1901 - [submodule] Update submodule sonic-utilities to the latest HEAD automatically (sonic-net#22232)
| * fd4d058 zitingguo-ms 250404:0123 - [Broadcom] Upgrade xgs SAI to 12.3.6.2 (sonic-net#22219)
| * 6042d69 mssonicbld 250404:0020 - [submodule] Update submodule sonic-sairedis to the latest HEAD automatically (sonic-net#22227)
| * b9a0031 mssonicbld 250404:0020 - [submodule] Update submodule sonic-swss to the latest HEAD automatically (sonic-net#22228)
| * 2bb570a sschlafman 250403:0854 - [202411] Changed SKU name to Mellanox-SN4280-C48-202411 (sonic-net#22121)
| * a06b315 sschlafman 250403:0854 - [202411] Add new T1 Mellanox-SN4280-O8V40 SKU for 202411 (sonic-net#22108)
| * 6921735 Ying Xie 250403:0716 - [FRR] fix FRR mgmtd losing configuration issue (sonic-net#22183)
| * f7cbdd9 Stepan Blyshchak 250402:2200 - [orchagent.sh] mask SIGHUP before starting orchagent (sonic-net#22208)
| * 4765324 sudhanshukumar22 250401:2234 -  [202411][FRR] Port Fix from FRR community for issue sonic-net#18493
| * 874cbf4 Sai Kiran 250401:1546 - [docker-ptf] Port changes from master (sonic-net#22185)
```

---------

Co-authored-by: Sai Kiran <[email protected]>
Co-authored-by: sudhanshukumar22 <[email protected]>
Co-authored-by: Stepan Blyshchak <[email protected]>
Co-authored-by: Ying Xie <[email protected]>
Co-authored-by: sschlafman <[email protected]>
Co-authored-by: mssonicbld <[email protected]>
Co-authored-by: zitingguo-ms <[email protected]>
Co-authored-by: Feng-msft <[email protected]>
Co-authored-by: Aravind-Subbaroyan <[email protected]>
Co-authored-by: Volodymyr Samotiy <[email protected]>
Co-authored-by: Saikrishna Arcot <[email protected]>