Commit 0e6a55e

[fast-reboot] Backup database after syncd/swss stopped (sonic-net#3342)
- What I did

Back up the DB after syncd and swss are stopped. I observed an issue with fast-reboot: in rare circumstances, a queued FDB event might be written to ASIC_DB by a thread inside syncd after the call to FLUSHDB on ASIC_DB was made. That left ASIC_DB with only a single record, for that FDB entry, and caused syncd to crash at start:

Mar 15 13:28:42.765108 sonic NOTICE syncd#SAI: :- Syncd: syncd started
Mar 15 13:28:42.765268 sonic NOTICE syncd#SAI: :- onSyncdStart: performing hard reinit since COLD start was performed
Mar 15 13:28:42.765451 sonic NOTICE syncd#SAI: :- readAsicState: loaded 1 switches
Mar 15 13:28:42.765465 sonic NOTICE syncd#SAI: :- readAsicState: switch VID: oid:0x21000000000000
Mar 15 13:28:42.765465 sonic NOTICE syncd#SAI: :- readAsicState: read asic state took 0.000205 sec
Mar 15 13:28:42.766364 sonic NOTICE syncd#SAI: :- onSyncdStart: on syncd start took 0.001097 sec
Mar 15 13:28:42.766376 sonic ERR syncd#SAI: :- run: Runtime error during syncd init: map::at
Mar 15 13:28:42.766376 sonic NOTICE syncd#SAI: :- sendShutdownRequest: sending switch_shutdown_request notification to OA for switch: oid:0x0
Mar 15 13:28:42.766518 sonic NOTICE syncd#SAI: :- sendShutdownRequestAfterException: notification send successfully

- How I did it

Back up the DB after syncd/swss have stopped.

- How to verify it

Run fast-reboot.

Signed-off-by: Stepan Blyschak <[email protected]>
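The ordering change described above can be illustrated with a small stand-alone simulation. This is only a sketch: it does not touch real services or Redis, and `stop_service`, `run_shutdown`, the `EVENTS` variable, and the service list `swss syncd` are hypothetical stand-ins chosen for illustration. The `backup_database` here is a stub that only records the flush and dump steps.

```shell
#!/bin/sh
# Hypothetical simulation of the shutdown ordering; no real services or
# databases are involved. EVENTS records what happened, in order.
EVENTS=""

record() {
    # Append one event, space-separated, without a leading space.
    EVENTS="${EVENTS:+$EVENTS }$1"
}

stop_service() {
    # Stand-in for stopping a service.
    record "stop:$1"
}

backup_database() {
    # Stub for the real function: flush first, then dump.
    record "flush:ASIC_DB"
    record "backup"
}

run_shutdown() {
    EVENTS=""
    for service in "$@"; do
        stop_service "$service"
    done
    # The flush and backup run only after every listed service has stopped,
    # so no still-running writer can add a record after the flush.
    backup_database
}

run_shutdown swss syncd
echo "$EVENTS"
```

With the flush placed after the loop, every `stop:` event precedes `flush:ASIC_DB` in the recorded order, which is the property the commit relies on.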
1 parent c51758d commit 0e6a55e

1 file changed: +15 -14 lines changed

scripts/fast-reboot

@@ -244,6 +244,19 @@ function wait_for_pre_shutdown_complete_or_fail()
 function backup_database()
 {
     debug "Backing up database ..."
+
+    if [[ "$REBOOT_TYPE" = "fastfast-reboot" || "$REBOOT_TYPE" = "fast-reboot" ]]; then
+        # Advanced reboot: dump state to host disk
+        sonic-db-cli ASIC_DB FLUSHDB > /dev/null
+        sonic-db-cli COUNTERS_DB FLUSHDB > /dev/null
+        sonic-db-cli FLEX_COUNTER_DB FLUSHDB > /dev/null
+    fi
+
+    if [[ "$REBOOT_TYPE" = "fast-reboot" ]]; then
+        # Flush RESTAP_DB in fast-reboot to avoid stale status
+        sonic-db-cli RESTAPI_DB FLUSHDB > /dev/null
+    fi
+
     # Dump redis content to a file 'dump.rdb' in warmboot directory
     mkdir -p $WARM_DIR
     # Delete keys in stateDB except FDB_TABLE|*, MIRROR_SESSION_TABLE|*, WARM_RESTART_ENABLE_TABLE|*, FG_ROUTE_TABLE|*
@@ -806,23 +819,11 @@ for service in ${SERVICES_TO_STOP}; do
             wait_for_pre_shutdown_complete_or_fail
         fi
 
-        if [[ "$REBOOT_TYPE" = "fastfast-reboot" || "$REBOOT_TYPE" = "fast-reboot" ]]; then
-            # Advanced reboot: dump state to host disk
-            sonic-db-cli ASIC_DB FLUSHDB > /dev/null
-            sonic-db-cli COUNTERS_DB FLUSHDB > /dev/null
-            sonic-db-cli FLEX_COUNTER_DB FLUSHDB > /dev/null
-        fi
-
-        if [[ "$REBOOT_TYPE" = "fast-reboot" ]]; then
-            # Flush RESTAP_DB in fast-reboot to avoid stale status
-            sonic-db-cli RESTAPI_DB FLUSHDB > /dev/null
-        fi
-
-        backup_database
-
     fi
 done
 
+backup_database
+
 # Stop the docker container engine. Otherwise we will have a broken docker storage
 systemctl stop docker.service || debug "Ignore stopping docker service error $?"
 
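As a quick way to reason about which databases the relocated block clears for each reboot type, the conditionals in the diff can be mirrored by a dry-run helper. This is a sketch only: `dbs_to_flush` is a hypothetical name that does not exist in `scripts/fast-reboot`, it prints database names instead of calling `sonic-db-cli`, and it covers only the flushes visible in this diff.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the databases that the flush block
# shown in this diff would clear for a given reboot type. Nothing is flushed.
dbs_to_flush() {
    case "$1" in
        fast-reboot)
            # Matches both conditionals in the diff.
            echo "ASIC_DB COUNTERS_DB FLEX_COUNTER_DB RESTAPI_DB" ;;
        fastfast-reboot)
            # Matches only the first conditional.
            echo "ASIC_DB COUNTERS_DB FLEX_COUNTER_DB" ;;
        *)
            # No flush in the shown block for other reboot types.
            echo "" ;;
    esac
}

for reboot_type in fast-reboot fastfast-reboot warm-reboot; do
    printf '%s: %s\n' "$reboot_type" "$(dbs_to_flush "$reboot_type")"
done
```

The set of databases per reboot type is the same before and after this commit; only the point in the shutdown sequence at which they are flushed changes.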
