[System logs]: Fix logrotate bugs #535
missingok
notifempty
compress
delaycompress
sharedscripts
postrotate
    invoke-rc.d rsyslog rotate > /dev/null
invoke-rc.d rsyslog rotate > /dev/null
In /etc/init.d/rsyslog, the rotate action does the kill -HUP. Why not use invoke-rc.d rsyslog rotate?
It was not working properly. It was confirmed as a bug present in our version of init-system-helpers: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=672218.
There seem to be two workarounds:
service rsyslog rotate >/dev/null 2>&1 || true
kill -HUP $(cat /var/run/rsyslogd.pid)
I have tested the latter solution. I did not try the former, as the || true seems a bit hacky. I can test it if you'd like.
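For illustration, the second workaround could be wrapped so that a missing or empty pid file does not make the postrotate script fail. This is only a sketch: the helper name hup_from_pidfile is hypothetical and is not part of this change, and the choice to return success when there is nothing to signal is an assumption about what is wanted here.

```shell
# Hypothetical helper (not from this PR): send SIGHUP to the process
# whose pid is stored in the given pid file.
# Returns 0 when the pid file is missing, empty, or the signal cannot
# be delivered, so a stopped daemon does not fail the rotation.
hup_from_pidfile() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 0      # no pid file: nothing to signal
    pid="$(cat "$pidfile")"
    [ -n "$pid" ] || return 0          # empty pid file: nothing to signal
    kill -HUP "$pid" 2>/dev/null || return 0
}
```

In a postrotate block this would be called as hup_from_pidfile /var/run/rsyslogd.pid, the same pid file used in the workaround above.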
as comments
/etc/init.d/rsyslog rotate > /dev/null
solved this problem for me some time ago; not sure whether it still does.
* src/sonic-utilities ee56d54...cb0e745 (11): > sonic_utilities: Support for DOM Threshold values for EEPROM dump (#545) > [portstat] Fix portstat show RX_UTIL over 100% for 100G (#563) > sonic_installer: fix read-only filesystem support for firmware update (#565) > Revert "show acl table command output should show binding column correctly even with single port (#447)" (#589) > show acl table command output should show binding column correctly even with single port (#447) > [config] Do no stop or restart dependent services (#582) > sfpshow: prevent 'show int trans eeprom --dom' from crashing (#567) > [warm-reboot] add docker upgrade --warm option and roll back support (#559) > [ecnconfig] Validate input WRED parameters (#579) > [sonic-utilities] Add fstrim to reboot (#535) > Fixing the expected neighbor command due to change in output format under sonic-buildimage/pull/3036 (#584)
Signed-off-by: Harish Venkatraman <[email protected]>
* 5337490 2019-11-22 | Send port status notification when creating hostif interface (sonic-net#535) [Kamil Cudnik] Signed-off-by: Guohan Lu <[email protected]>
* Pospone QueueMap initialization until activation of counters * Generate queue maps only for front panel ports * Create empty buffer lists by default
[schema] Add EXP to TC map config table (sonic-net#537) [CI]: Swap the python code coverage report with the cpp report (sonic-net#544) Remove SWIG generated files from coverage report (sonic-net#542) Update database defintions for PINs / P4Runtime (sonic-net#536) [ci]: Support code coverage (sonic-net#539) Fix the option missing in kernel config issue (sonic-net#541) Add SRV6 APP tables (sonic-net#538) [schema] Rename CBF config tables (sonic-net#535)
…D automatically (#20957) #### Why I did it src/sonic-platform-daemons ``` * 57f0448 - (HEAD -> master, origin/master, origin/HEAD) [lag_id] Add lagid to free_list when LC absent for 30 minutes (#542) (9 hours ago) [Marty Y. Lok] * 3624cb7 - [stormond] Added new dynamic field 'last_sync_time' to STATE_DB (#535) (21 hours ago) [Ashwin Srinivasan] * 0431fa3 - Addition of DPU Chassis for thermalctld (#564) (24 hours ago) [Gagan Punathil Ellath] ``` #### How I did it #### How to verify it #### Description for the changelog
…1070 [action] [PR:21070] [sonic-mgmt-docker-image] Support ptf dataplane packet poll with multiple ptf nn agents connection
…et#535) <!-- Provide a general summary of your changes in the Title above --> #### Description <!-- Describe your changes in detail --> For some transceivers, the `MaxDurationDPInit` encoded in EEPROM might be lesser than the actual time taken to reach DP init state by the transceivers. This causes the CMIS initialization to fail with the below error ``` Jan 23 19:28:12.207053 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: 400G, lanemask=0xff, state=DP_INIT, appl 1 host_lane_count 8 retries=0 Jan 23 19:28:12.221832 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: DpInit duration 0.5 secs Jan 23 19:28:13.230593 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: 400G, lanemask=0xff, state=DP_TXON, appl 1 host_lane_count 8 retries=0 Jan 23 19:28:13.234448 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: timeout for 'DataPathInitialized' <<-- Error Jan 23 19:28:14.250133 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: 400G, lanemask=0xff, state=INSERTED, appl 1 host_lane_count 8 retries=1 Jan 23 19:28:14.315148 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: Setting appl=1 Jan 23 19:28:14.380452 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: Setting host_lanemask=0xff Jan 23 19:28:14.511850 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: Setting media_lanemask=0xf Jan 23 19:28:14.567454 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: force Datapath reinit Jan 23 19:28:15.575785 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: 400G, lanemask=0xff, state=DP_DEINIT, appl 1 host_lane_count 8 retries=1 Jan 23 19:28:15.598574 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: DpDeinit duration 0.1 secs, modulePwrUp duration 10.0 secs Jan 23 19:28:16.606588 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: 400G, lanemask=0xff, state=AP_CONFIGURED, appl 1 host_lane_count 8 retries=1 Jan 23 19:28:17.640929 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: 400G, lanemask=0xff, state=DP_INIT, appl 1 host_lane_count 8 retries=1 Jan 23 19:28:17.656010 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: DpInit duration 0.5 
secs Jan 23 19:28:18.664305 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: 400G, lanemask=0xff, state=DP_TXON, appl 1 host_lane_count 8 retries=1 Jan 23 19:28:18.668263 DUT1 NOTICE pmon#xcvrd[36]: CMIS: Ethernet96: timeout for 'DataPathInitialized' <<-- Error ``` Hence, we need to relax this stringent requirement from software to successfully allow CMIS initialization for such transceivers. #### Motivation and Context <!-- Why is this change required? What problem does it solve? If this pull request closes/resolves an open Issue, make sure you include the text "fixes #xxxx", "closes #xxxx" or "resolves #xxxx" here --> For transceivers with `MaxDurationDPInit` <= 1s, the CMIS driver will increase the timeout value to 10 times of the corresponding `MaxDurationDPInit` value advertised in the EEPROM to allow the transceiver to reach DP init state successfully. #### How Has This Been Tested? <!-- Please describe in detail how you tested your changes. Include details of your testing environment, and the tests you ran to see how your change affects other areas of the code, etc. --> For a transceiver with `MaxDurationDPInit` = 0.5s, the `get_datapath_init_duration` API returned a value of 5s. ``` PATELMI@DUT1:~$ sudo sfputil read-eeprom -p Ethernet96 -n 1 -o 144 -s 1 00000090 45 |E| PATELMI@DUT1:~$ . . . 
Jan 24 06:09:29.896666 DUT1 NOTICE pmon#xcvrd[101451]: CMIS: Ethernet96: 400G, lanemask=0xff, state=DP_INIT, appl 1 host_lane_count 8 retries=0 Jan 24 06:09:29.912270 DUT1 NOTICE pmon#xcvrd[101451]: CMIS: Ethernet96: DpInit duration 5.0 secs Jan 24 06:09:30.920209 DUT1 NOTICE pmon#xcvrd[101451]: CMIS: Ethernet96: 400G, lanemask=0xff, state=DP_TXON, appl 1 host_lane_count 8 retries=0 Jan 24 06:09:32.957135 DUT1 NOTICE pmon#xcvrd[101451]: message repeated 2 times: [ CMIS: Ethernet96: 400G, lanemask=0xff, state=DP_TXON, appl 1 host_lane_count 8 retries=0] Jan 24 06:09:32.957135 DUT1 NOTICE pmon#xcvrd[101451]: CMIS: Ethernet96: Turning ON tx power Jan 24 06:09:33.964916 DUT1 NOTICE pmon#xcvrd[101451]: CMIS: Ethernet96: 400G, lanemask=0xff, state=DP_ACTIVATION, appl 1 host_lane_count 8 retries=0 Jan 24 06:09:33.968630 DUT1 NOTICE pmon#xcvrd[101451]: CMIS: Ethernet96: READY ``` #### Additional Information (Optional) MSFT ADO - 31007979
rsyslog logs were being rotated regardless of whether they exceeded their maximum size. This was due to the "-f" flag passed to logrotate in the cron job.
After rotation, /var/log/syslog was never written to again. Instead, logs were written to /var/log/syslog.1. This was due to rsyslog not properly closing the file descriptor to the pre-rotated log.
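The file-descriptor behavior behind this bug can be reproduced in isolation. The sketch below uses a temporary directory and plain shell redirection in place of rsyslog, so it only illustrates the mechanism and is not how rsyslog itself is implemented: a writer that keeps its descriptor open across a rename keeps writing to the renamed file.

```shell
# Work in a scratch directory so no real logs are touched.
dir="$(mktemp -d)"

# The "daemon" opens the log once and keeps descriptor 3 open.
exec 3>>"$dir/syslog"
echo "before rotation" >&3

# Rotation renames the file; descriptor 3 still points at the same inode.
mv "$dir/syslog" "$dir/syslog.1"

# This write lands in syslog.1, and no new syslog file is created.
echo "after rotation" >&3
exec 3>&-

cat "$dir/syslog.1"   # prints both lines
```

Having the writer close and reopen its log file after the rename, which is what the HUP signal is used for in this change, makes new messages go to a fresh syslog file.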
Also addresses issue systemd-journald[224]: [/etc/systemd/journald.conf:21] Failed to parse size value, ignoring: #528
Also brought back time-based rotation via the new(er) `maxsize` option, which performs a boolean OR: if the log exceeds `maxsize` OR the log hasn't been rotated within the specified interval, it will be rotated. With the older `size` option, time-based rotation was ignored.
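The difference between the two options can be written out as a toy decision model. This is not logrotate's code; the function names and the argument order (size and limit in bytes, age and interval in days) are hypothetical and chosen only for illustration.

```shell
# Toy model of `maxsize`: rotate when the log is over the limit
# OR the rotation interval has elapsed.
should_rotate_maxsize() {
    size="$1"; limit="$2"; age="$3"; interval="$4"
    [ "$size" -gt "$limit" ] || [ "$age" -ge "$interval" ]
}

# Toy model of `size`: only the size test is consulted,
# so the age of the log never triggers a rotation.
should_rotate_size() {
    size="$1"; limit="$2"
    [ "$size" -gt "$limit" ]
}

# A small log that is older than its interval:
should_rotate_maxsize 10 100 8 7 && echo "maxsize: rotate"
should_rotate_size 10 100 || echo "size: skip"
```

With these inputs the first call reports a rotation because the interval has elapsed, while the second skips it because the log is still under the limit.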