Changes to support bcmsh and swss logs on multi npu platforms #4783
Changes from 28 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../x86_64-arista_common/pmon_daemon_control_skip_thermalctld.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../x86_64-arista_common/pmon_daemon_control_skip_thermalctld.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../x86_64-arista_common/pmon_daemon_control_skip_thermalctld.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../x86_64-arista_common/pmon_daemon_control_skip_thermalctld.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../x86_64-arista_common/pmon_daemon_control_skip_thermalctld.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../x86_64-arista_common/pmon_daemon_control_skip_thermalctld.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../x86_64-arista_common/pmon_daemon_control_skip_thermalctld.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../x86_64-arista_common/pmon_daemon_control_skip_thermalctld.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../x86_64-arista_common/pmon_daemon_control_skip_thermalctld.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../x86_64-arista_common/pmon_daemon_control_skip_thermalctld.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../x86_64-arista_common/pmon_daemon_control_skip_thermalctld.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"skip_thermalctld": true | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
# These logs should no longer get created. However, in case they do get created, | ||
# we should keep them to a small size and rotate them also. | ||
/var/log/mail.info | ||
/var/log/mail.warn | ||
/var/log/mail.err | ||
/var/log/mail.log | ||
/var/log/daemon.log | ||
/var/log/kern.log | ||
/var/log/user.log | ||
/var/log/lpr.log | ||
/var/log/debug | ||
/var/log/messages | ||
{ | ||
size 10k | ||
rotate 1 | ||
missingok | ||
notifempty | ||
compress | ||
delaycompress | ||
sharedscripts | ||
postrotate | ||
/bin/kill -HUP $(cat /var/run/rsyslogd.pid) | ||
endscript | ||
} | ||
|
||
/var/log/auth.log | ||
/var/log/cron.log | ||
/var/log/syslog | ||
/var/log/teamd.log | ||
/var/log/telemetry.log | ||
/var/log/quagga/bgpd.log | ||
/var/log/quagga/zebra.log | ||
{% if namespaces > 1 %} | ||
{% for ns in range(namespaces) %} | ||
/var/log/swss{{ns}}/sairedis.rec | ||
/var/log/swss{{ns}}/swss.rec | ||
Review comment: I am curious about the other logs, such as quagga and teamd. Are they also running in each namespace? Are we aggregating all of them into one log file? |
Review comment: I am not a fan of using templating everywhere. I think a better solution here is to add an option to both swss and sairedis that allows specifying the log file name, so that we can use swss.asic{n}.rec and sairedis.asic{n}.rec. Then we do not need a template for the rsyslog file, and if we need to dynamically increase the number of ASICs on the platform in the future, there is no need to change this file. |
||
{% endfor %} | ||
{% else %} | ||
/var/log/swss/sairedis.rec | ||
/var/log/swss/swss.rec | ||
{% endif %} | ||
{ | ||
size 1M | ||
rotate 5000 | ||
missingok | ||
notifempty | ||
compress | ||
delaycompress | ||
nosharedscripts | ||
firstaction | ||
# Adjust NUM_LOGS_TO_ROTATE to reflect number of log files that trigger this block specified above | ||
NUM_LOGS_TO_ROTATE=8 | ||
|
||
# Adjust LOG_FILE_ROTATE_SIZE_KB to reflect the "size" parameter specified above, in kB | ||
LOG_FILE_ROTATE_SIZE_KB=1024 | ||
|
||
# Reserve space for btmp, wtmp, dpkg.log, monit.log, etc., as well as logs that | ||
# should be disabled, just in case they get created and rotated | ||
RESERVED_SPACE_KB=4096 | ||
|
||
VAR_LOG_SIZE_KB=$(df -k /var/log | sed -n 2p | awk '{ print $2 }') | ||
|
||
# Limit usable space to 90% of the partition minus the reserved space for other logs | ||
USABLE_SPACE_KB=$(( (VAR_LOG_SIZE_KB * 90 / 100) - RESERVED_SPACE_KB)) | ||
|
||
# Set our threshold so as to maintain enough space to write all logs from empty to full | ||
# Most likely, some logs will have non-zero size when this is called, so this errs on the side | ||
# of caution, giving us a bit of a cushion if a log grows quickly and passes its rotation size | ||
THRESHOLD_KB=$((USABLE_SPACE_KB - (NUM_LOGS_TO_ROTATE * LOG_FILE_ROTATE_SIZE_KB * 2))) | ||
|
||
# First, delete any *.1.gz files that might be left around from a prior incomplete | ||
# logrotate execution, otherwise logrotate will fail to do its job | ||
find /var/log/ -name '*.1.gz' -type f -exec rm -f {} + | ||
|
||
while true; do | ||
USED_KB=$(du -s /var/log | awk '{ print $1; }') | ||
|
||
if [ $USED_KB -lt $THRESHOLD_KB ]; then | ||
break | ||
else | ||
OLDEST_ARCHIVE_FILE=$(find /var/log -type f -printf '%T+ %p\n' | grep -E '.+\.[0-9]+(\.gz)?$' | sort | head -n 1 | awk '{ print $2; }') | ||
|
||
if [ -z "$OLDEST_ARCHIVE_FILE" ]; then | ||
logger -p syslog.err -t "logrotate" "No archive file to delete -- potential for filling up /var/log partition!" | ||
break | ||
fi | ||
|
||
logger -p syslog.info -t "logrotate" "Deleting archive file $OLDEST_ARCHIVE_FILE to free up space" | ||
rm -rf "$OLDEST_ARCHIVE_FILE" | ||
fi | ||
done | ||
endscript | ||
postrotate | ||
{% if namespaces > 1 %} | ||
{% for ns in range(namespaces) %} | ||
if [ $(echo $1 | grep -c "/var/log/swss{{ns}}/") -gt 0 ]; then | ||
pgrep -x orchagent | xargs /bin/kill -HUP 2>/dev/null || true | ||
Review comment: Will this kill all orchagent instances? |
Review comment: You probably do not need a template here, since logrotate passes the absolute path of the log file to this postrotate script. From that absolute path you can tell whether the platform is multi-ASIC and, if it is, which ASIC ID the log belongs to. |
Review comment: Maybe you should get into the container and do the pkill there. |
||
else | ||
/bin/kill -HUP $(cat /var/run/rsyslogd.pid) | ||
fi | ||
{% endfor %} | ||
{% else %} | ||
if [ $(echo $1 | grep -c "/var/log/swss/") -gt 0 ]; then | ||
pgrep -x orchagent | xargs /bin/kill -HUP 2>/dev/null || true | ||
else | ||
/bin/kill -HUP $(cat /var/run/rsyslogd.pid) | ||
fi | ||
{% endif %} | ||
endscript | ||
} |
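Note on the `firstaction` arithmetic above: `NUM_LOGS_TO_ROTATE` is hard-coded, while the templated file list grows with the number of namespaces. The sketch below reproduces the threshold calculation with the log count derived from the namespace count. The function names `num_logs_to_rotate` and `threshold_kb` are hypothetical and not part of this PR; the count assumes the 7 fixed paths listed in the block plus `sairedis.rec` and `swss.rec` per namespace.

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): number of log files matched by the
# second logrotate block. Assumes 7 fixed logs (auth, cron, syslog, teamd,
# telemetry, bgpd, zebra) plus sairedis.rec and swss.rec per namespace.
num_logs_to_rotate() {
    namespaces=$1
    if [ "$namespaces" -gt 1 ]; then
        echo $(( 7 + 2 * namespaces ))
    else
        # Single-NPU layout: one /var/log/swss directory with two .rec files
        echo 9
    fi
}

# Same arithmetic as the firstaction script: 90% of the partition, minus the
# reserved space, minus room for every log to grow to twice its rotation size.
# Arguments: partition size in kB, number of rotated logs.
threshold_kb() {
    var_log_size_kb=$1
    num_logs=$2
    rotate_size_kb=1024   # matches "size 1M"
    reserved_kb=4096      # matches RESERVED_SPACE_KB
    usable_kb=$(( (var_log_size_kb * 90 / 100) - reserved_kb ))
    echo $(( usable_kb - (num_logs * rotate_size_kb * 2) ))
}
```

For example, `threshold_kb 1048576 "$(num_logs_to_rotate 4)"` gives the threshold for a 1 GiB `/var/log` partition on a platform with four namespaces. In the template itself the count could instead be rendered by Jinja2, which would keep the shell script free of platform logic.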
Review comment: Do we know whether logrotate still works once the log file location is changed?
Reply: Added changes for log rotation in the latest commit.
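The inline suggestion to drop the template from `postrotate` relies on logrotate passing the absolute path of the rotated file as `$1` when `sharedscripts` is not in effect. A minimal sketch of that idea follows. The helper names `swss_namespace_of` and `rotate_target` are hypothetical, the `/var/log/swss{N}/` layout is taken from the diff, and how the matching orchagent instance is actually signalled (for example from inside its container) is left open here.

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): print the namespace/ASIC id embedded in
# a rotated log path such as /var/log/swss3/sairedis.rec, or nothing for the
# single-ASIC path and for unrelated logs.
swss_namespace_of() {
    case "$1" in
        /var/log/swss[0-9]*/*)
            dir=${1#/var/log/swss}   # strip the prefix, e.g. "3/sairedis.rec"
            echo "${dir%%/*}"        # keep everything before the first slash
            ;;
        *)
            echo ""
            ;;
    esac
}

# Hypothetical helper: decide which daemon a postrotate script would signal
# for the given path. Prints "orchagent:<id>", "orchagent", or "rsyslogd".
rotate_target() {
    case "$1" in
        /var/log/swss[0-9]*/*) echo "orchagent:$(swss_namespace_of "$1")" ;;
        /var/log/swss/*)       echo "orchagent" ;;
        *)                     echo "rsyslogd" ;;
    esac
}
```

With this shape a single untemplated `postrotate` body could call `rotate_target "$1"` and branch on the result, so adding ASICs would not require regenerating the script. The pattern only checks that the directory name starts with a digit after `swss`, so a stricter match may be wanted in practice.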