-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Filesystems - eMMC wearout mitigation #2774
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Some platform's flash is eMMC, this can be ok for some NOS but SONiC is very write intensive, especially logging. This patch aims to prolong the life of these eMMC flash drives by moving the /var/log filesystem from a loop device to a tmpfs RAMdrive The patch checks the kernel boot variables for the presence of an MMC module string and modifies the initramfs build to mount the /var/log filesystem as a ramdisk. The check is crude and has only been tested on our only known affected platform, we are not sure it may work on all eMMC based flash storage. Caveat: you lose all stored logs at each reboot on eMMC platforms so configure your rsyslogs to log outside the box Signed-off-by: Michel Moriniaux <[email protected]>
## Mount loop device for /var/log | ||
[ -f ${rootmnt}/host/disk-img/var-log.ext4 ] && mount -t ext4 -o loop,rw ${rootmnt}/host/disk-img/var-log.ext4 ${rootmnt}/var/log | ||
## Mount loop device for /var/log or a tmpfs device if the flash is eMMC | ||
if [ -z ${eMMC+1} ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of the "+1" here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originaly left it out but got pointed to https://stackoverflow.com/questions/3601515/how-to-check-if-a-variable-is-set-in-bash where it is stated that its a safer check
@@ -17,6 +17,9 @@ for x in "$@"; do | |||
case "$x" in | |||
varlog_size=*) | |||
varlog_size="${x#varlog_size=}" | |||
;; | |||
*mmc_host*) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this change depends on a "mmc_host" kernel command line parameter. I agree with your description that this is a crude check. Is there a way for a vendor to override this for their platform (or for an end-user building an image, for that matter) in the case it is not desired? What is populating the "mmc_host" kernel command line parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my device the actual kernel command line parameter is:
block_flash=pci0000:00/0000:00:14.7/mmc_host/.*$
instead of looking for block_flash I'm looking for an actual MMC discriminator hence the *mmc_host*
again this may need more research to find a robust way of detecting that the image depends on flash and its not just a usb key with a eMMC component that is inserted and not used
Of all the platforms I have in my lab ( 6+ ) this is the only one I have found this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as for your question on overriding: maybe introduce a rules/config variable and check on that?
our vendor's recommendation for this change makes it clear that there is a real danger for the longevity of the switch if the default behavior is left as-is
Some platform's flash is eMMC, this can be ok for some NOS but SONiC is very write intensive, especially logging. This patch aims to prolong the life of these eMMC flash drives by moving the /var/log filesystem from a loop device to a tmpfs RAMdrive The patch checks the kernel boot variables for the presence of an MMC module string and modifies the initramfs build to mount the /var/log filesystem as a ramdisk. The check is crude and has only been tested on our only known affected platform, we are not sure it may work on all eMMC based flash storage. Caveat: you lose all stored logs at each reboot on eMMC platforms so configure your rsyslogs to log outside the box Signed-off-by: Michel Moriniaux <[email protected]>
@MichelMoriniaux Your thoughts and solution are excellent. I am very sorry to ask you hold your change and wait for one from us then put your addition on it. For some reason that is no longer valid today, we had to keep a change internal to Microsoft in the past. Which essentially did the same thing you did, but with a slightly different approach. We decide whether or not to mount tmpfs /var/log partition with what size base on the device platform string. We did see the benefit of your change, which makes all platforms with certain hard drive take the same approach. It is simple and elegant. However, there is a good reason to take the platform string approach. While we were working on this issue, we discovered that different platforms, when reaching initramfs stage, have different amount of RAM available for mounting tmpfs. Some platform can only mount a very small /var/log partition through this stage. Because of this reason, we decide the size of /var/log partition base on platform string. Since we no longer have the reason to keep our similar change internal, and sorry for being lazy not upstream-ing it earlier. We will upstream the tmpfs /var/log partition change soon-ish, and please add the platforms you would like to mount tmpfs /var/log to the list. Thanks so much for contributing to SONiC, and apologize for the inconvenience, |
@yxieca thank you very much for your comment, indeed my internal build uses the platform string and I decided to make it more generic ( anonymized really ) for I'm pretty sure the same reasons as you did: to as not to expose a particular vendor's implementation to the public at large. |
@MichelMoriniaux Please share your thoughts on #2780. Thanks & Regards, |
as per discussion in sonic-net#2780 and sonic-net#2774 added the 7060 platform to the list of hosts Signed-off-by: Michel Moriniaux <[email protected]>
Some platform's flash is eMMC, this can be ok for some NOS but SONiC is
very write intensive, especially logging. This patch aims to
prolong the life of these eMMC flash drives by moving the /var/log
filesystem from a loop device to a tmpfs RAMdrive
The patch checks the kernel boot variables for the presence of an MMC
module string and modifies the initramfs build to mount the /var/log
filesystem as a ramdisk. The check is crude and has only been tested on
our only known affected platform, we are not sure it may work on all
eMMC based flash storage.
Caveat: you lose all stored logs at each reboot on eMMC platforms so
configure your rsyslogs to log outside the box
Signed-off-by: Michel Moriniaux [email protected]