-
Notifications
You must be signed in to change notification settings - Fork 159
Overlay redesign #1097
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Overlay redesign #1097
Conversation
Thank you for contributing to the Leapp project!Please note that every PR needs to comply with the Leapp Guidelines and must pass all tests in order to be mergeable.
Packit will automatically schedule regression tests for this PR's build and latest upstream leapp build. If you need a different version of leapp from PR#42, use To launch regression testing public members of oamg organization can leave the following comment:
Please open ticket in case you experience technical problem with the CI. (RH internal only) Note: In case there are problems with tests not being triggered automatically on new PR/commit or pending for a long time, please contact leapp-infra. |
c1a2851
to
13f682b
Compare
|
||
Note: the partition hosting the scratch dir is expected to be the same | ||
partition that is hosting the target userspace container, but it does not | ||
have to be true if the code changes. Right now, let's live with that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just out of curiosity, do we have any plans for the code changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question is not whether yes or no. Important is to be aware that we have some functionality implemented in the way that if a change happens, this will be broken with the current implementation. No, we do not plan changes here. But who knows in years?
with the missing space. | ||
""" | ||
|
||
_MAGICAL_CONSTANT_MIN_CONTAINER_SIZE = 2200 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that we probably do not want to get back to affecting the size numbers like in the "LEAPP_OVL_SIZE" but could not some way to affect the numbers help in some edge cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my checks, this seems really like the reasonable minimum. Better questions is whether we could allow people to set higher value on their own. I am thinking about that. But not sure how good the idea is. Imagine how to document that variable in the documentation for users without explanation what we are doing - as it would need a documentation :/ anyway, the change is possible and simple.
Also, I am thinking about the testing - if we deliver the possibility, maybe it will be problematic from the testing POV. not sure sure. Opened for the discussion: @oamg/developers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add env vars that would allow modifying this particular constant, but I would argue not to put it into documentation anywhere. If we gather empirical data that there are people that are hitting this problem, then we can put it into documentation and deal with the testing situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the size has been changed to 3200 for now. most likely it will be lowered at least for IPU 8 -> 9. Just for now, keepin like that to stay safe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
used different values for each RHEL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about the problem at hand, I've come to the conclusion the overall approach taken in this PR is legitimate and logical. The only part that might be interesting to discuss is the heuristic - it makes me wonder how well predictable is the container size. It would be super exciting to come up with a model that would be able to predict target userspace size dynamically based on the source system.
8ded629
to
bcfdd4e
Compare
So I've discovered that my previous information was wrong. It seems that the minimal container size is
additionally I've discovered confusing errrors when trying to install rpms inside the container to be able to generate initramfs as such operation could end with information about missing disk space on |
5f6d74a
to
221a552
Compare
a0c96b0
to
1b35e6c
Compare
Current space consumption and checks of the free space tested several times close to what I call close to minimal reasonable free space on /var - ending with 945MB of the free space for the upgrade - which is nearly boundary of the minimal required free space to be able to perform the upgrade transaction yet. |
The estimated required disk space is optimized for IPU 8 -> 9 (reduced by 1GiB - based on experiments). |
e6776c8
to
dc0c18d
Compare
def _handle_transaction_err_msg_size(err): | ||
if get_env('LEAPP_OVL_LEGACY', '0') == '1': | ||
_handle_transaction_err_msg_old(err) | ||
return # not needed actually as the above function raises error, but for visibility | ||
NO_SPACE_STR = 'more space needed on the' | ||
|
||
# Disk Requirements: | ||
# At least <size> more space needed on the <path> filesystem. | ||
# | ||
missing_space = [line.strip() for line in err.stderr.split('\n') if NO_SPACE_STR in line] | ||
size_str = re.match(r'At least (.*) more space needed', missing_space[0]).group(1) | ||
message = 'There is not enough space on the file system hosting /var/lib/leapp.' | ||
hint = ( | ||
'Increase the free space on the filesystem hosting' | ||
' /var/lib/leapp by {} at minimum. It is suggested to provide' | ||
' reasonably more space to be able to perform all planned actions' | ||
' (e.g. when 200MB is missing, add 1700MB or more).\n\n' | ||
'It is also a good practice to create dedicated partition for' | ||
' for /var/lib/leapp when more space is needed, which can be' | ||
' dropped after the system upgrade is fully completed' | ||
' For more info, see: {}' | ||
.format(size_str, DEDICATED_URL) | ||
) | ||
# we do not want to confuse customers by the orig msg speaking about | ||
# missing space on '/'. Skip the Disk Requirements section. | ||
# The information is part of the hint. | ||
details = {'hint': hint} | ||
|
||
raise StopActorExecutionError(message=message, details=details) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure whether this is actually correct. I am afraid that I maybe missed a corner case here. Imagine this partitioning:
# mount point: free space
/: 10 GB
/var: 700MB (e.g.)
/var/lib/leapp: 10 GB
In this case, the disk img for /var is located under the leapp dir, but it's apparent size will be only ~ 665MB. Currently we need to download ~160M data (metadata + packages). And some space is consumed running dnf/rpm. This all is stored inside the OVL layer. Imagine we need to download more pkgs (/var cannot have less than 500M otherwise it crashes during the scan phase when we call yum :)), in this only one situation, I think we will have problem with the msg, as it will be actually affected by the size of /var and not /var/lib/leapp. But who would actually try to upgrade when having only such small space on /var ??? So it's from my POV really total corner case that should not happen. Also it needs to be differenciated, whether the problem is caused during pkgs download or calculation whether there is enough disk space for the installation of the downloaded packages (both cases has different answer where the space is missing).
Opposite problem could appear if /var/lib/leapp is not big enough and we consume for the pkg download all remaining free space. But regarding the implemented checks, I do not believe this situation will happen (it would mean that someone would like to download couple of GBs pkgs to install super big container right now. and that does not seem meaningful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oamg/developers ^ ping. do not miss this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only scenario that I can image this to happen is that someone has small /var. The first leapp execution crashes, lacking required disk space. The user then deals with this problem by creating a separate /var/lib/leapp partition without increasing the size of /var. The wording of this error message should not create the impression that a separate /var/lib/leapp will not deal with the case when the /var partition is too small.
I've also checked the internet for recommended partitioning. RHEL8 partitioning guide recommends that 5gb
is enough for /var
. RHEL7 guide does not mention a precise number, but says it should be "sufficiently large", whatever that means. Other advice on the internet states that /var
should be at least 2gb in size.
I've also went trough vanilla RHEL images on Azure and they all have /var
5gb+ in size (I remember encountering some issues around /var space there before).
Therefore, I do not think that this corner case will be encountered.
0fea93b
to
1246660
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I will approve when I try it out manually.
@@ -149,13 +150,84 @@ def _update_files(copy_files): | |||
_copy_files(context, files) | |||
|
|||
|
|||
def _get_fspace(path, convert_to_mibs=False, coefficient=1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like a simple multiplication being hidden in a function. If something needs the free space but slightly warped with some coefficient then the code should do the multiplication itself. Free space is free space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the convert_to_mibs
argument is kindof pointless, or? The only time the function is called without convert_to_mibs=True is when the other number that is being compared against the free space is converted from mbs to bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this case, let's rather add TODO and discuss it later - but before the upstream release
It is not expected to hit this issue, but I was successful and I know | ||
it's still possible even with all other changes (just it's much harder | ||
now to hit it). So adding this seatbelt, that is not 100% bulletproof, | ||
but I call it good enough. | ||
|
||
Currently protecting last 500MB. In case of problems, we can increase | ||
the value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like saving additional context of the code in the comments, but I am not sure how to feel about putting it in documentation strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point. let's discuss it later - but before the release. we can still change it before the mid Aug.
repos/system_upgrade/common/actors/targetuserspacecreator/libraries/userspacegen.py
Outdated
Show resolved
Hide resolved
...ade/common/actors/initramfs/upgradeinitramfsgenerator/libraries/upgradeinitramfsgenerator.py
Outdated
Show resolved
Hide resolved
...ade/common/actors/initramfs/upgradeinitramfsgenerator/libraries/upgradeinitramfsgenerator.py
Outdated
Show resolved
Hide resolved
I've performed a 7>8 upgrade using internal vagrant box, everything went fine. |
… cache If leapp is executed with LEAPP_DEVEL_USE_PERSISTENT_PACKAGE_CACHE=1, the /var/dnf/cache from the target container has been copied under /var/lib/leapp/persistent_package_cache The negative effect was that it took too much space on the disk (800+ MBs, depends on how much rpms have been downloaded before..) which could lead easily to the consumed disk space on related partition, which eventually could stop also the leapp execution as it cannot do any meaningful operations when the disk is full (e.g. access the database). This is done now without nspawn context functions as the move operation does not make so much sense to be implemented as it's more expected to copy to/from the container than moving files/dirs.
We are going to redesign the use of overlay images during the upgrade to resolve number of issues we have with the old solution. However, we need to keep the old solution as a fallback (read below). This is small preparation to keep the new and old code separated safely. Reasoning for the fallback: * There is a chance the new solution could raise also some problems mainly for systems with many partitions/volumes in fstab, or when they are using many loop devices already - as the new solution will require to create loop device for each partition/volume noted in the fstab. * Also RHEL 7 is going to switch to ELS on Jun 2024 after which the project will be fixing just critical bugfixes for in-place upgrades. This problem blocking the upgrade is not considered to be critical.
All LEAPP_* envars are supposed to be read by library function which ensures persistent behaviour during the whole upgrade process.
The in-place upgrade itself requires to do some changes on the system to be able to perform the in-place upgrade itself - or even to be able to evaluate if the system is possible to upgrade. However, we do not want to (and must not) change the original system until we pass beyond the point of not return. For that purposes we have to create a layer above the real host file system, where we can safely perform all operations without affecting the system setup, rpm database, etc. Currently overlay (OVL) technology showed it is capable to handle our requirements good enough - with some limitations. However, the original design we used to compose overlay layer above the host system had number of problems: * buggy calculation of the required free space for the upgrade RPM transaction * consumed too much space to handle partitions formatted with XFS without ftype attributes (even tens GBs) * bad UX as people had to manually adjust size of OVL disk images * .. and couple of additional issues derivated from problems listed above The new solution prepares a disk image (represented by sparse-file) and an overlay image for each mountpoint configured in /etc/fstab, excluding those with FS types noted in the `OVERLAY_DO_NOT_MOUNT` set. Such prepared OVL images are then composed together to reflect the real host filesystem. In the end everything is cleaned. The composition could look like this: orig mountpoint -> disk img -> overlay img -> new mountpoint ------------------------------------------------------------- / -> root_ -> root_/ovl -> root_/ovl/ /boot -> root_boot -> root_boot/ovl -> root_/ovl/boot /var -> root_var -> root_var/ovl -> root_/ovl/var /var/lib -> root_var_lib -> root_var_lib/ovl -> root_/ovl/var/lib ... The new solution can be now problematic for system with too many partitions and loop devices, as each disk image is loop mounted (that's same as before, but number of disk images will be bigger in total number). For such systems we keep for now the possibility of the fallback to an old solution, which has number of issues mentioned above, but it's a trade of. To fallback to the old solution, set envar: LEAPP_OVL_LEGACY=1 Disk images created for OVL are formatted with XFS by default. In case of problems, it's possible to switch to Ext4 FS using: LEAPP_OVL_IMG_FS_EXT4=1 XFS is better optimized for our use cases (faster initialisation consuming less space). However we have reported several issues related to overlay images, that happened so far only on XFS filesystems. We are not sure about root causes, but having the possibility to switch to Ext4 seems to be wise. In case of issues, we can simple ask users to try the switch and see if the problem is fixed or still present. Some additional technical details about other changes * Added simple/naive checks whether the system has enough space on the partition hosting /var/lib/leapp (usually /var). Consuming the all space on the partition could lead to unwanted behaviour - in the worst case if we speak about /var partition it could mean problems also for other applications running on the system * In case the container is larger than the expected min default or the calculation of the required free space is lower than the minimal protected size, return the protected size constant (200 MiB). * Work just with mountpoints (paths) in the _prepare_required_mounts() instead of with list of MountPoint named tuple. I think about the removal of the named tuple, but let's keep it for now. * Make apparent size of created disk images 5% smaller to protect failed upgrades during the transaction execution due to really small amount of free space. * Cleanup the scratch directory at the end to free the consumed space. Disks are kept after the run of leapp when LEAPP_DEVEL_KEEP_DISK_IMGS=1
With the redesigned overlay solution, original error messages are misleading. Keeping original error msgs when LEAPP_OVL_LEGACY=1. Also handle better the error msgs generated when installing initramfs dependencies. In case of the missing space the error has been unhandled. Now it is handled with the correct msg also.
…generation Under rare conditions it's possible the last piece free space is consumed when the upgrade initramfs is generated. It's hard to hit this problems right now without additional customisations that consume more space than we expect. However, when it happens, it not good situation. From this point, check the remaining free space on the FS hosting the container. In case we have less than 500MB, do not even try. Possibly we will increase the value in future, but consider it good enough for now.
Regarding the changes around source OVL, we need to update the error msg properly in case the installation of the target userspace container fails. In this case, we want to change only the part when the upgrade fails due to missing space.
1246660
to
f991107
Compare
@MichalHe @Rezney @matejmatuska thanks you guys for the review. let's merge it now and resolve the missing stuff in the following period. |
Merging regarding the time pressure and bunch of manual tests that have been done. Unless we speak about corner cases and additional possible optimisations, it should be most working (shame on me I haven't delivered this earlier for the proper testing prior the merge) |
See related leapp-repository PRs: * oamg/leapp-repository#1093 * oamg/leapp-repository#1097 * oamg/leapp-repository#1107
## Packaging - Requires leapp-framework 5.0 ## Upgrade handling ### Fixes - Add el8toel9 actor to handle directory -> symlink with ruby IRB. (oamg#1076) - Do not try to update GRUB core on IBM Z systems (oamg#1117) - Fix failing upgrades with devtmpfs file systems specified in FSTAB (oamg#1090) - Fix the calculation of the required free space on each partitions/volume for the upgrade transactions (oamg#1097) - Fix the generation of the report about hybrid images (oamg#1064) - Handle correctly the installed certificates to allow upgrades with custom repositories using HTTPs with enabled SSL verification (oamg#1106) - Minor improvements and fixes of various reports (oamg#1066, oamg#1067, oamg#1085) - Update error messages about leapp data files to inform user how to obtain valid data files (oamg#1121) - Update links in various reports (oamg#1062, oamg#1086) - Update the repomap data to cover changed repoids in RHUI Azure (oamg#1087) - [IPU 7 -> 8] Fix false positive report about invalid symlinks on RHEL 7 (oamg#1052) - [IPU 8 -> 9] Inhibit the upgrade when unsupported x86-64 microarchitecture is detected (oamg#1059) ### Enhancements - Include updated leapp data files in the RPM (oamg#1046, oamg#1092, oamg#1119) - Update the set of supported upgrade paths (oamg#1077): - RHEL with SAP HANA 7.9 -> 8.6, 8.8 (default: 8.6) - RHEL with SAP HANA 8.8 -> 9.2 - Introduce new upgrade paths: - RHEL 7.9 -> 8.9 (default) - RHEL 8.9 -> 9.3 - Correctly update grub2 when /boot resides on multiple devices aggregated in RAID (oamg#1093, oamg#1115) - Enable upgrades for machines using RHUI on AlibabaCloud (oamg#1088) - Introduce possibility to add kernel drivers to initramfs (oamg#1081) - Redesign handling of information about kernel (booted and target) in preparation for new changes in RHEL 9 (oamg#1107) - Redesign source system overlay to use disk images backed by sparse files to optimize disk space consumption (oamg#1097, oamg#1103) - Requires leapp-framework 5.0 (oamg#1061, oamg#1116) - Use new leapp CLI API which provides better report summary output (oamg#1061, oamg#1116) - [IPU 8 -> 9] Detect and report use of deprecated Xorg drivers (oamg#1078) - [IPU 8 -> 9] Introduce IPU for systems with FIPS enabled (oamg#1053) ## Additional changes interesting for devels - Deprecated `GrubInfo.orig_device_name` field in the `GrubInfo` model (replaced by `GrubInfo.orig_devices`) (oamg#1093) - Deprecated `InstalledTargetKernelVersion` model (replaced by `InstalledTargetKernelInfo`) (oamg#1107) - Deprecated `leapp.libraries.common.config.version.is_rhel_realtime` (check the type in msg `KernelInfo`, field `type`) (oamg#1107) - Deprecated `leapp.libraries.common.grub.get_grub_device()` (replaced by `leapp.libraries.common.grub.get_grub_devices()`) (oamg#1093) - Introduced new devel envar LEAPP_DEVEL_KEEP_DISK_IMGS=1 to skip the removal of the created disk images for OVL. That's sometimes handy for the debugging. (oamg#1097)
See related leapp-repository PRs: * oamg/leapp-repository#1093 * oamg/leapp-repository#1097 * oamg/leapp-repository#1107
See related leapp-repository PRs: * oamg/leapp-repository#1093 * oamg/leapp-repository#1097 * oamg/leapp-repository#1107
## Packaging - Requires leapp-framework 5.0 ## Upgrade handling ### Fixes - Add el8toel9 actor to handle directory -> symlink with ruby IRB. (#1076) - Do not try to update GRUB core on IBM Z systems (#1117) - Fix failing upgrades with devtmpfs file systems specified in FSTAB (#1090) - Fix the calculation of the required free space on each partitions/volume for the upgrade transactions (#1097) - Fix the generation of the report about hybrid images (#1064) - Handle correctly the installed certificates to allow upgrades with custom repositories using HTTPs with enabled SSL verification (#1106) - Minor improvements and fixes of various reports (#1066, #1067, #1085) - Update error messages about leapp data files to inform user how to obtain valid data files (#1121) - Update links in various reports (#1062, #1086) - Update the repomap data to cover changed repoids in RHUI Azure (#1087) - [IPU 7 -> 8] Fix false positive report about invalid symlinks on RHEL 7 (#1052) - [IPU 8 -> 9] Inhibit the upgrade when unsupported x86-64 microarchitecture is detected (#1059) ### Enhancements - Include updated leapp data files in the RPM (#1046, #1092, #1119) - Update the set of supported upgrade paths (#1077): - RHEL with SAP HANA 7.9 -> 8.6, 8.8 (default: 8.6) - RHEL with SAP HANA 8.8 -> 9.2 - Introduce new upgrade paths: - RHEL 7.9 -> 8.9 (default) - RHEL 8.9 -> 9.3 - Correctly update grub2 when /boot resides on multiple devices aggregated in RAID (#1093, #1115) - Enable upgrades for machines using RHUI on AlibabaCloud (#1088) - Introduce possibility to add kernel drivers to initramfs (#1081) - Redesign handling of information about kernel (booted and target) in preparation for new changes in RHEL 9 (#1107) - Redesign source system overlay to use disk images backed by sparse files to optimize disk space consumption (#1097, #1103) - Requires leapp-framework 5.0 (#1061, #1116) - Use new leapp CLI API which provides better report summary output (#1061, #1116) - [IPU 8 -> 9] Detect and report use of deprecated Xorg drivers (#1078) - [IPU 8 -> 9] Introduce IPU for systems with FIPS enabled (#1053) ## Additional changes interesting for devels - Deprecated `GrubInfo.orig_device_name` field in the `GrubInfo` model (replaced by `GrubInfo.orig_devices`) (#1093) - Deprecated `InstalledTargetKernelVersion` model (replaced by `InstalledTargetKernelInfo`) (#1107) - Deprecated `leapp.libraries.common.config.version.is_rhel_realtime` (check the type in msg `KernelInfo`, field `type`) (#1107) - Deprecated `leapp.libraries.common.grub.get_grub_device()` (replaced by `leapp.libraries.common.grub.get_grub_devices()`) (#1093) - Introduced new devel envar LEAPP_DEVEL_KEEP_DISK_IMGS=1 to skip the removal of the created disk images for OVL. That's sometimes handy for the debugging. (#1097)
The in-place upgrade itself requires to do some changes on the system to be able to perform the in-place upgrade itself - or even to be able to evaluate if the system is possible to upgrade. However, we do not want to (and must not) change the original system until we pass beyond the point of not return.
For that purposes we have to create a layer above the real host file system, where we can safely perform all operations without affecting the system setup, rpm database, etc. Currently overlay (OVL) technology showed it is capable to handle our requirements good enough - with some limitations.
However, the original design we used to compose overlay layer above the host system had number of problems:
The new solution prepares a disk image (represented by sparse-file) and an overlay image for each mountpoint configured in /etc/fstab, excluding those with FS types noted in the
OVERLAY_DO_NOT_MOUNT
set. Such prepared OVL images are then composed together to reflect the real host filesystem. In the end everything is cleaned.The composition could look like this:
The new solution can be now problematic for system with too many partitions and loop devices, as each disk image is loop mounted (that's same as before, but number of disk images will be bigger in total number).
For such systems we keep for now the possibility of the fallback to an old solution, which has number of issues mentioned above, but it's a trade of. To fallback to the old solution, set envar:
Disk images created for OVL are formatted with XFS by default. In case of problems, it's possible to switch to Ext4 FS using:
XFS is better optimized for our use cases (faster initialisation consuming less space). However we have reported several issues related to overlay images, that happened so far only on XFS filesystems. We are not sure about root causes, but having the possibility to switch to Ext4 seems to be wise. In case of issues, we can simple ask users to try the switch and see if the problem is fixed or still present.
Related (covered) BZs:
Main JIRA ticket: OAMG-8615
Still missing
Various outputs visible
Note: texts could be changed in the final version. I am posting outputs continuously as it will be easier to understand all changes and we will have at least some output giving a hint it actually works (some of these scenarious could be hard to hit without without changes of magical constants, just showind these cases too for better visibility).