Split up library builds into individual builder stages to preserve layer cache #343

Open
wants to merge 1 commit into
base: master

Conversation

lightswitch05

Why

Hello! I've noticed the Docker image is quite hefty, and I was curious if I could improve it a bit. After taking a look, I realized I couldn't help much with reducing the image size 😆 ... but I thought it might be possible to improve layer reuse between builds. In the end, this feature branch is only 7.98 MB smaller than what's on master, but I believe layer reuse is now a possibility, depending on how the builds and caching are set up.

If no one thinks this PR provides any value, that’s no problem! It does introduce a bit more complexity, so I totally understand. Anyway, on to the changes I made:

What

I've moved each major build phase into its own builder stage using multi-stage builds: ssocr, pip, libcec, PicoTTS, and Telldus. The results of those builder stages are then copied out into the 'main' stage. All temporary files were already being pruned nicely, so again, no real space savings. However, using COPY --link to copy from the builder stages enables this cool Docker feature:

Use --link to reuse already built layers in subsequent builds with --cache-from even if the previous layers have changed. This is especially important for multi-stage builds where a COPY --from statement would previously get invalidated if any previous commands in the same stage changed, causing the need to rebuild the intermediate stages again. With --link the layer the previous build generated is reused and merged on top of the new layers. This also means you can easily rebase your images when the base images receive updates, without having to execute the whole build again.

So, if you need to bump a version in requirements.txt, using COPY --link will allow the other layers, like ssocr, to remain unchanged. Pretty cool! If this PR works as expected, I hope that the next time I run docker compose pull, it will require fewer layers to be pulled.
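To make the idea concrete, here is a minimal sketch of the pattern, not the actual Dockerfile from this branch. The stage names, the `python:3.13-alpine` default for `BUILD_FROM`, the `/opt/...` paths, and the placeholder "build" step for ssocr are all assumptions made for illustration.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical sketch: stage names, paths, and the default base image are
# illustrative assumptions, not copied from this PR.
ARG BUILD_FROM=python:3.13-alpine

# Builder stage for one library. The real build steps are replaced with a
# placeholder script so the sketch stays short.
FROM ${BUILD_FROM} AS ssocr-builder
RUN mkdir -p /opt/ssocr/bin \
    && printf '#!/bin/sh\necho "ssocr placeholder"\n' > /opt/ssocr/bin/ssocr \
    && chmod +x /opt/ssocr/bin/ssocr

# Builder stage for Python dependencies. Editing requirements.txt only
# invalidates this stage, not ssocr-builder.
FROM ${BUILD_FROM} AS pip-builder
COPY requirements.txt /tmp/
RUN python -m venv /opt/venv \
    && /opt/venv/bin/pip install -r /tmp/requirements.txt

# Main stage: each COPY --link produces a layer that does not depend on the
# layers before it in this stage, so it can be reused when they change.
FROM ${BUILD_FROM} AS main
COPY --link --from=ssocr-builder /opt/ssocr/bin/ssocr /usr/local/bin/ssocr
COPY --link --from=pip-builder /opt/venv /opt/venv
```

With this layout, bumping a version in requirements.txt should only rebuild the pip-builder stage and its COPY layer, provided the build cache from the previous build is available.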

Now... there is a bit of a gotcha with all this. This caching logic only works if the builds are correctly set up with caching. For example, docker-compose builds cannot create a multi-stage build cache. Looking around, I see buildx is being used over at home-assistant/builder/, but there was a lot of logic going on, and I couldn't quite follow it all.

So, there's a chance some follow-up changes might be needed before the benefits of this PR can be realized, for example using cache-to and ensuring mode=max is set to enable the multi-stage build cache. But one step at a time: if you all think this is an improvement worth making, we can iterate from here.

Testing

For testing, I ran the build and verified that it runs. However, that doesn’t fully confirm that the libraries I modified are still being installed correctly. Some follow-up work is definitely required to verify everything is functioning as expected.


@home-assistant home-assistant bot left a comment

Hi @lightswitch05

It seems you haven't yet signed a CLA. Please do so here.

Once you do that we will be able to review and accept this pull request.

Thanks!

@home-assistant

Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍

Learn more about our pull request process.

@home-assistant home-assistant bot marked this pull request as draft February 24, 2025 05:07
@lightswitch05 lightswitch05 marked this pull request as ready for review February 24, 2025 05:07
WORKDIR /tmp/
COPY patches/libcec-fix-null-return.patch /tmp/
COPY patches/libcec-python313.patch /tmp/
RUN --mount=type=cache,target=/etc/apk/cache,sharing=locked,id=libcec-builder-${BUILD_FROM}-${LIBCEC_VERSION} \
Author

@lightswitch05 lightswitch05 Feb 24, 2025

Hmm... looks like I might have used the wrong cache directory. Based on the docs, it seems like /etc/apk/cache is what things write to, but it's just a symlink to /var/cache/apk. I can just add a cache mount for each directory and call it a day.

I'm also thinking it might be fine to reuse that cache between the builders instead of having a unique cache for each one.
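As a rough sketch of the shared-cache idea, the fragment below uses one cache id across two builder stages. Note that it is a variant of what is described above: instead of one mount per directory, it mounts the cache once at /var/cache/apk and points /etc/apk/cache at it with a symlink. The cache id `apk-cache`, the `alpine:3.21` default for `BUILD_FROM`, the stage names, and the package lists are hypothetical, and whether the symlink step is needed depends on the base image.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical sketch: the cache id, default base image, stage names, and
# package lists are assumptions for illustration only.
ARG BUILD_FROM=alpine:3.21

FROM ${BUILD_FROM} AS libcec-builder
# The same id in every builder means the stages share one apk download cache.
# sharing=locked makes concurrent builds take turns using it.
RUN --mount=type=cache,target=/var/cache/apk,sharing=locked,id=apk-cache \
    ln -sfn /var/cache/apk /etc/apk/cache \
    && apk add --virtual .build-deps build-base cmake

FROM ${BUILD_FROM} AS picotts-builder
# Reuses the cache populated by the stage above (same id, same target).
RUN --mount=type=cache,target=/var/cache/apk,sharing=locked,id=apk-cache \
    ln -sfn /var/cache/apk /etc/apk/cache \
    && apk add --virtual .build-deps build-base autoconf automake libtool
```

One thing to watch: `apk add --no-cache` bypasses the local cache, so the mount only helps when that flag is left off.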
