Skip to content

(maint) Fix multiple architecture Docker builds #1286

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 31, 2022

Conversation

Iristyle
Copy link
Contributor

@Iristyle Iristyle commented Mar 29, 2022

  • Using different tags for different architectures is wrong. Tags
    in Docker support multiple architectures.

    Using one buildx command for multiple architectures is entirely
    supported by docker.

    Note - this also fixes build output in TravisCI by ensuring
    that ARM builds use --progress plain.

  • By default, so as to not break developer workflows, the build target
    will only produce a container for linux/amd64 architecture and will
    output to the docker daemon (aka --load). When calling the
    push-image target, multiple architectures will be built and the
    output will be pushed to the docker registry (i.e. docker hub)
    instead.

    buildx doesn't support loading multiple architectures into the
    local daemon as described in:

    Building images for multi-arch with --load parameter fails docker/buildx#59

  • This also simplifies how tagging is performed when IS_LATEST is
    set

@Iristyle Iristyle requested a review from a team as a code owner March 29, 2022 17:44
@Iristyle Iristyle force-pushed the cleanup-multi-arch-builds branch from 99c5481 to ef5afec Compare March 29, 2022 17:44
@Iristyle Iristyle changed the title Fix multiple architecture Docker builds (maint) Fix multiple architecture Docker builds Mar 29, 2022
docker/Makefile Outdated
@@ -53,26 +53,16 @@ build: prep
docker buildx build \
${DOCKER_BUILD_FLAGS} \
--load \
Copy link
Contributor Author

@Iristyle Iristyle Mar 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite right, because --load cannot be used in conjunction with --platform and multiple architectures as described in docker/buildx#59. The local daemon doesn't support images for multiple architectures in this way.

I think we first have to push images to a local registry for all architectures, then load the appropriate one to be used with tests.

I'll tweak this a bit!

@Iristyle Iristyle force-pushed the cleanup-multi-arch-builds branch 2 times, most recently from 1f5ab70 to 32f5347 Compare March 30, 2022 01:24
@Iristyle
Copy link
Contributor Author

Ok @mwaggett I think I have this dialed in correctly now!

In the end I'm not sure how useful an ARM r10k image is given we can't produce an ARM puppetserver or puppetdb ... but at least with these fixes we'll correctly push the ARM image under the same tag as the other images. Adding -arm to the end is antithetical to how Docker Hub tagging should work

/cc @justinstoller

@Iristyle
Copy link
Contributor Author

Also it looks like we should prefer not to use QEMU should we ever decide to build ARM images for reals...

https://www.docker.com/blog/faster-multi-platform-builds-dockerfile-cross-compilation-guide/

# this will use cached output from build target if it exists
push-image: platform=linux/amd64,linux/arm64
push-image: output_type=registry
push-image: prep build
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unclear on how pushing happens now..

Copy link
Contributor Author

@Iristyle Iristyle Mar 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, so the docker buildx build cli is a bit obtuse. In the OG v1 Docker build process, you would:

Our process to this point has been pretty similar, but we rely on use of docker buildx cli (confusingly the old docker build cli has a different set of options but can also run on top of the buildx engine)

  • docker buildx build - use a buildx builder to place the image build in local daemon
  • docker push - copy image from local daemon to docker hub

However, docker buildx build is really designed to be able to write the output of the build to even more output locations:

  • local (file system)
  • tar (tarball)
  • oci (tarball)
  • docker (docker tarball, automatically loaded into the daemon)
  • image
  • registry

There are shorthand switches that you can use like --load (equivalent to --output=type=docker) and --push (equivalent to --output=type=registry). So the new process is:

  • docker buildx build --output=type=docker to produce only the amd64 image for use in tests. It gets built by the builder and loaded into the local daemon.
  • docker buildx build --output=type=registry to produce a multi-platform image and push it to docker hub. The previous amd64 layers are cached / reused (so that part should effectively be a no-op) in our pipelines, but the arm64 bits will be generated fresh.

This seems like the best blend of usability / sensible builds. I can add a few more comments if it makes it clearer.

One alternative I thought about was to always build both images, writing them to local disk, then loading in just the amd64 for tests. Then build is always used for building and push-image is always used for pushing cached artifacts... the only problem is that most environments will not have the tools installed to support multi-platform image building (i.e. it won't be easy for someone who pulls the repo to just run make test), so I opted for the approach that I did.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That all makes sense to me, thanks so much for explaining!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment noting that --output=type=registry is equivalent to --push, like you've done for --load, would probably be helpful!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're good to go now

 - Using different tags for different architectures is wrong. Tags
   in Docker support multiple architectures.

   Using one buildx command for multiple architectures is entirely
   supported by docker.

   Note - this also fixes build output in TravisCI by ensuring
   that ARM builds use --progress plain.

 - By default, so as to not break developer workflows, the build target
   will only produce a container for linux/amd64 architecture and will
   output to the docker daemon (aka --load). When calling the
   push-image target, multiple architectures will be built and the
   output will be pushed to the docker registry (i.e. docker hub)
   instead, (aka --push)

   buildx doesn't support loading multiple architectures into the
   local daemon as described in:

   docker/buildx#59

 - This also simplifies how tagging is performed when IS_LATEST is
   set
@Iristyle Iristyle force-pushed the cleanup-multi-arch-builds branch from 32f5347 to 038aace Compare March 31, 2022 16:31
@mwaggett mwaggett merged commit 14bc197 into main Mar 31, 2022
@Iristyle Iristyle deleted the cleanup-multi-arch-builds branch March 31, 2022 17:26
@Iristyle
Copy link
Contributor Author

Iristyle commented Mar 31, 2022

Log from https://github.com/puppetlabs/r10k/runs/5775165567?check_suite_focus=true#step:10:294 would seem to indicate that it was pushed

#28 exporting to image
#28 pushing layers 4.9s done
#28 pushing manifest for docker.io/puppet/r10k:edge@sha256:a6739d21beda7ddfdfb155df3dd6a0e51106d255184c7971d608014b61fc845c
#28 pushing manifest for docker.io/puppet/r10k:edge@sha256:a6739d21beda7ddfdfb155df3dd6a0e51106d255184c7971d608014b61fc845c 1.1s done
#28 DONE 9.2s

I don't see any errors, but I also don't see any new builds for the edge tag at https://hub.docker.com/r/puppet/r10k/tags either.

@Iristyle
Copy link
Contributor Author

Iristyle commented Mar 31, 2022

Last previous successful push from an edge build was in https://github.com/puppetlabs/r10k/runs/5759215773?check_suite_focus=true#step:10:45

@Iristyle
Copy link
Contributor Author

Iristyle commented Apr 4, 2022

I wonder if this is somehow related to the content that was in the layers?

Given the comment / example at docker/buildx#799 (comment), I think that this should be working just fine.

Triggered a fresh run at https://github.com/puppetlabs/r10k/runs/5822393254?check_suite_focus=true just to see what happens... but we may have to wait for another r10k PR to land to really check things out again @mwaggett

@Iristyle
Copy link
Contributor Author

Iristyle commented Apr 4, 2022

Ok so it worked the way it was supposed to that time... logs didn't look any different.

#28 exporting manifest sha256:ba71b3e506ea5b857d004e44ec52da737cd5e84aa5e7b140dc923f120fde8aac done
#28 exporting config sha256:1a1c18c3e6ac4c9282c566e949f4ba48f0b01aa758a5f959000671a47e0fe24f done
#28 exporting manifest sha256:ff991d67e0da22ab7da8315c1dd88aed29066d2045bdaa7d3823d7fda81e4 done
#28 exporting config sha256:c3c121fb14a62055caf2ca20c99751e2a629a33a8063742e30aaead155904457 done
#28 exporting manifest list sha256:f0497e4b12d8398c2b44125e580fd0ebe45587869666c01ab37661428c27afa1 done

...


#28 pushing manifest for docker.io/puppet/r10k:edge@sha256:f0497e4b12d8398c2b44125e580fd0ebe45587869666c01ab37661428c27afa1

Maybe @underscorgan deleted the images by accident during cleanup or something similar.

Looks like we're good now!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants