Move images to GitHub releases #105

Merged

nathanchance
Member

This pull request adds support for moving the rootfs images from the repository to GitHub releases. This will make updating the images easier because they will not increase the size of the repository when cloned, so we will not have to be concerned with that anymore.

See the individual patches for the full logic and reasoning; I tried to make the commit messages as descriptive as possible.

The only thing I am concerned about is running into rate limits in continuous integration. The rate limit documentation notes that the limit for requests using the built-in GITHUB_TOKEN from GitHub Actions is 1,000 requests per hour per repository. That might be too low for our workflows, which would have to query the API to download the rootfs for each job that boots an image. We could work around this by creating a GitHub account just for continuous-integration2 and adding a personal access token from that account to the repo secrets for authentication; personal access tokens have a rate limit of 5,000 requests per hour per authenticated user.
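
As a rough sketch of the personal access token idea, the snippet below builds an authenticated request for GitHub's REST API using only the standard library. The helper name `build_api_request` and reading the token from an environment variable named `GITHUB_TOKEN` are assumptions made for illustration; they are not taken from the patches in this pull request.

```python
import os
import urllib.request

API_ROOT = 'https://api.github.com'


def build_api_request(endpoint, token=None):
    """Return a urllib Request for a GitHub API endpoint (hypothetical helper)."""
    headers = {'Accept': 'application/vnd.github+json'}
    if token:
        # Authenticated requests are counted against the token's own rate
        # limit instead of the lower unauthenticated limit.
        headers['Authorization'] = f"Bearer {token}"
    return urllib.request.Request(f"{API_ROOT}/{endpoint}", headers=headers)


if __name__ == '__main__':
    # Only builds the request; pass it to urllib.request.urlopen() to send it.
    request = build_api_request('rate_limit', os.environ.get('GITHUB_TOKEN'))
    print(request.full_url, sorted(request.headers))
```

In a workflow, the token would come from the repo secrets rather than being hard-coded anywhere.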

This is the last shell script in this repository and further
improvements to it are needed. Rewrite it in Python to make future
modifications easier, as the majority of ClangBuiltLinux repositories
use Python as the main scripting language now. Remove the shell linting,
as this repository no longer uses it.

As part of the rewrite, this prepares for moving to GitHub releases for
rootfs uploads by writing the final images to 'buildroot/out' rather
than directly to 'images'.

Signed-off-by: Nathan Chancellor <[email protected]>
…lease

Rather than storing the rootfs files in the repository directly, we can
upload them to a GitHub release, which does not impact the repository
size when cloning.

Signed-off-by: Nathan Chancellor <[email protected]>
This should have been checking the existence of the file. This makes a
future diff make a little more sense.

Signed-off-by: Nathan Chancellor <[email protected]>
Now that rootfs images will be uploaded to GitHub releases, both
boot-qemu.py and boot-uml.py need to download the images before running.

The latest rootfs release can be queried from GitHub's API, which
returns JSON with information about the release tag and assets to allow
easy downloading. Unfortunately, GitHub's API has a rate limit, so
always querying is not possible. To account for this, the script will
first query the rate_limit endpoint to see how many queries are
remaining for the current user. If there are queries remaining, the
script will download the rootfs image if it does not already exist or if
the tag in the '.release' file is different from the latest one. If
there are no queries remaining and the image does not already exist, the
script errors and instructs the user how to make an authenticated API
request or manually download the images themselves from the web.

Signed-off-by: Nathan Chancellor <[email protected]>
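
To make the two API responses described above more concrete, here is a minimal sketch of pulling the relevant fields out of them. The function names, the asset name, the tag, and the URL are made up for illustration; only the JSON keys (`resources`, `core`, `remaining`, `tag_name`, `assets`, `name`, `browser_download_url`) follow the shape GitHub's REST API returns. This is not the code from boot-qemu.py or boot-uml.py.

```python
import json


def remaining_queries(rate_limit_json):
    """Number of core API requests left, as reported by the rate_limit endpoint."""
    return rate_limit_json['resources']['core']['remaining']


def find_asset_url(release_json, asset_name):
    """Return (release tag, download URL) for the named asset of a release."""
    for asset in release_json['assets']:
        if asset['name'] == asset_name:
            return release_json['tag_name'], asset['browser_download_url']
    raise FileNotFoundError(
        f"{asset_name} is not part of release {release_json['tag_name']}")


# Trimmed, invented payloads that only mimic the structure of real responses.
sample_limit = json.loads(
    '{"resources": {"core": {"limit": 60, "remaining": 59}}}')
sample_release = {
    'tag_name': '20230101-000000',  # hypothetical tag
    'assets': [{
        'name': 'x86_64-rootfs.cpio.zst',  # hypothetical asset name
        'browser_download_url':
        'https://example.invalid/x86_64-rootfs.cpio.zst',
    }],
}

if __name__ == '__main__':
    print(remaining_queries(sample_limit))
    print(find_asset_url(sample_release, 'x86_64-rootfs.cpio.zst'))
```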
Now that the images can be released and downloaded dynamically through
GitHub releases, we can remove the copy in the repo, so that future
updates do not increase the download size of the repo.

Signed-off-by: Nathan Chancellor <[email protected]>
Member

@nickdesaulniers nickdesaulniers left a comment

Cool! Thanks for the work on this.

So it should be faster to clone boot-utils for CI2, and the individual InitRD's will be fetched only when needed.

Should we hit API limits, we could also always just revert.

latest_rel = gh_json_rel['tag_name']
if cur_rel != latest_rel:
    download_initrd(gh_json_rel, src)
elif not src.exists():
Member

You could remove a level of indentation if you did:

if remaining <= 0 and not src.exists():
    raise ...

gh_json_rel = get_gh_json(...

Member Author

This probably deserves a better comment, but I wrote it this way so that if there are no more queries available under the current rate limit and the rootfs has already been downloaded, the script skips querying the API for the latest release (which would fail). I could add a warning for that situation ("Hit GitHub rate limit for querying latest release, your downloaded image may be out of date"?), which might make things a little clearer as well.
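
The behavior discussed in this thread could be sketched roughly as follows, including the proposed warning. The function name `should_download` and its parameters are hypothetical; `get_latest_tag` stands in for whatever performs the release query, and it is deliberately not called when no queries remain.

```python
import warnings


def should_download(remaining, image_exists, current_tag, get_latest_tag):
    """Decide whether the rootfs image needs to be (re)downloaded."""
    if remaining <= 0:
        if not image_exists:
            # Nothing on disk and no way to ask the API: give up loudly.
            raise RuntimeError(
                'GitHub rate limit reached and no rootfs image is present; '
                'authenticate the request or download the image manually')
        # An image exists, so skip the query (it would fail) and reuse it.
        warnings.warn('Hit GitHub rate limit for querying latest release, '
                      'your downloaded image may be out of date')
        return False
    # Queries remain: download if the image is missing or the tag changed.
    return not image_exists or current_tag != get_latest_tag()


if __name__ == '__main__':
    print(should_download(10, True, 'old-tag', lambda: 'new-tag'))
```

Keeping the rate limit check first is what lets an already downloaded image keep working offline or after the limit is exhausted.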

This can help with understanding the parameters and updating them if
they ever need to change.

Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
@nathanchance
Member Author

> So it should be faster to clone boot-utils for CI2, and the individual InitRD's will be fetched only when needed.

Right, we can further speed up CI2 by just downloading the Python files from GitHub rather than cloning the whole repo, as the scripts can be run in a standalone directory now.

> Should we hit API limits, we could also always just revert.

Yes, I should be able to back out of this relatively easily should issues arise.

@nathanchance
Member Author

I am going to work on the continuous-integration2 changes needed to make this change work before I merge it.

Additionally, I thought a little more about generating a file from sha256sum *.zst after the build and checking that in, but I think that ends up complicating things a bit, as we would have to generate the file, commit it, and push it to main before we could use gh to tag a release. I do not think there is anything fundamentally wrong with this approach, but it is more complicated than the way I have it written now for potentially little gain. I will leave things the way they are for now and update it if the need arises.
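
For reference, the checksum file considered here (and not adopted) could be generated with something like the sketch below, which mimics the output format of sha256sum for every .zst file in a directory. The function name `sha256_manifest` is hypothetical, and nothing in this pull request actually produces or consumes such a file.

```python
import hashlib
from pathlib import Path


def sha256_manifest(directory):
    """Return sha256sum-style lines for every .zst file in a directory."""
    lines = []
    for image in sorted(Path(directory).glob('*.zst')):
        # Reads the whole file into memory, which is fine for small images.
        digest = hashlib.sha256(image.read_bytes()).hexdigest()
        # sha256sum separates the digest and the file name with two spaces.
        lines.append(f"{digest}  {image.name}")
    return '\n'.join(lines) + '\n' if lines else ''


if __name__ == '__main__':
    print(sha256_manifest('.'), end='')
```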

In moving the rootfs images to GitHub releases, we risk hitting GitHub's
API rate limit with GITHUB_TOKEN, which is 1,000 requests per hour per
repository, because each boot test within a workflow will be a separate
call. It is totally possible for us to run 1,000 boots an hour during a
busy workflow period, so this needs special consideration.

To make it easier for CI to cache the results of a GitHub release API
query, add '--gh-json-file' to both boot-qemu.py and boot-uml.py to
allow the tuxsuite parent job to generate boot-utils.json and pass that
along to each child job, so that at worst, each workflow will query the
API three times (once each for defconfigs, allconfigs, and distro configs).

Signed-off-by: Nathan Chancellor <[email protected]>
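
A minimal sketch of how a '--gh-json-file' option might be wired up is shown below. Only the flag name comes from the commit message above; the function names and the `query_api` callable are assumptions for illustration, not the actual implementation in boot-qemu.py or boot-uml.py.

```python
import json
from argparse import ArgumentParser
from pathlib import Path


def parse_arguments(argv=None):
    """Parse the command line (only the caching flag is modeled here)."""
    parser = ArgumentParser()
    parser.add_argument('--gh-json-file',
                        type=Path,
                        help='Use a saved GitHub release API response '
                        'instead of querying the API')
    return parser.parse_args(argv)


def get_release_json(args, query_api):
    """Prefer the cached JSON file; fall back to a live API query."""
    if args.gh_json_file:
        # No request is made, so nothing counts against the rate limit.
        return json.loads(args.gh_json_file.read_text(encoding='utf-8'))
    return query_api()


if __name__ == '__main__':
    cli_args = parse_arguments()
    print(get_release_json(cli_args, lambda: {'tag_name': 'queried-live'}))
```

With this shape, a parent job can save one API response to boot-utils.json and each child job can pass that file via '--gh-json-file' without touching the API.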