Provide a way to cache the installation #50

Closed
ComFreek opened this issue Jul 19, 2020 · 22 comments · Fixed by #54
Labels
enhancement New feature or request

Comments

@ComFreek

I currently have the workflow file below. Now I am still fiddling with it and whenever I add or change a step that is even unrelated to the msys2/setup-msys2 step, in the next run this action will still redownload and reinstall all packages.

name: Build Binaries and Deploy

on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]

jobs:

  windows-build: 
    runs-on: windows-latest

    steps:
    - uses: actions/checkout@v2
      with:
        submodules: recursive
      
    - uses: msys2/setup-msys2@v1
      with:
        msystem: MINGW64
        update: true
        cache: true
        install: "git diffutils mingw-w64-x86_64-clang make mingw-w64-x86_64-cmake mingw-w64-x86_64-boost mingw-w64-x86_64-mesa mingw-w64-x86_64-openexr mingw-w64-x86_64-intel-tbb mingw-w64-x86_64-glm mingw-w64-x86_64-glew mingw-w64-x86_64-dbus patch mingw-w64-x86_64-openvdb"
@lazka
Member

lazka commented Jul 19, 2020

Afair we currently only cache the downloaded packages, which doesn't save much. The reinstall will always happen.

@ComFreek
Author

Oh, are there reasons against caching the installations too?

@lazka
Member

lazka commented Jul 19, 2020

Assuming you install some software or change some files in the installation we'd restore that on the next run, but ideally a cache would be transparent. Ideas welcome. Think like you'd want to cache "/" in an Arch Linux install.

@lazka
Member

lazka commented Jul 19, 2020

I'll look into enabling package caching by default. That will at least avoid the confusion regarding the "cache" option we provide (by removing it..)

@ComFreek
Author

As an end user who just uses MSYS2 to compile C files using CMake, I don't have too much knowledge about what paths there are overall.
I guess running CMake and gcc/clang should not change files in the installation directory at all, right?

Thank you for looking into it! Really appreciated.

@lazka
Member

lazka commented Jul 19, 2020

I guess running CMake and gcc/clang should not change files in the installation directory at all, right?

Unless you install with it, no. It might still change /home and /tmp. Point is we can't cover all cases because the user can do everything. We'll probably have to expose the cache functionality to the user and let the user decide if and when keeping the whole directory around is OK.

lazka changed the title from "Don't eagerly invalidate cache if workflow file is changed" to "Provide a way to cache the installation" on Jul 19, 2020
lazka added the "enhancement" label on Jul 19, 2020
@lazka
Member

lazka commented Jul 19, 2020

I've created #51 for removing the cache option

@lazka
Member

lazka commented Jul 19, 2020

I think a good strategy for a full install cache would be to expose a cache-key and cache-restore-keys option which we prefix and pass through to @actions/cache. This way the user controls when and between which runs things are cached, and we just take care of the "what" and "how".

Open question if we go that way: "pacman -Scc" would be a good idea to clear the package cache, so we don't cache twice. And maybe we should nuke /home on restore? And what to do if an update after the cache restore fails? Maybe then just nuke everything and start new. -> all in all, tricky, but not impossible.
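
The pass-through idea from this comment could look roughly like the sketch below. It is only an illustration in Python, not the action's code: the `PREFIX` value and the `build_cache_keys` helper are hypothetical, and `cache-key` / `cache-restore-keys` are the option names proposed in this comment, not options known to exist.

```python
from typing import List, Tuple

# Hypothetical namespace so that user-chosen keys cannot collide with other caches.
PREFIX = "msys2-install-"


def build_cache_keys(cache_key: str, restore_keys: List[str]) -> Tuple[str, List[str]]:
    """Prefix the user-supplied key and restore keys before passing them to the cache."""
    if not cache_key:
        raise ValueError("cache-key must not be empty")
    primary = PREFIX + cache_key
    # Preserve the order and skip empty entries.
    fallbacks = [PREFIX + key for key in restore_keys if key]
    return primary, fallbacks


primary, fallbacks = build_cache_keys("mingw64-v1", ["mingw64-"])
print(primary)    # msys2-install-mingw64-v1
print(fallbacks)  # ['msys2-install-mingw64-']
```

With this split, the user decides when a cache is shared (by choosing the keys), while the prefix keeps the action in charge of what is stored under them.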

@ComFreek
Author

ComFreek commented Jul 19, 2020

Sounds good, I guess in my use case I would just pick a constant key like cache-key: deadbeef, no?

@lazka
Member

lazka commented Jul 19, 2020

Yeah, probably. Or include the architecture if you want a separate cache for 32bit and 64bit etc.

@eine any thoughts on my suggestion #50 (comment) ?

@eine
Collaborator

eine commented Jul 19, 2020

I'm not sure... My initial try was to use a single key and retrieve/update it every time. That didn't work. Then, I thought about letting users specify a key. But that was not intuitive. The point, as commented in #23, is that it does not seem possible to share a cache between jobs or workflows. Nor is it possible to update a key. Hence, I don't know the namespace in which the keys are used. That is, I have no idea whether the cache that is restored makes any sense.

@lazka
Member

lazka commented Jul 20, 2020

I got confused again with my comments above.. we store the cache before anything uses it, and not after, so we have perfect control of what we store anyway, so please forget everything I said :/

Re sharing: Based on my experiments in #51, sharing caches between different jobs works fine. Here is one run that gets a cache from a different config: https://github.com/msys2/setup-msys2/runs/888874485?check_suite_focus=true. According to https://docs.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache, caches can also be read from PRs, but updates will be created in the PR scope.

Re key: I think the current approach is similar to what we'd need: have one key which is used as a restore key and one that is unique. The restore key should be a combination of all settings, the action version, and the installer URL, so you can be sure you just get an older version of a "similar" run. For the unique key we can hash /var/lib/pacman, which contains info about what is currently installed and the package database. If the unique key changes, we create a new cache.
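
A minimal sketch of this two-key scheme, written in Python for illustration only. The function names, the `msys2-` prefix, and the 16-character truncation are assumptions made for the example; the action may compute its keys differently.

```python
import hashlib
import json
from pathlib import Path


def restore_key(inputs: dict, action_version: str, installer_url: str) -> str:
    """Key shared by all runs with the same settings, action version and installer."""
    payload = json.dumps(
        {"inputs": inputs, "version": action_version, "installer": installer_url},
        sort_keys=True,  # stable ordering, so equal settings give equal keys
    )
    return "msys2-" + hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def hash_directory(root: Path) -> str:
    """Hash the relative paths and contents of all files below root, in sorted order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()[:16]


def unique_key(restore: str, pacman_db: Path) -> str:
    """Restore key plus a hash of the package database state."""
    return f"{restore}-{hash_directory(pacman_db)}"
```

Because the unique key starts with the restore key, a run whose package set has changed can still fall back to the most similar older cache, and a new cache entry is only needed when the hash of the package database differs.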

@eine
Collaborator

eine commented Jul 21, 2020

@lazka, as far as I understand the proposal is to cache the installed dependencies. Does that mean caching / or some specific dir? If / is cached, it wouldn't make much sense to download the installer in future runs. Hence, maybe the version of the installer should be part of the cache restore key. What do you think?

@lazka
Member

lazka commented Jul 21, 2020

Hence, maybe the version of the installer should be part of the cache restore key. What do you think?

Yeah, that's what I had in mind by adding "the installer URL".

@eine
Collaborator

eine commented Jul 21, 2020

I like it, especially because retrieving the full installation from the cache will significantly reduce the traffic on msys2's servers. Users will retrieve the installer only once. Do you want to prototype it in a PR, and I will release a "temporary" branch for users to try before merging it to master?

@lazka
Member

lazka commented Jul 21, 2020

I'll open a PR once I have something, not sure re branches, let's see how it works out.

lazka added a commit to lazka/setup-msys2 that referenced this issue Jul 23, 2020
In case release=true we try to cache the whole installation.

As cache restore key we use the whole input + the installer checksum, so
we only get a similar setup back.

For saving we hash the content of /var/lib/pacman/local which contains all the info
about which packages are installed.

Fixes msys2#50
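
How the two keys from the commit message interact on restore can be sketched as follows. This is a simplified Python model, not the action's code: `CacheEntry`, `lookup`, and `needs_save` are hypothetical names, and the model assumes that a prefix fallback picks the most recently created matching entry, as the actions/cache documentation describes for `restore-keys`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CacheEntry:
    key: str
    created: int  # creation order; larger means newer


def lookup(entries: List[CacheEntry], save_key: str, restore_key: str) -> Optional[CacheEntry]:
    """Return an exact hit on save_key, else the newest entry whose key starts with restore_key."""
    for entry in entries:
        if entry.key == save_key:
            return entry
    candidates = [e for e in entries if e.key.startswith(restore_key)]
    return max(candidates, key=lambda e: e.created, default=None)


def needs_save(hit: Optional[CacheEntry], save_key: str) -> bool:
    """A new cache is stored only when there was no hit or the installed-package hash changed."""
    return hit is None or hit.key != save_key


entries = [CacheEntry("cfg1-aaa", 1), CacheEntry("cfg1-bbb", 2), CacheEntry("cfg2-ccc", 3)]
hit = lookup(entries, "cfg1-zzz", "cfg1-")
print(hit.key)                      # cfg1-bbb
print(needs_save(hit, "cfg1-zzz"))  # True
```

Here `cfg1-` stands in for the restore key (inputs plus installer checksum) and the suffix for the hash of /var/lib/pacman/local, so a run with a changed package set restores the newest similar installation and then stores a fresh entry.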
lazka added a commit to lazka/setup-msys2 that referenced this issue Jul 26, 2020
In case release=true we try to cache the whole installation.

As cache restore key we use the whole input + the installer checksum, so
we only get a similar setup back.

For saving we hash the content of /var/lib/pacman/local which contains all the info
about which packages are installed.

Fixes msys2#50
@eine eine closed this as completed in #54 Jul 26, 2020
eine pushed a commit that referenced this issue Jul 26, 2020
In case release=true we try to cache the whole installation.

As cache restore key we use the whole input + the installer checksum, so
we only get a similar setup back.

For saving we hash the content of /var/lib/pacman/local which contains all the info
about which packages are installed.

Fixes #50
@lazka
Member

lazka commented Jul 27, 2020

Related to this there was #56 and #55 to handle corner cases that are more likely with longer living installations. I think this is good to go now.

@tavrez

tavrez commented Sep 27, 2020

Is there any sample on this?

@eine
Collaborator

eine commented Sep 27, 2020

@tavrez, caching of packages is done by default. What do you mean by "a sample"?

@tavrez

tavrez commented Sep 27, 2020

I thought we could use actions/cache with proper keys to cache the whole setup process, and I wanted to see a sample workflow with setup-msys2 and actions/cache, including the keys and cached folders.

@eine
Collaborator

eine commented Sep 28, 2020

Since June (see https://github.com/msys2/setup-msys2/blob/master/CHANGELOG.md#110---20200605), packages are cached by default using actions/cache (see #51, #54). For some time, both the packages and the full installation were cached. However, there were some issues with lost files (#61), and we decided to disable full-installation caching (temporarily, at least) (see #63). As a result, this action already caches as much of the setup as possible; there is no need for users to handle keys and folders explicitly. See, for example: https://github.com/ghdl/ghdl/runs/1170192304?check_suite_focus=true#step:2:104. You will see similar output anywhere you use this action as explained in Usage. Is there some specific use case that we are not covering yet?

@tavrez

tavrez commented Sep 28, 2020

Thanks for the clarification.

Is there some specific use case that we are not covering yet?

I'll report back if I find anything.

4 participants