airbyte-ci: improve gradle in airbyte-ci performance #31439

Closed
alafanechere opened this issue Oct 16, 2023 · 3 comments
@alafanechere
Contributor

alafanechere commented Oct 16, 2023

We do not yet have an ironclad Gradle caching setup in airbyte-ci: it looks like Gradle is re-installed on every execution. We should also confirm that the remote Gradle cache with dagger is working as expected, and evaluate whether using the Gradle S3 cache in airbyte-ci is worthwhile.

Definition of done:

  • The Gradle installation is cached
  • Java dependencies are cached
  • We know which build cache strategy works best for Gradle:
    • Remote gradle cache with S3
    • Rsynced dagger cache volumes
    • A combination of both?
@postamar
Contributor

I'm not quite sure how Gradle's cache works, but I understand that it tries to cache everything quite aggressively. We probably don't want airbyte-ci to cache any of the build steps; caching the dependency downloads, however, would be very useful! If we can spare ourselves all these HTTP GETs then we should. To be able to do that, we need to have Gradle perform dependency verification: https://docs.gradle.org/current/userguide/dependency_verification.html

What this does is download all of the POMs and JARs, compute their checksums, and compare them to a manifest checked into git. Basically, we'll want to cache all of this and invalidate the cache when the manifest changes.
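
To make the invalidation idea concrete, here is a minimal sketch of deriving a cache key from the manifest's checksum, so that any change to the manifest yields a new key. The function names and the `gradle-deps` prefix are made up for illustration and are not part of airbyte-ci; the only assumption about Gradle is that the manifest lives in a file such as its default `gradle/verification-metadata.xml`.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dependency_cache_key(manifest: Path, prefix: str = "gradle-deps") -> str:
    """Hypothetical helper: build a cache key that changes with the manifest."""
    return f"{prefix}-{sha256_of(manifest)[:16]}"


def artifact_matches(artifact: Path, expected_sha256: str) -> bool:
    """Hypothetical helper: compare a downloaded file to its expected checksum."""
    return sha256_of(artifact) == expected_sha256.lower()
```

Something like `dependency_cache_key(Path("gradle/verification-metadata.xml"))` could then name the cache volume holding the downloaded dependencies, so an edited manifest naturally starts from a fresh cache.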

@postamar
Contributor

I was puzzled by how the dagger cache volumes worked in airbyte-ci. They don't seem to persist anything between pipeline runs when running on my MacBook! Every time I invoke airbyte-ci, it starts by downloading Gradle. The initial rsync doesn't transfer any files.

As it turns out, this is because we mount the cache volume before mounting the sources from the git repo! Invert the order and the cache starts working as expected.
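
Here is a toy model of why the order might matter. It does not use the dagger SDK. It assumes (my guess at the mechanism, not something I verified in dagger) that a later mount hides any earlier mount located at or below its own path. The paths and names are hypothetical.

```python
from pathlib import PurePosixPath


def visible_mounts(mounts):
    """Return the mounts still visible after applying them in order.

    Toy rule (an assumption, not documented dagger behavior): a new mount
    hides every earlier mount located at or below its own path.
    """
    visible = []
    for path, source in mounts:
        new = PurePosixPath(path)
        visible = [
            (p, s)
            for p, s in visible
            if not (PurePosixPath(p) == new or new in PurePosixPath(p).parents)
        ]
        visible.append((path, source))
    return visible


# Hypothetical paths: cache volume mounted first, then the git sources.
cache_first = visible_mounts(
    [("/airbyte/.gradle", "gradle-cache"), ("/airbyte", "git-sources")]
)
# Inverted order: git sources first, then the cache volume.
sources_first = visible_mounts(
    [("/airbyte", "git-sources"), ("/airbyte/.gradle", "gradle-cache")]
)
```

Under that assumption, `cache_first` ends up with only the sources mount, while `sources_first` keeps both, which matches what I observed after inverting the order.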

@postamar
Contributor

This has been completed.
