connector base image: declare the base image package and implement #30303
Changes from 5 commits
5adbcbf
64a7c18
cb9fdf2
ee3d259
1533e75
1557e0e
dbf4a5f
f8dcd3f
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
# airbyte-connectors-base-images | ||
|
||
This Python package contains the base images used by Airbyte connectors. | ||
It is intended to be used as a Python library. | ||
Our connector build pipeline ([`airbyte-ci`](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md#L1)) **will** use this library to build the connector images. | ||
Our base images are declared in code, using the [Dagger Python SDK](https://dagger-io.readthedocs.io/en/sdk-python-v0.6.4/). | ||
|
||
|
||
|
||
## Where are the Dockerfiles? | ||
Our base images are not declared using Dockerfiles. | ||
They are declared in code using the [Dagger Python SDK](https://dagger-io.readthedocs.io/en/sdk-python-v0.6.4/). | ||
We prefer this approach because it lets us treat base image containers as code: we can use Python to declare the base images and use the full power of the language to build and test them. | ||
However, we do artificially generate Dockerfiles for debugging and documentation purposes. | ||
|
||
|
||
|
||
### Example for `airbyte/python-connector-base`: | ||
```dockerfile | ||
FROM docker.io/python:3.9.18-slim-bookworm@sha256:44b7f161ed03f85e96d423b9916cdc8cb0509fb970fd643bdbc9896d49e1cad0 | ||
RUN ln -snf /usr/share/zoneinfo/Etc/UTC /etc/localtime | ||
RUN pip install --upgrade pip==23.2.1 | ||
ENV POETRY_VIRTUALENVS_CREATE=false | ||
ENV POETRY_VIRTUALENVS_IN_PROJECT=false | ||
ENV POETRY_NO_INTERACTION=1 | ||
RUN pip install poetry==1.6.1 | ||
``` | ||
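To make the idea of generating a Dockerfile from a code-declared image concrete, here is a minimal sketch using only the standard library. It does not show how this package actually derives its Dockerfiles; `BuildStep` and `render_dockerfile` are hypothetical names introduced for illustration, and the steps are copied from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildStep:
    """One declarative build step (hypothetical type, for illustration only)."""

    instruction: str  # e.g. "RUN" or "ENV"
    argument: str  # the text that follows the instruction


def render_dockerfile(base_image: str, steps: list[BuildStep]) -> str:
    """Render the base image and its ordered steps as Dockerfile text."""
    lines = [f"FROM {base_image}"]
    lines.extend(f"{step.instruction} {step.argument}" for step in steps)
    return "\n".join(lines)


steps = [
    BuildStep("RUN", "ln -snf /usr/share/zoneinfo/Etc/UTC /etc/localtime"),
    BuildStep("RUN", "pip install --upgrade pip==23.2.1"),
    BuildStep("ENV", "POETRY_VIRTUALENVS_CREATE=false"),
    BuildStep("RUN", "pip install poetry==1.6.1"),
]
dockerfile = render_dockerfile("docker.io/python:3.9.18-slim-bookworm", steps)
print(dockerfile)
```

Because the steps are plain Python objects, they can be filtered, reordered, or asserted on in tests before anything is built.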
|
||
|
||
|
||
## Base images | ||
|
||
|
||
### `airbyte/python-connector-base` | ||
|
||
| Version | Published | Docker Image Address | Changelog | | ||
|---------|-----------|--------------|-----------| | ||
| 1.0.0 | ✅| docker.io/airbyte/python-connector-base:1.0.0@sha256:dd17e347fbda94f7c3abff539be298a65af2d7fc27a307d89297df1081a45c27 | Initial release: based on Python 3.9.18, on slim-bookworm system, with pip==23.2.1 and poetry==1.6.1 | | ||
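The image address in the table combines a registry, a repository, a version tag, and a content digest. The sketch below splits such an address into its parts; `ImageAddress` and `parse_image_address` are hypothetical helpers, not part of this package, and the parser assumes the address always carries a registry, a tag, and a digest.

```python
from typing import NamedTuple


class ImageAddress(NamedTuple):
    """Parts of a fully qualified image address (hypothetical helper type)."""

    registry: str
    repository: str
    tag: str
    digest: str


def parse_image_address(address: str) -> ImageAddress:
    """Split 'registry/repository:tag@digest' into its four parts."""
    name_and_tag, digest = address.split("@", 1)
    name, tag = name_and_tag.rsplit(":", 1)
    registry, repository = name.split("/", 1)
    return ImageAddress(registry, repository, tag, digest)


parsed = parse_image_address(
    "docker.io/airbyte/python-connector-base:1.0.0"
    "@sha256:dd17e347fbda94f7c3abff539be298a65af2d7fc27a307d89297df1081a45c27"
)
print(parsed.repository, parsed.tag)
```

Pinning by digest as well as by tag means the address keeps pointing at the same image content even if the tag is later overwritten.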
|
||
|
||
## How to release a new base image version (example for Python) | ||
|
||
### Requirements | ||
* [Docker](https://docs.docker.com/get-docker/) | ||
* [Poetry](https://python-poetry.org/docs/#installation) | ||
* DockerHub credentials | ||
|
||
### Steps | ||
1. `poetry install` | ||
2. Open `base_images/python/bases.py`. | ||
3. Make changes to the `AirbytePythonConnectorBaseImage` class. You will most likely edit its `get_container` method to change the base image. | ||
4. Implement the `get_container` method, which must return a `dagger.Container` object. | ||
5. **Recommended**: Add new sanity checks to `run_sanity_checks` to confirm that the new version works as expected. | ||
6. Cut a new base image version by running `poetry run generate-release`. You'll need your DockerHub credentials. | ||
|
||
It will: | ||
- Prompt you to pick which base image you'd like to publish. | ||
- Prompt you for a major/minor/patch/pre-release version bump. | ||
- Prompt you for a changelog message. | ||
- Run the sanity checks on the new version. | ||
- Optional: Publish the new version to DockerHub. | ||
- Regenerate the docs and the registry json file. | ||
7. Commit and push your changes. | ||
8. Create a PR and ask for a review from the Connector Operations team. | ||
|
||
**Please note that if you don't publish your image while cutting the new version, you can publish it later with `poetry run publish <repository> <version>`.** | ||
No connector will use the new base image version until its metadata is updated to use it. | ||
If you're not fully confident in the new base image version, please: | ||
- publish it as a pre-release version | ||
- try out the new version on a couple of connectors | ||
- cut a new version with a major/minor/patch bump and publish it | ||
|
||
These steps can happen in different PRs. | ||
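The pre-release flow above can be illustrated with a small stand-in for version bumping. The package itself relies on the `semver` library for this; the `Version` class below is a simplified, hypothetical stand-in (it only understands an `-rc.N` suffix) meant to show the order of the bumps, not to reproduce the library's behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    """Simplified version: MAJOR.MINOR.PATCH with an optional '-rc.N' suffix."""

    major: int
    minor: int
    patch: int
    rc: Optional[int] = None  # None means a final (non pre-release) version

    def __str__(self) -> str:
        core = f"{self.major}.{self.minor}.{self.patch}"
        return core if self.rc is None else f"{core}-rc.{self.rc}"

    def bump_prerelease(self) -> "Version":
        return Version(self.major, self.minor, self.patch, (self.rc or 0) + 1)

    def bump_patch(self) -> "Version":
        return Version(self.major, self.minor, self.patch + 1)

    def bump_minor(self) -> "Version":
        return Version(self.major, self.minor + 1, 0)

    def bump_major(self) -> "Version":
        return Version(self.major + 1, 0, 0)


latest = Version(1, 0, 0)
candidate = latest.bump_prerelease()  # publish this and try it on a few connectors
final_version = candidate.bump_minor()  # cut the final version once it looks good
print(candidate, final_version)
```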
|
||
|
||
## Running tests locally | ||
```bash | ||
poetry run pytest | ||
# Static typing checks | ||
poetry run mypy base_images --check-untyped-defs | ||
``` | ||

Review comment: Nice!

Reply: This is really helpful to catch the kind of errors a compiler would catch in Java land.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# | ||
# Copyright (c) 2023 Airbyte, Inc., all rights reserved. | ||
# | ||
|
||
from rich.console import Console | ||
|
||
console = Console() | ||
Review comment: ❓ why this?

Reply: This is to have a global console object to log with rich; it has nice output by default. I think it's currently only used in commands.py. If it is, I'll move it there.

Reply: Perfect, if it's used in multiple places can we move it to its own file?

    # console.py
    from rich.console import Console
    global_console = Console()

Reply: I tend to like to declare global things in

Reply: Fair! I'll chalk this up to still learning the Python way :)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
# | ||
# Copyright (c) 2023 Airbyte, Inc., all rights reserved. | ||
# | ||
|
||
"""This module declares common (abstract) classes and methods used by all base images.""" | ||
from __future__ import annotations | ||
|
||
from abc import ABC, abstractmethod | ||
from typing import final | ||
|
||
import dagger | ||
import semver | ||
|
||
from .published_image import PublishedImage | ||
|
||
|
||
class AirbyteConnectorBaseImage(ABC): | ||
"""An abstract class that represents an Airbyte base image. | ||
Please do not declare any Dagger with_exec instruction in this class as in the abstract class context we have no guarantee about the underlying system used in the base image. | ||
""" | ||
|
||
@final | ||
def __init__(self, dagger_client: dagger.Client, version: semver.VersionInfo): | ||
"""Initializes the Airbyte base image. | ||
|
||
Args: | ||
dagger_client (dagger.Client): The dagger client used to build the base image. | ||
version (semver.VersionInfo): The version of the base image. | ||
""" | ||
self.dagger_client = dagger_client | ||
self.version = version | ||
|
||
# INSTANCE PROPERTIES: | ||
|
||
@property | ||
def name_with_tag(self) -> str: | ||
"""Returns the full name of the Airbyte base image, with its tag. | ||
|
||
Returns: | ||
str: The full name of the Airbyte base image, with its tag. | ||
""" | ||
return f"{self.repository}:{self.version}" | ||
|
||
# MANDATORY SUBCLASSES ATTRIBUTES / PROPERTIES: | ||
|
||
@property | ||
@abstractmethod | ||
def root_image(self) -> PublishedImage: | ||
"""Returns the base image used to build the Airbyte base image. | ||
|
||
Raises: | ||
NotImplementedError: Raised if a subclass does not define a 'root_image' attribute. | ||
|
||
Returns: | ||
PublishedImage: The base image used to build the Airbyte base image. | ||
""" | ||
raise NotImplementedError("Subclasses must define a 'root_image' attribute.") | ||
|
||
@property | ||
@abstractmethod | ||
def repository(self) -> str: | ||
"""This is the name of the repository where the image will be hosted. | ||
e.g: airbyte/python-connector-base | ||
|
||
Raises: | ||
NotImplementedError: Raised if a subclass does not define a 'repository' attribute. | ||
|
||
Returns: | ||
str: The repository name where the image will be hosted. | ||
""" | ||
raise NotImplementedError("Subclasses must define a 'repository' attribute.") | ||
|
||
# MANDATORY SUBCLASSES METHODS: | ||
|
||
@abstractmethod | ||
def get_container(self, platform: dagger.Platform) -> dagger.Container: | ||
"""Returns the container of the Airbyte connector base image.""" | ||
raise NotImplementedError("Subclasses must define a 'get_container' method.") | ||
|
||
@abstractmethod | ||
async def run_sanity_checks(self, platform: dagger.Platform): | ||
"""Runs sanity checks on the base image container. | ||
This method is called before image publication. | ||
|
||
Args: | ||
platform (dagger.Platform): The platform of the base image container on which the sanity checks should run. | ||
|
||
Raises: | ||
SanityCheckError: Raised if a sanity check fails. | ||
""" | ||
raise NotImplementedError("Subclasses must define a 'run_sanity_checks' method.") | ||
|
||
# INSTANCE METHODS: | ||
@final | ||
def get_base_container(self, platform: dagger.Platform) -> dagger.Container: | ||
"""Returns a container using the base image. This container is used to build the Airbyte base image. | ||
|
||
Returns: | ||
dagger.Container: The container using the base python image. | ||
""" | ||
return self.dagger_client.pipeline(self.name_with_tag).container(platform=platform).from_(self.root_image.address) |
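The abstract-property pattern used by this class can be tried out without Dagger. The sketch below is a simplified stand-in, not the class from this diff: `BaseImage` and `PythonBaseImage` are hypothetical names, and the version is a plain string. It shows that the abstract class cannot be instantiated, and that a subclass can satisfy the abstract `repository` property with a plain class attribute, which also keeps the value readable on the class itself.

```python
from abc import ABC, abstractmethod


class BaseImage(ABC):
    """Simplified stand-in for the abstract base image class (no Dagger)."""

    def __init__(self, version: str):
        self.version = version

    @property
    @abstractmethod
    def repository(self) -> str:
        raise NotImplementedError("Subclasses must define a 'repository' attribute.")

    @property
    def name_with_tag(self) -> str:
        return f"{self.repository}:{self.version}"


class PythonBaseImage(BaseImage):
    # A plain class attribute overrides the abstract property, so the value
    # is available both on instances and on the class without instantiation.
    repository = "airbyte/python-connector-base"


image = PythonBaseImage("1.0.0")
print(image.name_with_tag)

abstract_instantiation_failed = False
try:
    BaseImage("1.0.0")  # abstract members are still missing here
except TypeError:
    abstract_instantiation_failed = True
print(abstract_instantiation_failed)
```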
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,189 @@ | ||
# | ||
# Copyright (c) 2023 Airbyte, Inc., all rights reserved. | ||
# | ||
import argparse | ||
import sys | ||
from typing import Callable, Type | ||
|
||
import anyio | ||
import dagger | ||
import inquirer # type: ignore | ||
import semver | ||
from base_images import bases, console, consts, errors, hacks, publish, utils, version_registry | ||
from jinja2 import Environment, FileSystemLoader | ||
|
||
|
||
Review comment (on the docstring below): 😍 Love the doc string!
async def _generate_docs(dagger_client: dagger.Client): | ||
"""This function will generate the README.md file from the templates/README.md.j2 template. | ||
|
||
It will first load all the registries to render the template with up to date information. | ||
""" | ||
docker_credentials = utils.docker.get_credentials() | ||
env = Environment(loader=FileSystemLoader("base_images/templates")) | ||
template = env.get_template("README.md.j2") | ||
rendered_template = template.render({"registries": await version_registry.get_all_registries(dagger_client, docker_credentials)}) | ||
with open("README.md", "w") as readme: | ||
readme.write(rendered_template) | ||
console.log("README.md generated successfully.") | ||
|
||
|
||
async def _generate_release(dagger_client: dagger.Client): | ||
"""This function will cut a new version on top of the previous one. It will prompt the user for release details: version bump, changelog entry. | ||
The user can optionally publish the new version to our remote registry. | ||
If the version is not published its changelog entry is still persisted. | ||
It can later be published by running the publish command. | ||
In the future we might only allow publishing new pre-release versions from this flow. | ||
""" | ||
docker_credentials = utils.docker.get_credentials() | ||
select_base_image_class_answers = inquirer.prompt( | ||
[ | ||
inquirer.List( | ||
"BaseImageClass", | ||
message="Which base image would you like to release a new version for?", | ||
choices=[(BaseImageClass.repository, BaseImageClass) for BaseImageClass in version_registry.MANAGED_BASE_IMAGES], | ||
) | ||
] | ||
) | ||
BaseImageClass = select_base_image_class_answers["BaseImageClass"] | ||
registry = await version_registry.VersionRegistry.load(BaseImageClass, dagger_client, docker_credentials) | ||
latest_entry = registry.latest_entry | ||
|
||
# If there is no latest entry, it means we have no version yet: the registry is empty | ||
# New version will be cut on top of 0.0.0 so this one will actually never be published | ||
seed_version = semver.VersionInfo.parse("0.0.0") | ||
if latest_entry is None: | ||
latest_version = seed_version | ||
else: | ||
latest_version = latest_entry.version | ||
|
||
if latest_version != seed_version and not latest_entry.published: # type: ignore | ||
console.log( | ||
f"The latest version of {BaseImageClass.repository} ({latest_version}) has not been published yet. Please publish it first before cutting a new version." | ||
) | ||
sys.exit(1) | ||
|
||
new_version_answers = inquirer.prompt( | ||
[ | ||
inquirer.List( | ||
"new_version", | ||
message=f"Which kind of new version would you like to cut? (latest version is {latest_version})", | ||
choices=[ | ||
("prerelease", latest_version.bump_prerelease()), | ||
("patch", latest_version.bump_patch()), | ||
("minor", latest_version.bump_minor()), | ||
("major", latest_version.bump_major()), | ||
], | ||
), | ||
inquirer.Text("changelog_entry", message="What should the changelog entry be?", validate=lambda _, entry: len(entry) > 0), | ||
inquirer.Confirm("publish_now", message="Would you like to publish it to our remote registry now?"), | ||
] | ||
) | ||
new_version, changelog_entry, publish_now = ( | ||
new_version_answers["new_version"], | ||
new_version_answers["changelog_entry"], | ||
new_version_answers["publish_now"], | ||
) | ||
|
||
base_image_version = BaseImageClass(dagger_client, new_version) | ||
|
||
try: | ||
await publish.run_sanity_checks(base_image_version) | ||
console.log("Sanity checks passed.") | ||
except errors.SanityCheckError as e: | ||
console.log(f"Sanity checks failed: {e}") | ||
console.log("Aborting.") | ||
sys.exit(1) | ||
dockerfile_example = hacks.get_container_dockerfile(base_image_version.get_container(consts.PLATFORMS_WE_PUBLISH_FOR[0])) | ||
|
||
# At this step we can create a changelog entry: the image built successfully and the sanity checks passed. | ||
changelog_entry = version_registry.ChangelogEntry(new_version, changelog_entry, dockerfile_example) | ||
if publish_now: | ||
published_docker_image = await publish.publish_to_remote_registry(base_image_version) | ||
console.log(f"Published {published_docker_image.address} successfully.") | ||
else: | ||
published_docker_image = None | ||
console.log( | ||
f"Skipping publication. You can publish it later by running `poetry run publish {base_image_version.repository} {new_version}`." | ||
) | ||
|
||
new_registry_entry = version_registry.VersionRegistryEntry(published_docker_image, changelog_entry, new_version) | ||
registry.add_entry(new_registry_entry) | ||
console.log(f"Added {new_version} to the registry.") | ||
await _generate_docs(dagger_client) | ||
console.log("Generated docs successfully.") | ||
|
||
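The guard at the start of the release flow (a new version can only be cut when the registry is empty or the latest version is already published) can be isolated as a small pure function, which makes it easy to test. This is a sketch under assumptions: `RegistryEntry` and `can_cut_new_version` are hypothetical names, and versions are plain strings rather than `semver` objects.

```python
from dataclasses import dataclass
from typing import Optional

SEED_VERSION = "0.0.0"  # placeholder version used when the registry is empty


@dataclass(frozen=True)
class RegistryEntry:
    """Minimal stand-in for a registry entry (hypothetical type)."""

    version: str
    published: bool


def can_cut_new_version(latest_entry: Optional[RegistryEntry]) -> bool:
    """Allow a new version only if nothing unpublished is pending."""
    if latest_entry is None:
        # Empty registry: the new version is cut on top of the seed version.
        return True
    return latest_entry.version == SEED_VERSION or latest_entry.published


print(can_cut_new_version(None))
print(can_cut_new_version(RegistryEntry("1.0.0", published=True)))
print(can_cut_new_version(RegistryEntry("1.1.0", published=False)))
```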
|
||
async def _publish( | ||
dagger_client: dagger.Client, BaseImageClassToPublish: Type[bases.AirbyteConnectorBaseImage], version: semver.VersionInfo | ||
): | ||
"""This function will publish a specific version of a base image to our remote registry. | ||
Users are prompted for confirmation before overwriting an existing version. | ||
If the version does not exist in the registry, the flow is aborted and the user is advised to cut a new version first. | ||
""" | ||
docker_credentials = utils.docker.get_credentials() | ||
registry = await version_registry.VersionRegistry.load(BaseImageClassToPublish, dagger_client, docker_credentials) | ||
registry_entry = registry.get_entry_for_version(version) | ||
if not registry_entry: | ||
console.log(f"No entry found for version {version} in the registry. Please cut a new version first: `poetry run generate-release`") | ||
sys.exit(1) | ||
if registry_entry.published: | ||
force_answers = inquirer.prompt( | ||
[ | ||
inquirer.Confirm( | ||
"force", message="This version has already been published to our remote registry. Would you like to overwrite it?" | ||
), | ||
] | ||
) | ||
if not force_answers["force"]: | ||
console.log("Not overwriting the already existing image.") | ||
sys.exit(0) | ||
|
||
base_image_version = BaseImageClassToPublish(dagger_client, version) | ||
published_docker_image = await publish.publish_to_remote_registry(base_image_version) | ||
console.log(f"Published {published_docker_image.address} successfully.") | ||
await _generate_docs(dagger_client) | ||
console.log("Generated docs successfully.") | ||
|
||
|
||
async def execute_async_command(command_fn: Callable, *args, **kwargs): | ||
"""This is a helper function that will execute a command function in an async context, required by the use of Dagger.""" | ||
async with dagger.Connection(dagger.Config(log_output=sys.stderr)) as dagger_client: | ||
await command_fn(dagger_client, *args, **kwargs) | ||
|
||
|
||
def generate_docs(): | ||
"""This command will generate the README.md file from the templates/README.md.j2 template. | ||
It will first load all the registries to render the template with up to date information. | ||
""" | ||
anyio.run(execute_async_command, _generate_docs) | ||
|
||
|
||
def generate_release(): | ||
"""This command will cut a new version on top of the previous one. It will prompt the user for release details: version bump, changelog entry. | ||
The user can optionally publish the new version to our remote registry. | ||
If the version is not published its changelog entry is still persisted. | ||
It can later be published by running the publish command. | ||
In the future we might only allow publishing new pre-release versions from this flow. | ||
""" | ||
anyio.run(execute_async_command, _generate_release) | ||
|
||
|
||
def publish_existing_version(): | ||
"""This command is intended to be used when: | ||
- We have a changelog entry for a new version but it's not published yet (for future publish on merge flows). | ||
- We have a good reason to overwrite an existing version in the remote registry. | ||
""" | ||
parser = argparse.ArgumentParser(description="Publish a specific version of a base image to our remote registry.") | ||
parser.add_argument("repository", help="The base image repository name") | ||
parser.add_argument("version", help="The version to publish") | ||
args = parser.parse_args() | ||
|
||
version = semver.VersionInfo.parse(args.version) | ||
BaseImageClassToPublish = None | ||
for BaseImageClass in version_registry.MANAGED_BASE_IMAGES: | ||
if BaseImageClass.repository == args.repository: | ||
BaseImageClassToPublish = BaseImageClass | ||
if BaseImageClassToPublish is None: | ||
console.log(f"Unknown base image name: {args.repository}") | ||
sys.exit(1) | ||
|
||
anyio.run(execute_async_command, _publish, BaseImageClassToPublish, version) |
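The argument parsing and class lookup in `publish_existing_version` can be exercised on their own, without Dagger or a registry. In the sketch below, `PythonBase`, `JavaBase`, `find_base_image_class`, and `parse_publish_args` are hypothetical names, and `airbyte/java-connector-base` is an invented repository used only to give the lookup more than one candidate.

```python
import argparse
from typing import Optional, Sequence


class PythonBase:
    repository = "airbyte/python-connector-base"


class JavaBase:  # hypothetical second image, for illustration only
    repository = "airbyte/java-connector-base"


MANAGED_BASE_IMAGES = [PythonBase, JavaBase]


def find_base_image_class(repository: str, managed: Sequence[type]) -> Optional[type]:
    """Return the first managed class whose repository matches, else None."""
    return next((cls for cls in managed if cls.repository == repository), None)


def parse_publish_args(argv: Sequence[str]) -> argparse.Namespace:
    """Parse the two positional arguments of the publish command."""
    parser = argparse.ArgumentParser(description="Publish a specific version of a base image.")
    parser.add_argument("repository", help="The base image repository name")
    parser.add_argument("version", help="The version to publish")
    return parser.parse_args(argv)


args = parse_publish_args(["airbyte/python-connector-base", "1.0.0"])
selected = find_base_image_class(args.repository, MANAGED_BASE_IMAGES)
print(selected.__name__ if selected else f"Unknown base image name: {args.repository}")
```

Passing `argv` explicitly, instead of letting `argparse` read `sys.argv`, is what makes the parser testable in isolation.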