Commit 9179ec5

create auto_merge package (#38019)
1 parent debca39 commit 9179ec5

13 files changed: +1069 -2 lines changed

airbyte-ci/README.md

+1
@@ -15,3 +15,4 @@ The installation instructions for the `airbyte-ci` CLI tool cal be found here
 | [`connectors_qa`](connectors/connectors_qa/) | A tool to verify connectors have sounds assets and metadata. |
 | [`metadata_service`](connectors/metadata_service/) | Tools to generate connector metadata and registry. |
 | [`pipelines`](connectors/pipelines/) | Airbyte CI pipelines, including formatting, linting, building, testing connectors, etc. Connector acceptance tests live here. |
+| [`auto_merge`](connectors/auto_merge/) | A tool to automatically merge connector pull requests. |
+49
@@ -0,0 +1,49 @@
+# `Auto merge`
+
+## Purpose
+
+This Python package is made to merge pull requests automatically on the Airbyte Repo. It is used in
+the [following workflow](.github/workflows/auto_merge.yml).
+
+A pull request is currently considered as auto-mergeable if:
+
+- It has the `auto-merge` Github label
+- It only modifies files in connector-related directories
+- All the required checks have passed
+
+We want to auto-merge a specific set of connector pull requests to simplify the connector updates in
+the following use cases:
+
+- Pull requests updating Python dependencies or the connector base image
+- Community contributions when they've been reviewed and approved by our team but CI is still
+  running: to avoid an extra review iteration just to check CI status.
+
+## Install and usage
+
+### Get a Github token
+
+You need to create a Github token with the following permissions:
+
+- Read access to the repository to list open pull requests and their statuses
+- Write access to the repository to merge pull requests
+
+### Local install and run
+
+```
+poetry install
+export GITHUB_TOKEN=<your_github_token>
+# By default no merge will be done, you need to set the AUTO_MERGE_PRODUCTION environment variable to true to actually merge the PRs
+poetry run auto-merge
+```
+
+### In CI
+
+```
+export GITHUB_TOKEN=<your_github_token>
+export AUTO_MERGE_PRODUCTION=true
+poetry install
+poetry run auto-merge
+```
+
+The execution will set the `GITHUB_STEP_SUMMARY` env var with a markdown summary of the PRs that
+have been merged.
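
On GitHub Actions runners, `GITHUB_STEP_SUMMARY` holds the path of a file, and markdown appended to that file is what appears on the job summary page. A minimal sketch of publishing a summary that way follows. `append_to_step_summary` is a hypothetical helper introduced here for illustration; it is not part of this commit.

```python
import os


def append_to_step_summary(markdown: str) -> bool:
    """Append markdown to the file named by GITHUB_STEP_SUMMARY (hypothetical helper).

    Returns False, and writes nothing, when the variable is unset,
    which is the usual situation outside of GitHub Actions.
    """
    summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
    if not summary_path:
        return False
    # Append rather than overwrite, so several calls within the same step accumulate.
    with open(summary_path, "a", encoding="utf-8") as summary_file:
        summary_file.write(markdown.rstrip("\n") + "\n")
    return True
```

Returning a boolean instead of raising keeps local dry runs quiet: when the variable is absent the caller can simply log the markdown instead.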

airbyte-ci/connectors/auto_merge/poetry.lock

+754
Some generated files are not rendered by default.
@@ -0,0 +1,42 @@
+[tool.poetry]
+name = "auto-merge"
+version = "0.1.0"
+description = ""
+authors = ["Airbyte <[email protected]>"]
+readme = "README.md"
+packages = [
+    { include = "auto_merge", from = "src" },
+]
+
+[tool.poetry.dependencies]
+python = "^3.10"
+pygithub = "^2.3.0"
+anyio = "^4.3.0"
+
+
+[tool.poetry.group.dev.dependencies]
+mypy = "^1.10.0"
+ruff = "^0.4.3"
+pytest = "^8.2.0"
+pyinstrument = "^4.6.2"
+
+[tool.ruff.lint]
+select = [
+    "I" # isort
+]
+
+[tool.poetry.scripts]
+auto-merge = "auto_merge.main:auto_merge"
+
+[build-system]
+requires = ["poetry-core"]
+build-backend = "poetry.core.masonry.api"
+
+[tool.poe.tasks]
+test = "pytest tests"
+type_check = "mypy src --disallow-untyped-defs"
+lint = "ruff check src"
+
+[tool.airbyte_ci]
+optional_poetry_groups = ["dev"]
+poe_tasks = ["type_check", "lint",]

airbyte-ci/connectors/auto_merge/src/auto_merge/__init__.py

Whitespace-only changes.
@@ -0,0 +1,12 @@
+# Copyright (c) 2024 Airbyte, Inc., all rights reserved.
+
+from __future__ import annotations
+
+AIRBYTE_REPO = "airbytehq/airbyte"
+AUTO_MERGE_LABEL = "auto-merge"
+BASE_BRANCH = "master"
+CONNECTOR_PATH_PREFIXES = {
+    "airbyte-integrations/connectors",
+    "docs/integrations/sources",
+    "docs/integrations/destinations",
+}
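
The `only_modifies_connectors` validator in this commit tests each changed file against these prefixes with `str.startswith`. The sketch below shows how that plain prefix match behaves, next to a directory-aware variant. Both function names and the sample paths are hypothetical, chosen only for illustration.

```python
CONNECTOR_PATH_PREFIXES = {
    "airbyte-integrations/connectors",
    "docs/integrations/sources",
    "docs/integrations/destinations",
}


def is_connector_path(filename: str) -> bool:
    """Plain prefix match, the same test the validator applies per file."""
    return any(filename.startswith(prefix) for prefix in CONNECTOR_PATH_PREFIXES)


def is_connector_path_strict(filename: str) -> bool:
    """Directory-aware variant (an assumption, not in the commit):
    the prefix must be followed by a path separator."""
    return any(filename.startswith(prefix + "/") for prefix in CONNECTOR_PATH_PREFIXES)


# Hypothetical sample paths.
inside = "airbyte-integrations/connectors/source-faker/metadata.yaml"
sibling = "airbyte-integrations/connectors-performance/README.md"

print(is_connector_path(inside), is_connector_path_strict(inside))
print(is_connector_path(sibling), is_connector_path_strict(sibling))
```

A plain prefix match also accepts sibling directories whose names merely begin with a listed prefix. Whether that matters depends on which directories actually exist in the repository.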
@@ -0,0 +1,8 @@
+# Copyright (c) 2024 Airbyte, Inc., all rights reserved.
+
+from __future__ import annotations
+
+import os
+
+GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]
+PRODUCTION = os.environ.get("AUTO_MERGE_PRODUCTION", "false").lower() == "true"
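
The production flag is enabled only by the literal string `true`, compared case-insensitively. Any other value, including `1` or `yes`, leaves the tool in dry-run mode. The sketch below restates that rule as a function over an arbitrary mapping so it can be exercised without touching the real environment. `is_production` is a hypothetical name.

```python
from collections.abc import Mapping


def is_production(env: Mapping[str, str]) -> bool:
    """Same expression as the diff, applied to any mapping instead of os.environ."""
    return env.get("AUTO_MERGE_PRODUCTION", "false").lower() == "true"


for value in (None, "true", "TRUE", "1", "yes"):
    env = {} if value is None else {"AUTO_MERGE_PRODUCTION": value}
    print(repr(value), is_production(env))
```

Note also that `GITHUB_TOKEN` is read with `os.environ[...]` at import time, so importing the module without that variable set raises `KeyError`.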
@@ -0,0 +1,27 @@
+# Copyright (c) 2024 Airbyte, Inc., all rights reserved.
+
+from __future__ import annotations
+
+import time
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+    from github.PullRequest import PullRequest
+
+
+def generate_job_summary_as_markdown(merged_prs: list[PullRequest]) -> str:
+    """Generate a markdown summary of the merged PRs
+
+    Args:
+        merged_prs (list[PullRequest]): The PRs that were merged
+
+    Returns:
+        str: The markdown summary
+    """
+    summary_time = time.strftime("%Y-%m-%d %H:%M:%S")
+    header = "# Auto-merged PRs"
+    details = f"Summary generated at {summary_time}"
+    if not merged_prs:
+        return f"{header}\n\n{details}\n\n**No PRs were auto-merged**\n"
+    merged_pr_list = "\n".join([f"- [#{pr.number} - {pr.title}]({pr.html_url})" for pr in merged_prs])
+    return f"{header}\n\n{details}\n\n{merged_pr_list}\n"
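
The summary function only reads `number`, `title` and `html_url` from each pull request, so it can be tried without PyGithub. In the sketch below, `StubPR` is a hypothetical stand-in for `github.PullRequest.PullRequest`, the URLs are placeholders, and the function body is copied from the diff above.

```python
from __future__ import annotations

import time
from dataclasses import dataclass


@dataclass
class StubPR:
    """Hypothetical stand-in exposing only the attributes the summary reads."""

    number: int
    title: str
    html_url: str


def generate_job_summary_as_markdown(merged_prs: list[StubPR]) -> str:
    # Body copied from the diff; only the type annotation differs.
    summary_time = time.strftime("%Y-%m-%d %H:%M:%S")
    header = "# Auto-merged PRs"
    details = f"Summary generated at {summary_time}"
    if not merged_prs:
        return f"{header}\n\n{details}\n\n**No PRs were auto-merged**\n"
    merged_pr_list = "\n".join([f"- [#{pr.number} - {pr.title}]({pr.html_url})" for pr in merged_prs])
    return f"{header}\n\n{details}\n\n{merged_pr_list}\n"


prs = [
    StubPR(1, "Bump base image", "https://example.com/pr/1"),
    StubPR(2, "Update dependencies", "https://example.com/pr/2"),
]
print(generate_job_summary_as_markdown(prs))
print(generate_job_summary_as_markdown([]))
```

Each merged pull request becomes one markdown bullet linking to its page; an empty list produces the explicit "No PRs were auto-merged" line rather than an empty section.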
@@ -0,0 +1,118 @@
+# Copyright (c) 2024 Airbyte, Inc., all rights reserved.
+
+from __future__ import annotations
+
+import logging
+import os
+import time
+from collections.abc import Iterable, Iterator
+from contextlib import contextmanager
+from typing import TYPE_CHECKING
+
+from github import Auth, Github
+
+from .consts import AIRBYTE_REPO, AUTO_MERGE_LABEL, BASE_BRANCH, CONNECTOR_PATH_PREFIXES
+from .env import GITHUB_TOKEN, PRODUCTION
+from .helpers import generate_job_summary_as_markdown
+from .pr_validators import ENABLED_VALIDATORS
+
+if TYPE_CHECKING:
+    from github.Commit import Commit as GithubCommit
+    from github.PullRequest import PullRequest
+    from github.Repository import Repository as GithubRepo
+
+logging.basicConfig()
+logger = logging.getLogger("auto_merge")
+logger.setLevel(logging.INFO)
+
+
+@contextmanager
+def github_client() -> Iterator[Github]:
+    client = None
+    try:
+        client = Github(auth=Auth.Token(GITHUB_TOKEN), seconds_between_requests=0)
+        yield client
+    finally:
+        if client:
+            client.close()
+
+
+def check_if_pr_is_auto_mergeable(head_commit: GithubCommit, pr: PullRequest, required_checks: set[str]) -> bool:
+    """Run all enabled validators and return if they all pass.
+
+    Args:
+        head_commit (GithubCommit): The head commit of the PR
+        pr (PullRequest): The PR to check
+        required_checks (set[str]): The set of required passing checks
+
+    Returns:
+        bool: True if the PR is auto-mergeable, False otherwise
+    """
+    for validator in ENABLED_VALIDATORS:
+        is_valid, error = validator(head_commit, pr, required_checks)
+        if not is_valid:
+            if error:
+                logger.info(f"PR #{pr.number} - {error}")
+            return False
+    return True
+
+
+def process_pr(repo: GithubRepo, pr: PullRequest, required_passing_contexts: set[str], dry_run: bool) -> None | PullRequest:
+    """Process a PR to see if it is auto-mergeable and merge it if it is.
+
+    Args:
+        repo (GithubRepo): The repository the PR is in
+        pr (PullRequest): The PR to process
+        required_passing_contexts (set[str]): The set of required passing checks
+        dry_run (bool): Whether to actually merge the PR or not
+
+    Returns:
+        None | PullRequest: The PR if it was merged, None otherwise
+    """
+    logger.info(f"Processing PR #{pr.number}")
+    head_commit = repo.get_commit(sha=pr.head.sha)
+    if check_if_pr_is_auto_mergeable(head_commit, pr, required_passing_contexts):
+        if not dry_run:
+            pr.merge()
+            logger.info(f"PR #{pr.number} was auto-merged")
+            return pr
+        else:
+            logger.info(f"PR #{pr.number} is auto-mergeable but dry-run is enabled")
+    return None
+
+
+def back_off_if_rate_limited(github_client: Github) -> None:
+    """Sleep if the rate limit is reached
+
+    Args:
+        github_client (Github): The Github client to check the rate limit of
+    """
+    remaining_requests, _ = github_client.rate_limiting
+    if remaining_requests < 100:
+        logging.warning(f"Rate limit almost reached. Remaining requests: {remaining_requests}")
+    if remaining_requests == 0:
+        logging.warning(f"Rate limited. Sleeping for {github_client.rate_limiting_resettime - time.time()} seconds")
+        time.sleep(github_client.rate_limiting_resettime - time.time())
+    return None
+
+
+def auto_merge() -> None:
+    """Main function to auto-merge PRs that are candidates for auto-merge.
+    If the AUTO_MERGE_PRODUCTION environment variable is not set to "true", this will be a dry run.
+    """
+    dry_run = PRODUCTION is False
+    with github_client() as gh_client:
+        repo = gh_client.get_repo(AIRBYTE_REPO)
+        main_branch = repo.get_branch(BASE_BRANCH)
+        logger.info(f"Fetching required passing contexts for {BASE_BRANCH}")
+        required_passing_contexts = set(main_branch.get_required_status_checks().contexts)
+        candidate_issues = gh_client.search_issues(f"repo:{AIRBYTE_REPO} is:pr label:{AUTO_MERGE_LABEL} base:{BASE_BRANCH} state:open")
+        prs = [issue.as_pull_request() for issue in candidate_issues]
+        logger.info(f"Found {len(prs)} open PRs targeting {BASE_BRANCH} with the {AUTO_MERGE_LABEL} label")
+        merged_prs = []
+        for pr in prs:
+            back_off_if_rate_limited(gh_client)
+            if merged_pr := process_pr(repo, pr, required_passing_contexts, dry_run):
+                merged_prs.append(merged_pr)
+        if PRODUCTION:
+            os.environ["GITHUB_STEP_SUMMARY"] = generate_job_summary_as_markdown(merged_prs)
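
The back-off logic sleeps for the difference between the rate-limit reset timestamp and the current time. `time.sleep` raises `ValueError` when given a negative number, which could in principle happen if the reset moment has already passed when the subtraction runs. The sketch below isolates the wait computation as a pure function that clamps at zero. `seconds_until_reset` is a hypothetical helper, not part of this commit.

```python
from __future__ import annotations

import time


def seconds_until_reset(remaining_requests: int, reset_epoch: float, now: float | None = None) -> float:
    """Return how long to sleep before the next request (hypothetical helper).

    remaining_requests: requests left in the current rate-limit window
    reset_epoch: Unix timestamp at which the window resets
    now: injectable clock value, defaults to time.time()
    """
    if remaining_requests > 0:
        return 0.0
    current = time.time() if now is None else now
    # Clamp at zero so a reset time in the past never yields a negative sleep.
    return max(0.0, reset_epoch - current)


print(seconds_until_reset(5, reset_epoch=1000.0, now=900.0))
print(seconds_until_reset(0, reset_epoch=1000.0, now=900.0))
print(seconds_until_reset(0, reset_epoch=1000.0, now=1200.0))
```

Computing the duration once also means the logged value and the value actually slept are the same number.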
@@ -0,0 +1,54 @@
+# Copyright (c) 2024 Airbyte, Inc., all rights reserved.
+
+from __future__ import annotations
+
+from typing import TYPE_CHECKING, Optional, Tuple
+
+from .consts import AUTO_MERGE_LABEL, BASE_BRANCH, CONNECTOR_PATH_PREFIXES
+
+if TYPE_CHECKING:
+    from github.Commit import Commit as GithubCommit
+    from github.PullRequest import PullRequest
+
+
+def has_auto_merge_label(head_commit: GithubCommit, pr: PullRequest, required_checks: set[str]) -> Tuple[bool, Optional[str]]:
+    has_auto_merge_label = any(label.name == AUTO_MERGE_LABEL for label in pr.labels)
+    if not has_auto_merge_label:
+        return False, f"does not have the {AUTO_MERGE_LABEL} label"
+    return True, None
+
+
+def targets_main_branch(head_commit: GithubCommit, pr: PullRequest, required_checks: set[str]) -> Tuple[bool, Optional[str]]:
+    if not pr.base.ref == BASE_BRANCH:
+        return False, f"does not target {BASE_BRANCH}"
+    return True, None
+
+
+def only_modifies_connectors(head_commit: GithubCommit, pr: PullRequest, required_checks: set[str]) -> Tuple[bool, Optional[str]]:
+    modified_files = pr.get_files()
+    for file in modified_files:
+        if not any(file.filename.startswith(prefix) for prefix in CONNECTOR_PATH_PREFIXES):
+            return False, "is not only modifying connectors"
+    return True, None
+
+
+def head_commit_passes_all_required_checks(
+    head_commit: GithubCommit, pr: PullRequest, required_checks: set[str]
+) -> Tuple[bool, Optional[str]]:
+    successful_status_contexts = [commit_status.context for commit_status in head_commit.get_statuses() if commit_status.state == "success"]
+    successful_check_runs = [check_run.name for check_run in head_commit.get_check_runs() if check_run.conclusion == "success"]
+    successful_contexts = set(successful_status_contexts + successful_check_runs)
+    if not required_checks.issubset(successful_contexts):
+        return False, "not all required checks passed"
+    return True, None
+
+
+# A PR is considered auto-mergeable if:
+# - it has the AUTO_MERGE_LABEL
+# - it targets the BASE_BRANCH
+# - it touches only files in CONNECTOR_PATH_PREFIXES
+# - the head commit passes all required checks
+
+# PLEASE BE CAREFUL OF THE VALIDATOR ORDERING
+# Let's declared faster checks first as the check_if_pr_is_auto_mergeable function fails fast.
+ENABLED_VALIDATORS = [has_auto_merge_label, targets_main_branch, only_modifies_connectors, head_commit_passes_all_required_checks]
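
The ordering comment matters because the caller stops at the first failing validator, so later and more expensive checks are never reached for a PR that fails an early one. The sketch below illustrates that fail-fast behaviour with plain dictionaries standing in for PyGithub objects. The `pr` dictionary shape, the `calls` log and the function names are all assumptions made for this illustration.

```python
from __future__ import annotations

from typing import Callable, Optional, Tuple

ValidationResult = Tuple[bool, Optional[str]]
calls: list[str] = []  # records which validators actually ran


def has_label(pr: dict) -> ValidationResult:
    calls.append("has_label")
    if "auto-merge" not in pr["labels"]:
        return False, "does not have the auto-merge label"
    return True, None


def required_checks_passed(pr: dict) -> ValidationResult:
    calls.append("required_checks_passed")
    # Same subset test as the diff: every required context must be among the successes.
    if not set(pr["required_checks"]).issubset(pr["successful_checks"]):
        return False, "not all required checks passed"
    return True, None


# Cheap check first, costlier check last.
VALIDATORS: list[Callable[[dict], ValidationResult]] = [has_label, required_checks_passed]


def first_failure(pr: dict) -> Optional[str]:
    """Return the first failing validator's message, or None if all pass."""
    for validator in VALIDATORS:
        is_valid, error = validator(pr)
        if not is_valid:
            return error
    return None


unlabeled = {"labels": [], "required_checks": ["ci"], "successful_checks": ["ci"]}
print(first_failure(unlabeled), calls)
```

For the unlabeled PR only `has_label` is recorded in `calls`; the check-status validator, which in the real package requires extra API requests, is skipped entirely.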

airbyte-ci/connectors/pipelines/README.md

+2-1
@@ -745,7 +745,8 @@ E.G.: running Poe tasks on the modified internal packages of the current branch:
 
 | Version | PR | Description |
 | ------- | ---------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
-| 4.13.0 | [#32715](https://github.com/airbytehq/airbyte/pull/32715) | Tag connector metadata with git info |
+| 4.13.1 | [#38020](https://github.com/airbytehq/airbyte/pull/38020) | Add `auto_merge` as an internal package to test. |
+| 4.13.0 | [#32715](https://github.com/airbytehq/airbyte/pull/32715) | Tag connector metadata with git info |
 | 4.12.7 | [#37787](https://github.com/airbytehq/airbyte/pull/37787) | Remove requirements on dockerhub credentials to run QA checks. |
 | 4.12.6 | [#36497](https://github.com/airbytehq/airbyte/pull/36497) | Add airbyte-cdk to list of poetry packages for testing |
 | 4.12.5 | [#37785](https://github.com/airbytehq/airbyte/pull/37785) | Set the `--yes-auto-update` flag to `True` by default. |

airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/test/__init__.py

+1
@@ -5,6 +5,7 @@
 from pathlib import Path
 
 INTERNAL_POETRY_PACKAGES = [
+    "airbyte-ci/connectors/auto_merge",
     "airbyte-ci/connectors/pipelines",
     "airbyte-ci/connectors/base_images",
     "airbyte-ci/connectors/common_utils",

airbyte-ci/connectors/pipelines/pyproject.toml

+1-1
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
 
 [tool.poetry]
 name = "pipelines"
-version = "4.13.0"
+version = "4.13.1"
 description = "Packaged maintained by the connector operations team to perform CI for connectors' pipelines"
 authors = ["Airbyte <[email protected]>"]
 