Source Github: add integration tests #34933

Merged on Feb 12, 2024 (26 commits)

Commits
9157553
Source Github: add integration tests
artem1205 Feb 5, 2024
e8cbed2
Merge remote-tracking branch 'origin/master' into artem1205/source-gi…
artem1205 Feb 5, 2024
8f85394
Source Github: fix formatting
artem1205 Feb 5, 2024
78596ad
Source Github: fix formatting
artem1205 Feb 5, 2024
ee904f6
Airbyte CDK: add headers
artem1205 Feb 6, 2024
ce41a4f
Source Github: add test for pagination
artem1205 Feb 6, 2024
9f1dee8
Source Github: add test incremental + error
artem1205 Feb 6, 2024
19dca25
Source GitHub: rewrite using responses
artem1205 Feb 6, 2024
d3a75c4
Source GitHub: rewrite using responses
artem1205 Feb 6, 2024
1dd5d94
Revert "Airbyte CDK: add headers"
artem1205 Feb 6, 2024
3834666
Source GitHub: rewrite using responses
artem1205 Feb 6, 2024
c1ef906
Source GitHub: update test
artem1205 Feb 6, 2024
a034ba8
Source GitHub: add transformation check
artem1205 Feb 7, 2024
5707dd8
Source GitHub: Revert pyproject.toml
artem1205 Feb 7, 2024
e321c6d
Source GitHub: Add OrderedRegistry to strictly check order and amount…
artem1205 Feb 7, 2024
775c3a8
Merge remote-tracking branch 'origin/master' into artem1205/source-gi…
artem1205 Feb 7, 2024
eb6772c
Source GitHub: ref
artem1205 Feb 7, 2024
28984f9
Source GitHub: ref
artem1205 Feb 7, 2024
4796b0b
Source GitHub: add bypass reason
artem1205 Feb 8, 2024
7184c40
Merge remote-tracking branch 'origin/master' into artem1205/source-gi…
artem1205 Feb 9, 2024
920b1e0
Source GitHub: ref tests to internal framework
artem1205 Feb 9, 2024
f04026e
Merge remote-tracking branch 'origin/master' into artem1205/source-gi…
artem1205 Feb 12, 2024
edb2549
Source GitHub: ref tests
artem1205 Feb 12, 2024
f972ee7
Source GitHub: update cdk
artem1205 Feb 12, 2024
078d644
Source GitHub: bump version
artem1205 Feb 12, 2024
037bd2a
Merge branch 'master' into artem1205/source-github-integration-tests-…
artem1205 Feb 12, 2024
@@ -32,7 +32,7 @@ acceptance_tests:
 extra_records: yes
 empty_streams:
 - name: "events"
-bypass_reason: "Only events created within the past 90 days can be showed"
+bypass_reason: "Only events created within the past 90 days can be showed. Stream is tested with integration tests."
 ignored_fields:
 contributor_activity:
 - name: weeks
@@ -10,7 +10,7 @@ data:
   connectorSubtype: api
   connectorType: source
   definitionId: ef69ef6e-aa7f-4af1-a01d-ef775033524e
-  dockerImageTag: 1.6.1
+  dockerImageTag: 1.6.2
   dockerRepository: airbyte/source-github
   documentationUrl: https://docs.airbyte.com/integrations/sources/github
   githubIssueLabel: source-github
10 changes: 5 additions & 5 deletions airbyte-integrations/connectors/source-github/poetry.lock

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions airbyte-integrations/connectors/source-github/pyproject.toml
@@ -3,7 +3,7 @@ requires = [ "poetry-core>=1.0.0",]
 build-backend = "poetry.core.masonry.api"

 [tool.poetry]
-version = "1.6.1"
+version = "1.6.2"
 name = "source-github"
 description = "Source implementation for Github."
 authors = [ "Airbyte <[email protected]>",]
@@ -17,7 +17,7 @@ include = "source_github"

 [tool.poetry.dependencies]
 python = "^3.9,<3.12"
-airbyte-cdk = "==0.60.1"
+airbyte-cdk = "^0.62.1"
 sgqlc = "==16.3"

 [tool.poetry.scripts]
@@ -12,18 +12,8 @@
 def rate_limit_mock_response():
     rate_limit_response = {
         "resources": {
-            "core": {
-                "limit": 5000,
-                "used": 0,
-                "remaining": 5000,
-                "reset": 4070908800
-            },
-            "graphql": {
-                "limit": 5000,
-                "used": 0,
-                "remaining": 5000,
-                "reset": 4070908800
-            }
+            "core": {"limit": 5000, "used": 0, "remaining": 5000, "reset": 4070908800},
+            "graphql": {"limit": 5000, "used": 0, "remaining": 5000, "reset": 4070908800},
         }
     }
     responses.add(responses.GET, "https://api.github.com/rate_limit", json=rate_limit_response)
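
The `reset` field in the mocked rate-limit payload is a Unix timestamp in seconds. As a small standard-library sketch (the helper name `describe_reset` is hypothetical and not part of the connector), decoding it shows that the mocked quota resets far in the future, which presumably keeps the rate-limit check from ever waiting during a test run:

```python
from datetime import datetime, timezone


def describe_reset(reset_epoch: int) -> datetime:
    """Convert a `reset` epoch (seconds) into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(reset_epoch, tz=timezone.utc)


reset_at = describe_reset(4070908800)
print(reset_at.isoformat())  # 2099-01-01T00:00:00+00:00
```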
@@ -0,0 +1,35 @@
# Copyright (c) 2023 Airbyte, Inc., all rights reserved.

from datetime import datetime
from typing import Any, Dict, List


class ConfigBuilder:
    def __init__(self) -> None:
        self._config: Dict[str, Any] = {
            "credentials": {"option_title": "PAT Credentials", "personal_access_token": "GITHUB_TEST_TOKEN"},
            "start_date": "2020-05-01T00:00:00Z",
        }

    def with_repositories(self, repositories: List[str]) -> "ConfigBuilder":
        self._config["repositories"] = repositories
        return self

    def with_client_secret(self, client_secret: str) -> "ConfigBuilder":
        self._config["client_secret"] = client_secret
        return self

    def with_start_date(self, start_datetime: datetime) -> "ConfigBuilder":
        self._config["start_date"] = start_datetime.strftime("%Y-%m-%dT%H:%M:%SZ")  # was isoformat()[:-13] + "Z", which breaks without microseconds or a UTC offset
        return self

    def with_branches(self, branches: List[str]) -> "ConfigBuilder":
        self._config["branches"] = branches
        return self

    def with_api_url(self, api_url: str) -> "ConfigBuilder":
        self._config["api_url"] = api_url
        return self

    def build(self) -> Dict[str, Any]:
        return self._config
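
The `start_date` value in this config is a `YYYY-MM-DDTHH:MM:SSZ` string. The sketch below is an illustration only: `to_config_timestamp` is a hypothetical helper and assumes the datetime is already in UTC. It compares `strftime` with a fixed 13-character slice of `isoformat()`, which only fits datetimes that carry both microseconds and a `+00:00` offset:

```python
from datetime import datetime, timezone


def to_config_timestamp(value: datetime) -> str:
    """Format a datetime as YYYY-MM-DDTHH:MM:SSZ (assumes `value` is already UTC)."""
    return value.strftime("%Y-%m-%dT%H:%M:%SZ")


with_micros = datetime(2020, 5, 1, 0, 0, 0, 123456, tzinfo=timezone.utc)
without_micros = datetime(2020, 5, 1, tzinfo=timezone.utc)

# The slice removes exactly ".123456+00:00" (13 characters) in the first case...
sliced_ok = with_micros.isoformat()[:-13] + "Z"
# ...but cuts into the time itself when there are no microseconds.
sliced_bad = without_micros.isoformat()[:-13] + "Z"

print(sliced_ok)   # 2020-05-01T00:00:00Z
print(sliced_bad)  # 2020-05-01T0Z
print(to_config_timestamp(without_micros))  # 2020-05-01T00:00:00Z
```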
@@ -0,0 +1,197 @@
# Copyright (c) 2023 Airbyte, Inc., all rights reserved.

import json
from unittest import TestCase

from airbyte_cdk.models import SyncMode
from airbyte_cdk.test.catalog_builder import CatalogBuilder
from airbyte_cdk.test.entrypoint_wrapper import read
from airbyte_cdk.test.mock_http import HttpMocker, HttpRequest, HttpResponse
from airbyte_cdk.test.mock_http.response_builder import find_template
from airbyte_cdk.test.state_builder import StateBuilder
from airbyte_protocol.models import AirbyteStreamStatus, Level, TraceType
from source_github import SourceGithub

from .config import ConfigBuilder

_CONFIG = ConfigBuilder().with_repositories(["airbytehq/integration-test"]).build()


def _create_catalog(sync_mode: SyncMode = SyncMode.full_refresh):
    return CatalogBuilder().with_stream(name="events", sync_mode=sync_mode).build()


class EventsTest(TestCase):
    def setUp(self) -> None:
        """Base setup for all tests. Add responses for:
        1. rate limit checker
        2. repositories
        3. branches
        """

        self.r_mock = HttpMocker()
        self.r_mock.__enter__()
        self.r_mock.get(
            HttpRequest(
                url="https://api.github.com/rate_limit",
                query_params={},
                headers={
                    "Accept": "application/vnd.github+json",
                    "X-GitHub-Api-Version": "2022-11-28",
                    "Authorization": "token GITHUB_TEST_TOKEN",
                },
            ),
            HttpResponse(
                json.dumps(
                    {
                        "resources": {
                            "core": {"limit": 5000, "used": 0, "remaining": 5000, "reset": 5070908800},
                            "graphql": {"limit": 5000, "used": 0, "remaining": 5000, "reset": 5070908800},
                        }
                    }
                ),
                200,
            ),
        )

        self.r_mock.get(
            HttpRequest(
                url=f"https://api.github.com/repos/{_CONFIG.get('repositories')[0]}",
                query_params={"per_page": 100},
            ),
            HttpResponse(json.dumps({"full_name": "airbytehq/integration-test", "default_branch": "master"}), 200),
        )

        self.r_mock.get(
            HttpRequest(
                url=f"https://api.github.com/repos/{_CONFIG.get('repositories')[0]}/branches",
                query_params={"per_page": 100},
            ),
            HttpResponse(json.dumps([{"repository": "airbytehq/integration-test", "name": "master"}]), 200),
        )

    def tearDown(self) -> None:
        """Stops and resets HttpMocker instance."""
        # unittest only calls `tearDown` (camelCase); a method named `teardown` is never run.
        # The context-manager protocol passes three exception arguments to `__exit__`.
        self.r_mock.__exit__(None, None, None)

    def test_read_full_refresh_no_pagination(self):
        """Ensure http integration and record extraction"""
        self.r_mock.get(
            HttpRequest(
                url=f"https://api.github.com/repos/{_CONFIG.get('repositories')[0]}/events",
                query_params={"per_page": 100},
            ),
            HttpResponse(json.dumps(find_template("events", __file__)), 200),
        )

        source = SourceGithub()
        actual_messages = read(source, config=_CONFIG, catalog=_create_catalog())

        assert len(actual_messages.records) == 2

    def test_read_transformation(self):
        """Ensure transformation applied to all records"""

        self.r_mock.get(
            HttpRequest(
                url=f"https://api.github.com/repos/{_CONFIG.get('repositories')[0]}/events",
                query_params={"per_page": 100},
            ),
            HttpResponse(json.dumps(find_template("events", __file__)), 200),
        )

        source = SourceGithub()
        actual_messages = read(source, config=_CONFIG, catalog=_create_catalog())

        assert len(actual_messages.records) == 2
        assert all(("repository", "airbytehq/integration-test") in x.record.data.items() for x in actual_messages.records)
Contributor

What behavior does this actually test? If it is a different behavior than `assert len(actual_messages.records) == 2`, should we split it into a separate test and give both tests descriptive names?

Contributor Author

Agreed, split into two tests.


    def test_full_refresh_with_pagination(self):
        """Ensure pagination"""
        self.r_mock.get(
            HttpRequest(
                url=f"https://api.github.com/repos/{_CONFIG.get('repositories')[0]}/events",
                query_params={"per_page": 100},
            ),
            HttpResponse(
                body=json.dumps(find_template("events", __file__)),
                status_code=200,
                headers={"Link": '<https://api.github.com/repos/{}/events?page=2>; rel="next"'.format(_CONFIG.get("repositories")[0])},
            ),
        )
        self.r_mock.get(
            HttpRequest(
                url=f"https://api.github.com/repos/{_CONFIG.get('repositories')[0]}/events",
                query_params={"per_page": 100, "page": 2},
            ),
            HttpResponse(
                body=json.dumps(find_template("events", __file__)),
                status_code=200,
            ),
        )
        source = SourceGithub()
        actual_messages = read(source, config=_CONFIG, catalog=_create_catalog())

        assert len(actual_messages.records) == 4

    def test_given_state_more_recent_than_some_records_when_read_incrementally_then_filter_records(self):
        """Ensure incremental sync.
        Stream `Events` is semi-incremental, so all requests will be performed and only new records will be extracted"""

        self.r_mock.get(
            HttpRequest(
                url=f"https://api.github.com/repos/{_CONFIG.get('repositories')[0]}/events",
                query_params={"per_page": 100},
            ),
            HttpResponse(json.dumps(find_template("events", __file__)), 200),
        )

        source = SourceGithub()
        actual_messages = read(
            source,
            config=_CONFIG,
            catalog=_create_catalog(sync_mode=SyncMode.incremental),
            state=StateBuilder()
            .with_stream_state("events", {"airbytehq/integration-test": {"created_at": "2022-06-09T10:00:00Z"}})
            .build(),
        )
        assert len(actual_messages.records) == 1
Contributor

It seems like there are other outcomes of an incremental read that are probably interesting. Should we add another test, such as `test_when_read_incrementally_then_emit_state_message`? I don't see unit tests for Events, so we may want to ensure the state value is correct too.

Contributor Author

Added a test for stream_state.
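
The two incremental tests depend on the stream being semi-incremental: every page is still requested, and older records are dropped on the client side. A rough sketch of that idea, using the two `created_at` values from the events template, might look like the following. The function names are hypothetical, the strict `>` comparison is an assumption, and this is not the connector's actual code:

```python
from typing import Any, Dict, Iterable, List


def filter_newer_than(records: Iterable[Dict[str, Any]], cursor_value: str) -> List[Dict[str, Any]]:
    """Keep records created after the stored cursor.

    ISO-8601 UTC timestamps in one fixed format compare correctly as plain strings.
    """
    return [record for record in records if record["created_at"] > cursor_value]


def next_cursor(records: Iterable[Dict[str, Any]], cursor_value: str) -> str:
    """Return the latest `created_at` seen, never moving the cursor backwards."""
    return max([cursor_value, *(record["created_at"] for record in records)])


records = [
    {"id": "22249084964", "created_at": "2022-06-09T12:47:28Z"},
    {"id": "22237752260", "created_at": "2022-06-08T23:29:25Z"},
]

kept = filter_newer_than(records, "2022-06-09T10:00:00Z")
new_state = next_cursor(records, "2020-06-09T10:00:00Z")

print(len(kept))   # 1
print(new_state)   # 2022-06-09T12:47:28Z
```

With a cursor of `2022-06-09T10:00:00Z` only the `PushEvent` survives, and the emitted cursor is the newest `created_at` in the template, which matches the values asserted in the tests.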


    def test_when_read_incrementally_then_emit_state_message(self):
        """Ensure incremental sync emits correct stream state message"""

        self.r_mock.get(
            HttpRequest(
                url=f"https://api.github.com/repos/{_CONFIG.get('repositories')[0]}/events",
                query_params={"per_page": 100},
            ),
            HttpResponse(json.dumps(find_template("events", __file__)), 200),
        )

        source = SourceGithub()
        actual_messages = read(
            source,
            config=_CONFIG,
            catalog=_create_catalog(sync_mode=SyncMode.incremental),
            state=StateBuilder()
            .with_stream_state("events", {"airbytehq/integration-test": {"created_at": "2020-06-09T10:00:00Z"}})
            .build(),
        )
        assert actual_messages.state_messages[0].state.data == {'events': {'airbytehq/integration-test': {'created_at': '2022-06-09T12:47:28Z'}}}

    def test_read_handles_expected_error_correctly_and_exits_with_complete_status(self):
        """Ensure read() method does not raise an Exception and log message with error is in output"""
        self.r_mock.get(
            HttpRequest(
                url=f"https://api.github.com/repos/{_CONFIG.get('repositories')[0]}/events",
                query_params={"per_page": 100},
            ),
            HttpResponse('{"message":"some_error_message"}', 403),
        )
        source = SourceGithub()
        actual_messages = read(source, config=_CONFIG, catalog=_create_catalog())

        assert Level.ERROR in [x.log.level for x in actual_messages.logs]
        events_stream_complete_message = [x for x in actual_messages.trace_messages if x.trace.type == TraceType.STREAM_STATUS][-1]
        assert events_stream_complete_message.trace.stream_status.stream_descriptor.name == 'events'
        assert events_stream_complete_message.trace.stream_status.status == AirbyteStreamStatus.COMPLETE
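
The pagination test drives the second request through a `Link` response header carrying `rel="next"`. As a minimal standard-library sketch of reading such a header (`next_page_url` is a hypothetical helper, and the connector or the CDK may well do this differently):

```python
import re
from typing import Optional

# Matches one `<url>; rel="name"` entry of a Link header.
_LINK_PATTERN = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_page_url(link_header: str) -> Optional[str]:
    """Return the URL tagged rel="next" in a Link header, or None if there is none."""
    for url, rel in _LINK_PATTERN.findall(link_header):
        if rel == "next":
            return url
    return None


header = '<https://api.github.com/repos/airbytehq/integration-test/events?page=2>; rel="next"'
print(next_page_url(header))
```

When the header has no `rel="next"` entry, the helper returns `None`, which is the natural stopping condition for a pagination loop.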
@@ -0,0 +1,63 @@
[
  {
    "id": "22249084964",
    "type": "PushEvent",
    "actor": {
      "id": 583231,
      "login": "octocat",
      "display_login": "octocat",
      "gravatar_id": "",
      "url": "https://api.github.com/users/octocat",
      "avatar_url": "https://avatars.githubusercontent.com/u/583231?v=4"
    },
    "repo": {
      "id": 1296269,
      "name": "octocat/Hello-World",
      "url": "https://api.github.com/repos/octocat/Hello-World"
    },
    "payload": {
      "push_id": 10115855396,
      "size": 1,
      "distinct_size": 1,
      "ref": "refs/heads/master",
      "head": "7a8f3ac80e2ad2f6842cb86f576d4bfe2c03e300",
      "before": "883efe034920928c47fe18598c01249d1a9fdabd",
      "commits": [
        {
          "sha": "7a8f3ac80e2ad2f6842cb86f576d4bfe2c03e300",
          "author": {
            "email": "[email protected]",
            "name": "Monalisa Octocat"
          },
          "message": "commit",
          "distinct": true,
          "url": "https://api.github.com/repos/octocat/Hello-World/commits/7a8f3ac80e2ad2f6842cb86f576d4bfe2c03e300"
        }
      ]
    },
    "public": true,
    "created_at": "2022-06-09T12:47:28Z"
  },
  {
    "id": "22237752260",
    "type": "WatchEvent",
    "actor": {
      "id": 583231,
      "login": "octocat",
      "display_login": "octocat",
      "gravatar_id": "",
      "url": "https://api.github.com/users/octocat",
      "avatar_url": "https://avatars.githubusercontent.com/u/583231?v=4"
    },
    "repo": {
      "id": 1296269,
      "name": "octocat/Hello-World",
      "url": "https://api.github.com/repos/octocat/Hello-World"
    },
    "payload": {
      "action": "started"
    },
    "public": true,
    "created_at": "2022-06-08T23:29:25Z"
  }
]