🎉 Source Gitlab: Ingest All Accessible Groups #11140

Merged
Changes from 3 commits
@@ -2478,7 +2478,7 @@
description: "Please enter your basic URL from GitLab instance."
private_token:
type: "string"
-title: "Privat Token"
+title: "Private Token"
description: "Log into your GitLab account and then generate a personal\
\ Access Token."
airbyte_secret: true
@@ -36,17 +36,19 @@
     Users,
 )

+from .util import get_group_list
+
 class SourceGitlab(AbstractSource):
     def _generate_main_streams(self, config: Mapping[str, Any]) -> Tuple[GitlabStream, GitlabStream]:
         gids = list(filter(None, config["groups"].split(" ")))
         pids = list(filter(None, config["projects"].split(" ")))

-        if not pids and not gids:
-            raise Exception("Either groups or projects need to be provided for connect to Gitlab API")
-
         auth = TokenAuthenticator(token=config["private_token"])
         auth_params = dict(authenticator=auth, api_url=config["api_url"])

+        if not pids and not gids:
+            gids = get_group_list(**auth_params)
+
         groups = Groups(group_ids=gids, **auth_params)
         if gids:
             projects = GroupProjects(project_ids=pids, parent_stream=groups, **auth_params)
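
The control flow changed by this hunk can be sketched in isolation. This is a minimal illustration, not the connector's code: `split_ids`, `resolve_ids`, and the `discover_groups` callable are hypothetical names, with `discover_groups` standing in for `get_group_list(**auth_params)`.

```python
from typing import Callable, List, Mapping, Tuple


def split_ids(raw: str) -> List[str]:
    """Split a space-separated config value, dropping empty entries."""
    return list(filter(None, raw.split(" ")))


def resolve_ids(
    config: Mapping[str, str],
    discover_groups: Callable[[], List[str]],
) -> Tuple[List[str], List[str]]:
    """Return (group_ids, project_ids); discover groups only when both are blank."""
    gids = split_ids(config["groups"])
    pids = split_ids(config["projects"])
    if not pids and not gids:
        # Hypothetical stand-in for get_group_list(**auth_params) in the diff.
        gids = discover_groups()
    return gids, pids


if __name__ == "__main__":
    fake_discovery = lambda: ["my-group", "parent%2fchild"]
    # Both fields blank: fall back to discovery instead of raising.
    print(resolve_ids({"groups": "", "projects": ""}, fake_discovery))
    # Explicit groups: discovery is never called.
    print(resolve_ids({"groups": "a  b", "projects": ""}, fake_discovery))
```

Passing the discovery step as a callable is only a convenience for this sketch; it lets the fallback be exercised without a GitLab token.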
@@ -0,0 +1,21 @@
import requests

def get_group_list(**kwargs):
    headers = kwargs["authenticator"].get_auth_header()
Member

This function is only used by the Source class. Consider moving it into the class, or at least into the source.py file; a separate file doesn't seem to help here.

Contributor Author

Moved to an internal class method on the source


    ids = []

    r = requests.get(f'https://{kwargs["api_url"]}/api/v4/groups?page=1&per_page=50', headers=headers)
    results = r.json()
    items = map(lambda i: i['full_path'].replace('/', '%2f'), results)
    ids.extend(items)

    while 'X-Next-Page' in r.headers and r.headers['X-Next-Page'] != '':
        next_page = r.headers['X-Next-Page']
        per_page = r.headers['X-Per-Page']
        r = requests.get(f'https://{kwargs["api_url"]}/api/v4/groups?page={next_page}&per_page={per_page}', headers=headers)
        results = r.json()
        items = map(lambda i: i['full_path'].replace('/', '%2f'), results)
        ids.extend(items)

    return ids
Member

Suggested change
-    headers = kwargs["authenticator"].get_auth_header()
-    ids = []
-    r = requests.get(f'https://{kwargs["api_url"]}/api/v4/groups?page=1&per_page=50', headers=headers)
-    results = r.json()
-    items = map(lambda i: i['full_path'].replace('/', '%2f'), results)
-    ids.extend(items)
-    while 'X-Next-Page' in r.headers and r.headers['X-Next-Page'] != '':
-        next_page = r.headers['X-Next-Page']
-        per_page = r.headers['X-Per-Page']
-        r = requests.get(f'https://{kwargs["api_url"]}/api/v4/groups?page={next_page}&per_page={per_page}', headers=headers)
-        results = r.json()
-        items = map(lambda i: i['full_path'].replace('/', '%2f'), results)
-        ids.extend(items)
-    return ids
+    headers = kwargs["authenticator"].get_auth_header()
+    ids = []
+    has_next = True
+    # First request params
+    per_page = 50
+    next_page = 1
+    while has_next:
+        r = requests.get(f'https://{kwargs["api_url"]}/api/v4/groups?page={next_page}&per_page={per_page}', headers=headers)
+        next_page = r.headers.get('X-Next-Page')
+        per_page = r.headers.get('X-Per-Page')
+        results = r.json()
+        items = map(lambda i: i['full_path'].replace('/', '%2f'), results)
+        ids.extend(items)
+        has_next = 'X-Next-Page' in r.headers and r.headers['X-Next-Page'] != ''
+    return ids

Contributor Author

Refactored. This was admittedly messy :)
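
The single-loop pagination discussed in this thread can also be sketched without any network access. This is an illustrative sketch under assumptions, not the refactored code that was merged: `collect_group_paths` and `fetch_page` are hypothetical names, and in the connector the fetch step would be the `requests.get` call against `/api/v4/groups`. The sketch also assumes the headers mapping is keyed exactly as `X-Next-Page`.

```python
from typing import Callable, List, Mapping, Tuple
from urllib.parse import quote

# A page fetcher takes (page, per_page) and returns (decoded JSON list, response headers).
PageFetcher = Callable[[int, int], Tuple[List[dict], Mapping[str, str]]]


def collect_group_paths(fetch_page: PageFetcher, per_page: int = 50) -> List[str]:
    """Follow X-Next-Page headers and return URL-encoded group full paths."""
    paths: List[str] = []
    page = 1
    while True:
        groups, headers = fetch_page(page, per_page)
        # quote(..., safe="") encodes "/" as "%2F"; the PR code writes lowercase "%2f".
        paths.extend(quote(g["full_path"], safe="") for g in groups)
        next_page = headers.get("X-Next-Page", "")
        if not next_page:
            # A missing or empty header means this was the last page.
            return paths
        page = int(next_page)


if __name__ == "__main__":
    # Fake two-page response set used in place of a real GitLab instance.
    fake_pages = {
        1: ([{"full_path": "acme"}, {"full_path": "acme/backend"}], {"X-Next-Page": "2"}),
        2: ([{"full_path": "acme/frontend"}], {"X-Next-Page": ""}),
    }
    print(collect_group_paths(lambda page, per_page: fake_pages[page]))
```

Injecting the fetcher keeps the loop testable on its own; percent-encoding is case-insensitive in URLs, so `%2F` and `%2f` address the same group path.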

15 changes: 9 additions & 6 deletions docs/integrations/sources/gitlab.md
@@ -53,16 +53,19 @@ Log into Gitlab and then generate a [personal access token](https://docs.gitlab.

Your token should have the `read_api` scope, which grants read access to the API, including all groups and projects, the container registry, and the package registry.

+**Note:** You can specify either Group IDs or Project IDs in the source configuration. If both fields are blank, the connector will retrieve a list of all the groups that are accessible to the configured token and ingest as normal.
+
## Additional information

GitLab source is working with GitLab API v4. It can also work with self-hosted GitLab API v4.

## Changelog

-| Version | Date | Pull Request | Subject |
-| :--- | :--- | :--- | :--- |
-| 0.1.3 | 2021-12-21 | [8991](https://github.com/airbytehq/airbyte/pull/8991) | Update connector fields title/description |
-| 0.1.2 | 2021-10-18 | [7108](https://github.com/airbytehq/airbyte/pull/7108) | Allow all domains to be used as `api_url` |
-| 0.1.1 | 2021-10-12 | [6932](https://github.com/airbytehq/airbyte/pull/6932) | Fix pattern field in spec file, remove unused fields from config files, use cache from CDK |
-| 0.1.0 | 2021-07-06 | [4174](https://github.com/airbytehq/airbyte/pull/4174) | Initial Release |
+| Version | Date | Pull Request | Subject |
+|:--------|:-----------|:---------------------------------------------------------| :--- |
+| 0.1.4 | 2022-03-15 | [11140](https://github.com/airbytehq/airbyte/pull/11140) | Ingest All Accessible Groups |
+| 0.1.3 | 2021-12-21 | [8991](https://github.com/airbytehq/airbyte/pull/8991) | Update connector fields title/description |
+| 0.1.2 | 2021-10-18 | [7108](https://github.com/airbytehq/airbyte/pull/7108) | Allow all domains to be used as `api_url` |
+| 0.1.1 | 2021-10-12 | [6932](https://github.com/airbytehq/airbyte/pull/6932) | Fix pattern field in spec file, remove unused fields from config files, use cache from CDK |
+| 0.1.0 | 2021-07-06 | [4174](https://github.com/airbytehq/airbyte/pull/4174) | Initial Release |