airbyte-ci/connectors/metadata_service/orchestrator/README.md
+67 -25 (67 additions & 25 deletions)
@@ -1,14 +1,15 @@
1
1
# Connector Orchestrator
2
-
This is the Orchestrator for Airbyte metadata built on Dagster.
3
2
3
+
This is the Orchestrator for Airbyte metadata built on Dagster.
4
4
5
5
# Setup
6
6
7
7
## Prerequisites
8
8
9
9
#### Poetry
10
10
11
-
Before you can start working on this project, you will need to have Poetry installed on your system. Please follow the instructions below to install Poetry:
11
+
Before you can start working on this project, you will need to have Poetry installed on your system.
12
+
Please follow the instructions below to install Poetry:
12
13
13
14
1. Open your terminal or command prompt.
14
15
2. Install Poetry using the recommended installation method:
@@ -23,125 +24,165 @@ Alternatively, you can use `pip` to install Poetry:
23
24
pip install --user poetry
24
25
```
25
26
26
-
3. After the installation is complete, close and reopen your terminal to ensure the newly installed `poetry` command is available in your system's PATH.
27
+
3. After the installation is complete, close and reopen your terminal to ensure the newly installed
28
+
`poetry` command is available in your system's PATH.
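One way to confirm that step 3 worked is to check that the shell can resolve `poetry`. The helper below is a small convenience sketch and not part of the project; the function name `check_poetry` is hypothetical.

```bash
# Hypothetical helper: verify that `poetry` can be resolved by the current shell.
check_poetry() {
  if command -v poetry >/dev/null 2>&1; then
    echo "poetry found"
  else
    echo "poetry not found on PATH; reopen your terminal or check the installation" >&2
    return 1
  fi
}
```

After defining it, `check_poetry && poetry --version` prints the installed version when the command is available.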
27
29
28
-
For more detailed instructions and alternative installation methods, please refer to the official Poetry documentation: https://python-poetry.org/docs/#installation
30
+
For more detailed instructions and alternative installation methods, please refer to the official
Poetry documentation: https://python-poetry.org/docs/#installation
Once Poetry is installed, you can use it to manage the project's dependencies and virtual environment. To get started, navigate to the project's root directory in your terminal and follow these steps:
33
-
35
+
Once Poetry is installed, you can use it to manage the project's dependencies and virtual
36
+
environment. To get started, navigate to the project's root directory in your terminal and follow
37
+
these steps:
34
38
35
39
## Installation
40
+
36
41
```bash
37
42
poetry install
38
43
cp .env.template .env
39
44
```
40
45
41
46
## Create a GCP Service Account and Dev Bucket
47
+
42
48
Developing against the orchestrator requires a development bucket in GCP.
43
49
44
50
The orchestrator will use this bucket to:
51
+
45
52
- store important output files (e.g. reports).
46
53
- watch for changes to the `registry` directory in the bucket.
47
54
48
55
However, all temporary files will be stored in a local directory.
49
56
50
57
To create a development bucket:
58
+
51
59
1. Create a GCP Service Account with the following permissions:
52
-
- Storage Admin
53
-
- Storage Object Admin
54
-
- Storage Object Creator
55
-
- Storage Object Viewer
60
+
- Storage Admin
61
+
- Storage Object Admin
62
+
- Storage Object Creator
63
+
- Storage Object Viewer
56
64
2. Create a PUBLIC GCS bucket
57
65
3. Add the service account as a member of the bucket with the following permissions:
58
-
- Storage Admin
59
-
- Storage Object Admin
60
-
- Storage Object Creator
61
-
- Storage Object Viewer
66
+
67
+
- Storage Admin
68
+
- Storage Object Admin
69
+
- Storage Object Creator
70
+
- Storage Object Viewer
62
71
63
72
4. Add the following environment variables to your `.env` file:
64
-
- `METADATA_BUCKET`
65
-
- `GCS_CREDENTIALS`
73
+
- `METADATA_BUCKET`
74
+
- `GCS_CREDENTIALS`
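Steps 1 to 3 above can also be done from the command line. The following is a rough sketch using the `gcloud` CLI, not the project's prescribed procedure. The function name `setup_dev_bucket` and all argument values are hypothetical, and it assumes that a "PUBLIC" bucket means granting `allUsers` the Storage Object Viewer role.

```bash
# Sketch of steps 1-3 with the gcloud CLI. All argument values are placeholders.
# Usage: setup_dev_bucket <project-id> <service-account-name> <bucket-name>
setup_dev_bucket() {
  local project="$1"
  local sa_name="$2"
  local bucket="$3"
  local sa_member="serviceAccount:${sa_name}@${project}.iam.gserviceaccount.com"
  # Role IDs for Storage Admin, Storage Object Admin, Storage Object Creator, Storage Object Viewer.
  local roles="roles/storage.admin roles/storage.objectAdmin roles/storage.objectCreator roles/storage.objectViewer"
  local role

  # Step 1: create the service account and grant it the storage roles on the project.
  gcloud iam service-accounts create "$sa_name" --project="$project" || return 1
  for role in $roles; do
    gcloud projects add-iam-policy-binding "$project" \
      --member="$sa_member" --role="$role" || return 1
  done

  # Step 2: create the bucket and make its objects publicly readable.
  # Assumption: "PUBLIC" means allUsers can view objects.
  gcloud storage buckets create "gs://${bucket}" --project="$project" || return 1
  gcloud storage buckets add-iam-policy-binding "gs://${bucket}" \
    --member="allUsers" --role="roles/storage.objectViewer" || return 1

  # Step 3: add the service account as a bucket member with the same roles.
  for role in $roles; do
    gcloud storage buckets add-iam-policy-binding "gs://${bucket}" \
      --member="$sa_member" --role="$role" || return 1
  done
}
```

A call might look like `setup_dev_bucket my-project orchestrator-dev my-dev-bucket`, with your own project, account, and bucket names substituted.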
66
75
67
76
Note that `GCS_CREDENTIALS` should be the raw JSON string of the service account credentials.
68
77
69
78
Here is an example of how to import the service account credentials into your environment:
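A minimal sketch of one way to do this, assuming the service account key has been downloaded as a JSON file. The function name `load_gcs_credentials` and the key file path are hypothetical.

```bash
# Hypothetical helper: export the contents of a service account key file
# as the GCS_CREDENTIALS environment variable (raw JSON string).
# Usage: load_gcs_credentials /path/to/service-account-key.json
load_gcs_credentials() {
  local key_file="$1"
  if [ ! -r "$key_file" ]; then
    echo "cannot read key file: $key_file" >&2
    return 1
  fi
  GCS_CREDENTIALS="$(cat "$key_file")"
  export GCS_CREDENTIALS
}
```

Because the variable is exported in the current shell, run the function in the same session that later starts the orchestrator.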
The orchestrator (built using Dagster) is responsible for orchestrating the various metadata processes.
86
+
The orchestrator (built using Dagster) is responsible for orchestrating the various metadata
87
+
processes.
88
+
89
+
Dagster has a number of concepts that are important to understand before working on the
90
+
orchestrator.
77
91
78
-
Dagster has a number of concepts that are important to understand before working on the orchestrator.
79
92
1. Assets
80
93
2. Resources
81
94
3. Schedules
82
95
4. Sensors
83
96
5. Ops
84
97
85
-
Refer to the [Dagster documentation](https://docs.dagster.io/concepts) for more information on these concepts.
98
+
Refer to the [Dagster documentation](https://docs.dagster.io/concepts) for more information on these
99
+
concepts.
86
100
87
101
### Starting the Dagster Daemons
102
+
88
103
Start the orchestrator with the following command:
104
+
89
105
```bash
90
106
poetry run dagster dev
91
107
```
92
108
93
109
Then you can access the Dagster UI at http://localhost:3000
94
110
95
-
Note that it's important to use `dagster dev` instead of `dagit` because `dagster dev` starts additional services that are required for the orchestrator to run, namely the sensor service.
111
+
Note that it's important to use `dagster dev` instead of `dagit` because `dagster dev` starts
112
+
additional services that are required for the orchestrator to run, namely the sensor service.
96
113
97
114
### Materializing Assets with the UI
98
-
When you navigate to the orchestrator in the UI, you will see a list of assets that are available to be materialized.
115
+
116
+
When you navigate to the orchestrator in the UI, you will see a list of assets that are available to
117
+
be materialized.
99
118
100
119
From here you have the following options:
120
+
101
121
1. Materialize all assets
102
122
2. Select a subset of assets to materialize
103
123
3. Enable a sensor to automatically materialize assets
104
124
105
125
### Materializing Assets without the UI
106
126
107
-
In some cases you may want to run the orchestrator without the UI. To learn more about Dagster's CLI commands, see the [Dagster CLI documentation](https://docs.dagster.io/_apidocs/cli).
127
+
In some cases you may want to run the orchestrator without the UI. To learn more about Dagster's CLI
128
+
commands, see the [Dagster CLI documentation](https://docs.dagster.io/_apidocs/cli).
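As a rough illustration, a single asset can be materialized from the terminal with `dagster asset materialize`. The wrapper below is a sketch under assumptions: the function name `materialize_assets` is hypothetical, and it assumes the Dagster definitions live in a module named `orchestrator` (set `DAGSTER_MODULE` if your layout differs).

```bash
# Hypothetical wrapper around `dagster asset materialize`.
# Assumption: the Dagster definitions are importable as the module `orchestrator`.
# Usage: materialize_assets <asset-selection>
materialize_assets() {
  local selection="$1"
  local module="${DAGSTER_MODULE:-orchestrator}"
  if [ -z "$selection" ]; then
    echo "usage: materialize_assets <asset-selection>" >&2
    return 1
  fi
  # Run inside the Poetry environment so the project's dependencies are available.
  poetry run dagster asset materialize --select "$selection" -m "$module"
}
```

For example, `materialize_assets my_asset` would materialize an asset named `my_asset`, a placeholder for one of the asset keys listed in the UI.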
108
129
109
130
## Running Tests
131
+
110
132
```bash
111
133
poetry run pytest
112
134
```
113
135
136
+
## Deploying to Dagster Automatically
137
+
138
+
GitHub Actions is used to automatically deploy the orchestrator to Dagster Cloud