---
description: Start triggering Airbyte jobs with Dagster in minutes
---

# Using the Dagster Integration

Airbyte is an official integration in the Dagster project. The Airbyte Integration allows you to trigger synchronization jobs in Airbyte, and this tutorial will walk through configuring your Dagster Ops to do so.

The Airbyte Op documentation for the Dagster project can be found [here](https://docs.dagster.io/_apidocs/libraries/dagster-airbyte).

## 1. Set up the tools

First, make sure you have Docker installed. We'll be using the `docker-compose` command, so your install should contain `docker-compose`.

### Start Airbyte

If this is your first time using Airbyte, we suggest going through our [Basic Tutorial](https://github.com/airbytehq/airbyte/tree/e378d40236b6a34e1c1cb481c8952735ec687d88/docs/quickstart/getting-started.md). This tutorial will use the Connection set up in the basic tutorial.

For the purposes of this tutorial, set your Connection's **sync frequency** to **manual**. Dagster will be responsible for manually triggering the Airbyte job.

### Install Dagster

If you don't have Dagster installed, we recommend following this [guide](https://docs.dagster.io/getting-started) to set it up.

## 2. Create the Dagster Op to trigger your Airbyte job

### Creating a simple Dagster job to run an Airbyte Sync Job

Create a new folder called `airbyte_dagster` and, inside it, create a file called `airbyte_dagster.py` with the following content:

```python
from dagster import job
from dagster_airbyte import airbyte_resource, airbyte_sync_op

# Point the Airbyte resource at your instance using environment variables.
my_airbyte_resource = airbyte_resource.configured(
    {
        "host": {"env": "AIRBYTE_HOST"},
        "port": {"env": "AIRBYTE_PORT"},
    }
)

# Configure an op that triggers a sync for one specific Airbyte Connection.
sync_foobar = airbyte_sync_op.configured(
    {"connection_id": "your-connection-uuid"}, name="sync_foobar"
)

# The job makes the resource available to the op under the "airbyte" key.
@job(resource_defs={"airbyte": my_airbyte_resource})
def my_simple_airbyte_job():
    sync_foobar()
```

The Airbyte Dagster Resource accepts the following parameters (see the configuration sketch after this list):

* `host`: The host URL of your Airbyte instance.
* `port`: The port value you have selected for your Airbyte instance.
* `use_https`: Whether your Airbyte server uses a secure HTTPS connection.
* `request_max_retries`: The maximum number of times requests to the Airbyte API should be retried before failing.
* `request_retry_delay`: Time in seconds to wait between each request retry.

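For example, a more fully configured resource could look like the sketch below. This is illustrative only: the HTTPS and retry values shown are placeholder assumptions, not recommendations, and should be adjusted for your own deployment.

```python
from dagster_airbyte import airbyte_resource

# A sketch of a resource using the optional parameters above.
# The specific values are placeholders, not recommendations.
my_airbyte_resource = airbyte_resource.configured(
    {
        "host": {"env": "AIRBYTE_HOST"},
        "port": {"env": "AIRBYTE_PORT"},
        "use_https": False,           # set to True if Airbyte is served over HTTPS
        "request_max_retries": 3,     # retry failed API requests up to 3 times
        "request_retry_delay": 0.25,  # wait 0.25 seconds between retries
    }
)
```
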
The Airbyte Dagster Op accepts the following parameters (see the configuration sketch after this list):

* `connection_id`: The UUID of the Connection you want to trigger.
* `poll_interval`: The time in seconds to wait between successive polls of the sync status.
* `poll_timeout`: The maximum time in seconds to wait before the operation times out.

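Similarly, the op's polling behavior can be tuned when it is configured. The sketch below is illustrative only; the connection UUID is a placeholder and the poll values are assumptions you would pick for your own connection.

```python
from dagster_airbyte import airbyte_sync_op

# A sketch of tuning the op's polling behavior; values are placeholders.
sync_foobar = airbyte_sync_op.configured(
    {
        "connection_id": "your-connection-uuid",  # the Connection UUID to trigger
        "poll_interval": 10,    # check the sync status every 10 seconds
        "poll_timeout": 3600,   # fail if the sync has not finished after an hour
    },
    name="sync_foobar",
)
```
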
Run `dagster job execute -f airbyte_dagster.py` and Dagster will trigger the Airbyte job.

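If you would rather trigger the job from Python, for example while testing locally, one option is Dagster's `execute_in_process()`. The sketch below assumes a hypothetical helper script that sits next to `airbyte_dagster.py`, and that `AIRBYTE_HOST` and `AIRBYTE_PORT` are set in your environment.

```python
# run_sync.py — a hypothetical helper script, assumed to live next to airbyte_dagster.py,
# with AIRBYTE_HOST and AIRBYTE_PORT set in the environment.
from airbyte_dagster import my_simple_airbyte_job

result = my_simple_airbyte_job.execute_in_process()
print("Sync succeeded:", result.success)
```
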
## That's it!

Don't be fooled by our simple example of only one Dagster job. Airbyte is a powerful data integration platform supporting many sources and destinations. The Airbyte Dagster Integration means Airbyte can now be easily used with the Dagster ecosystem - give it a shot!

We love to hear any questions or feedback on our [Slack](https://slack.airbyte.io/). We're still in alpha, so if you see any rough edges or want to request a connector, feel free to create an issue on our [GitHub](https://github.com/airbytehq/airbyte) or thumbs up an existing issue.
