Document InfluxDB I/O procedures #75

Merged
merged 10 commits into from
May 12, 2024
2 changes: 1 addition & 1 deletion docs/integrate/bi/powerbi-desktop.rst
@@ -192,6 +192,6 @@ The pie chart will be updated automatically, and will produce the following:
.. _World Economic Outlook survey: https://www.imf.org/en/Publications/WEO
.. _Power BI Desktop: https://powerbi.microsoft.com/en-us/desktop/
.. _PostgreSQL ODBC driver: https://odbc.postgresql.org/
.. _downloads section: https://www.postgresql.org/ftp/odbc/versions/msi/
.. _downloads section: https://www.postgresql.org/ftp/odbc/versions.old/msi/
.. _raw data: https://www.imf.org/en/Publications/WEO/weo-database/2017/April/download-entire-database
.. _preprocessed archive: https://cratedb.com/wp-content/uploads/2018/11/copy_from_population_data.zip
6 changes: 4 additions & 2 deletions docs/integrate/etl/index.md
@@ -62,17 +62,18 @@ Tutorials and resources about configuring the managed variants, Astro and CrateD
- [Tutorial: Replicating data to CrateDB with Debezium and Kafka]
- [Webinar: How to replicate data from other databases to CrateDB with Debezium and Kafka]

## InfluxDB

- {ref}`integrate-influxdb`

## Kestra

- [Setting up data pipelines with CrateDB and Kestra]


## MongoDB

- {ref}`integrate-mongodb`


## MySQL

- {ref}`integrate-mysql`
@@ -136,6 +137,7 @@ A demo project which uses SSIS and ODBC to read and write data from CrateDB:
```{toctree}
:hidden:

influxdb
mongodb
mysql
```
89 changes: 89 additions & 0 deletions docs/integrate/etl/influxdb.md
@@ -0,0 +1,89 @@
(integrate-influxdb)=
(import-influxdb)=
# Import data from InfluxDB

In this quick tutorial, we use the InfluxDB I/O subsystem of CrateDB Toolkit
to import data from InfluxDB into CrateDB.

(integrate-influxdb-quickstart)=
## Quickstart

There are multiple ways to get and use this tool. To avoid unnecessary
installations, we will use Docker to run the services.
**Docker is needed for this:**

:::{code} console
docker run --rm --network=host ghcr.io/daq-tools/influxio \
influxio copy \
"http://example:token@localhost:8086/testdrive/demo" \
"crate://crate@localhost:4200/testdrive/demo"
:::
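
The two URLs in the command above pack several settings into one string. The following is a minimal sketch of how they can be composed, assuming the segments map to organization, token, bucket, and measurement on the InfluxDB side, and to user, schema, and table on the CrateDB side. That mapping is inferred from the setup values and the verification query used later in this document. The variable names and the `influxio_copy` helper are hypothetical.

```shell
# Values as used elsewhere in this document (assumed mapping, see above).
INFLUXDB_ORG="example"
INFLUXDB_TOKEN="token"
INFLUXDB_BUCKET="testdrive"
INFLUXDB_MEASUREMENT="demo"
CRATEDB_USER="crate"
CRATEDB_SCHEMA="testdrive"
CRATEDB_TABLE="demo"

# Source: http://<org>:<token>@<host>:<port>/<bucket>/<measurement>
SOURCE_URL="http://${INFLUXDB_ORG}:${INFLUXDB_TOKEN}@localhost:8086/${INFLUXDB_BUCKET}/${INFLUXDB_MEASUREMENT}"
# Target: crate://<user>@<host>:<port>/<schema>/<table>
TARGET_URL="crate://${CRATEDB_USER}@localhost:4200/${CRATEDB_SCHEMA}/${CRATEDB_TABLE}"

# Hypothetical helper: runs the same copy command with the composed URLs.
influxio_copy() {
  docker run --rm --network=host ghcr.io/daq-tools/influxio \
    influxio copy "$SOURCE_URL" "$TARGET_URL"
}

# Print the composed URLs for inspection before copying anything.
echo "$SOURCE_URL"
echo "$TARGET_URL"
```

Calling `influxio_copy` afterwards would start the transfer. Keeping the parts in variables makes it easier to point the same command at a different bucket or table.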

(setup-influxdb)=
### InfluxDB setup

The following is an example configuration of InfluxDB, together with some
sample data, should you need it. **These steps require a running instance
of InfluxDB.**

Initial InfluxDB configuration:

:::{code} console
docker run --rm -it --publish=8086:8086 --name influxdb \
  --env=DOCKER_INFLUXDB_INIT_MODE=setup \
  --env=DOCKER_INFLUXDB_INIT_USERNAME=user1 \
  --env=DOCKER_INFLUXDB_INIT_PASSWORD=secret1234 \
  --env=DOCKER_INFLUXDB_INIT_ORG=example \
  --env=DOCKER_INFLUXDB_INIT_BUCKET=testdrive \
  --env=DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=token \
  --volume="$PWD/var/lib/influxdb2:/var/lib/influxdb2" \
  influxdb:latest
:::

Write sample data to InfluxDB:

:::{code} console
docker exec influxdb influx write --bucket=testdrive --org=example --precision=s --token=token "demo,region=amazonas temperature=27.4,humidity=92.3,windspeed=4.5 1588363200"
docker exec influxdb influx write --bucket=testdrive --org=example --precision=s --token=token "demo,region=amazonas temperature=28.2,humidity=88.7,windspeed=4.7 1588549600"
docker exec influxdb influx write --bucket=testdrive --org=example --precision=s --token=token "demo,region=amazonas temperature=27.9,humidity=91.6,windspeed=3.2 1588736000"
docker exec influxdb influx write --bucket=testdrive --org=example --precision=s --token=token "demo,region=amazonas temperature=29.1,humidity=88.1,windspeed=2.4 1588922400"
docker exec influxdb influx write --bucket=testdrive --org=example --precision=s --token=token "demo,region=amazonas temperature=28.6,humidity=93.4,windspeed=2.9 1589108800"
:::
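
The five commands above differ only in the record they write. As a sketch, the same records can be kept in one variable and written in a loop. The `records` variable and the `write_samples` function are hypothetical names; the `influx write` flags are the ones used above.

```shell
# Sample records in InfluxDB line protocol, copied from the commands above:
# <measurement>,<tag> <field>=<value>,... <timestamp in seconds>
records='demo,region=amazonas temperature=27.4,humidity=92.3,windspeed=4.5 1588363200
demo,region=amazonas temperature=28.2,humidity=88.7,windspeed=4.7 1588549600
demo,region=amazonas temperature=27.9,humidity=91.6,windspeed=3.2 1588736000
demo,region=amazonas temperature=29.1,humidity=88.1,windspeed=2.4 1588922400
demo,region=amazonas temperature=28.6,humidity=93.4,windspeed=2.9 1589108800'

# Hypothetical helper: writes each record with the same flags as above.
write_samples() {
  printf '%s\n' "$records" | while IFS= read -r record; do
    # Redirect stdin so the command cannot consume the remaining records.
    docker exec influxdb influx write --bucket=testdrive --org=example \
      --precision=s --token=token "$record" </dev/null
  done
}
```

Calling `write_samples` requires the `influxdb` container from the previous step to be running.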

(export-data)=
### Export

Now you can export the data from InfluxDB into CrateDB. **This step requires
a running instance of CrateDB.**

First, create these aliases so the next part is a bit easier:

:::{code} console
alias crash="docker run --rm -it ghcr.io/crate-workbench/cratedb-toolkit:latest crash"
alias ctk="docker run --rm -it ghcr.io/crate-workbench/cratedb-toolkit:latest ctk"
:::

:::{code} console
export CRATEDB_SQLALCHEMY_URL=crate://crate@localhost:4200/testdrive/demo
ctk load table influxdb2://example:token@localhost:8086/testdrive/demo
:::
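
Because the aliases run `ctk` and `crash` inside containers, a variable exported in the calling shell is not visible there unless it is forwarded, and `localhost` inside a container does not refer to the Docker host unless host networking is used. If the commands above cannot reach CrateDB, the following sketch may help. It assumes Docker on Linux, and the function names `ctk_host` and `crash_host` are hypothetical, not part of CrateDB Toolkit.

```shell
TOOLKIT_IMAGE="ghcr.io/crate-workbench/cratedb-toolkit:latest"

# Hypothetical wrapper around the Toolkit image for running ctk.
ctk_host() {
  # --network=host shares the host network, so localhost:4200 is reachable.
  # --env with a bare name forwards the value exported in the calling shell.
  docker run --rm -it --network=host \
    --env CRATEDB_SQLALCHEMY_URL \
    "$TOOLKIT_IMAGE" ctk "$@"
}

# Hypothetical wrapper around the Toolkit image for running crash.
crash_host() {
  docker run --rm -it --network=host "$TOOLKIT_IMAGE" crash "$@"
}

# Usage, mirroring the commands in this section:
#   export CRATEDB_SQLALCHEMY_URL=crate://crate@localhost:4200/testdrive/demo
#   ctk_host load table influxdb2://example:token@localhost:8086/testdrive/demo
#   crash_host --command "SELECT * FROM testdrive.demo;"
```

Shell functions are used instead of aliases here so that the wrappers also work in non-interactive scripts.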

Then verify that the data has arrived in your CrateDB cluster:

:::{code} console
crash --command "SELECT * FROM testdrive.demo;"
:::
Comment on lines +74 to +76
@amotl (Member), May 7, 2024:

I think the OCI image of Toolkit also includes crash out of the box.
See also #75 (comment).


## More information

There are many more ways to apply the I/O subsystem of CrateDB Toolkit as
pipeline elements in your daily data operations routines. Please visit the
[CrateDB Toolkit I/O Documentation] to learn more about what's possible.

The InfluxDB I/O subsystem is based on the [influxio] package. Please also
check its documentation to learn about more of its capabilities, supporting
you when working with InfluxDB.

[influxio]: https://influxio.readthedocs.io/
[CrateDB Toolkit I/O Documentation]: https://cratedb-toolkit.readthedocs.io/io/influxdb/loader.html