Deployment

Docker Compose is used to orchestrate the frontend, API, Celery workers, databases, and other services that make up the Discourse Analysis Tool Suite (DATS).

Quickstart

0. Requirements

  • Machine with NVIDIA GPU
  • Docker with NVIDIA Container Toolkit

1. Clone the repository

git clone https://github.com/uhh-lt/dats.git

2. Run setup scripts

./bin/setup-envs.sh --project_name dats --port_prefix 101
./bin/setup-folders.sh

3. Start docker containers

docker compose -f compose.ollama.yml up -d
docker compose -f compose.ray.yml up -d
docker compose -f compose.yml -f compose.production.yml up --wait
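
To verify that all services came up healthy, you can check the container status:

docker compose -f compose.yml -f compose.production.yml ps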

4. Open DATS

Open https://localhost:10100/ in your browser. The port is derived from the --port_prefix chosen in step 2 (prefix 101 → port 10100).

Updating deployed instances

First, locate the DATS directory on the machine and navigate to the docker directory. Then, get the newest code from git:

git switch main
git pull

Update DATS

1. Stop all containers

docker compose -f compose.yml -f compose.production.yml down

2. Update the /docker/.env file

You have to update the /docker/.env file manually. Compare it with the .env.example file to find all differences, then use nano to edit the .env file. Most likely, you need to update the DATS_BACKEND_DOCKER_VERSION and DATS_FRONTEND_DOCKER_VERSION variables to the newest version.

git diff --no-index .env.example .env
nano .env
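
After editing, the updated lines might look like this (the version numbers are illustrative; use the actual release you are updating to):

DATS_BACKEND_DOCKER_VERSION=1.2.3
DATS_FRONTEND_DOCKER_VERSION=1.2.3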

3. Pull the newest docker containers

docker compose -f compose.yml -f compose.production.yml pull

4. Start all containers

docker compose -f compose.yml -f compose.production.yml up --wait

Now, DATS is updated to the new version. Note that you may also need to update the Ray and Ollama containers!

Update Ray

Ray only needs to run once per machine. It should always be up-to-date!

1. Stop the ray container

docker compose -f compose.ray.yml down

2. Update the /docker/.env file

You have to manually set the DATS_RAY_DOCKER_VERSION environment variable to the newest version, for example with nano:

nano .env
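
For example, the relevant line might then read (the version number is illustrative):

DATS_RAY_DOCKER_VERSION=1.2.3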

3. Pull the new docker container

docker compose -f compose.ray.yml pull

4. Start Ray

docker compose -f compose.ray.yml up --wait

Now, Ray is updated to the new version. Note that Ray only needs to run once per machine!

Update Ollama

Ollama only needs to run once per machine. It should always be up-to-date! However, Ollama is not developed by the DATS team, and its version number does not match the DATS version. Sometimes, even if we deploy a new DATS version, the Ollama version remains unchanged.

1. Stop the Ollama container

docker compose -f compose.ollama.yml down

2. Pull the new docker container

docker compose -f compose.ollama.yml pull

3. Start Ollama

docker compose -f compose.ollama.yml up --wait

Now, Ollama is updated to the new version. Note that Ollama only needs to run once per machine!

Folder structure

The script ./bin/setup-folders.sh creates multiple folders:

  • /backend_repo - User data
  • /models_cache - Cached ML models
  • /ollama_cache - Cached LLMs
  • /spacy_models - Cached spacy models
  • /numba_cache - Cached Numba compilation artifacts

Configuration

There are two main files to configure DATS in production mode:

  • /docker/.env
  • /backend/src/configs/production.yaml

The .env file overrides frequently changing variables of the production.yaml config.

It is strongly recommended to change the following configs in .env (a sketch follows after this list):

  • SYSTEM_USER_EMAIL
  • SYSTEM_USER_PASSWORD
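
A minimal sketch of these two entries (the values are placeholders, not defaults):

SYSTEM_USER_EMAIL=admin@example.org
SYSTEM_USER_PASSWORD=<choose-a-strong-unique-password>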

Some additional configuration can be found in the following files. However, we do not expect these to be changed:

  • /docker/compose.yml
  • /docker/compose.production.yml
  • /docker/elasticsearch.yml - Special Elasticsearch configuration
  • /docker/nginx.conf - Special Frontend / NGINX configuration
  • /backend/src/app/preprocessing/ray_model_worker/config_gpu.yaml - Configure ML models

Backups

We provide several scripts to automatically create backups of all databases and uploaded user data. This is the recommended backup process:

1. Stop backend and frontend

Ensure that the backup process cannot be interrupted by users.

docker compose -f compose.yml -f compose.production.yml stop dats-frontend dats-backend-api

2. Create backups

./bin/backup-postgres.sh
./bin/backup-repo.sh
./bin/backup-elasticsearch.sh
./bin/backup-weaviate.sh

3. Restart containers

docker compose -f compose.yml -f compose.production.yml up --wait
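
If you want to run backups unattended (e.g. from cron), the three steps can be wrapped in a single script. A minimal sketch, assuming the repository is checked out at /opt/dats (adjust DATS_DIR to your installation), with the compose files in docker/ and the backup scripts in bin/ as described above:

#!/usr/bin/env bash
# Sketch of an unattended backup run. DATS_DIR is an assumed install
# location; adjust it to where the repository lives on your machine.
set -euo pipefail
DATS_DIR=/opt/dats

# Stop frontend and backend API so users cannot interrupt the backup.
cd "$DATS_DIR/docker"
docker compose -f compose.yml -f compose.production.yml stop dats-frontend dats-backend-api

# Create backups of all databases and uploaded user data.
cd "$DATS_DIR"
./bin/backup-postgres.sh
./bin/backup-repo.sh
./bin/backup-elasticsearch.sh
./bin/backup-weaviate.sh

# Restart the stopped containers.
cd "$DATS_DIR/docker"
docker compose -f compose.yml -f compose.production.yml up --wait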

SSO

DATS supports SSO using OAuth2. We tested it with Authentik as the Identity Provider / Single Sign-On service. We include a compose.authentik.yml to start an Authentik instance, but you can use any service that supports OAuth2/OpenID. This section explains the setup using Authentik.

1. Configure Authentik

  • First, a new application has to be created in Authentik.
  • Use dats as the name and slug, then choose OAuth2/OpenID Provider as the Provider Type.
  • Note the Client ID and Client secret. You will need them in the next step.
  • It is important to leave the private key field empty, as Authlib does not currently support token decryption.
  • Do not specify any groups. DATS does not support roles or groups.

Next, we need the OpenID configuration (metadata) URL. In Authentik, it can be found under Applications/Provider/dats.

2. Configure DATS

  1. Navigate to the docker directory and open the .env file
  2. Fill in the corresponding variables: OIDC_CLIENT_ID, OIDC_CLIENT_SECRET, and OIDC_SERVER_METADATA_URL. Also, set OIDC_ENABLED=True. An example is sketched below.
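
The result might look like this (all values are placeholders; copy the real Client ID, secret, and metadata URL from the Authentik provider page):

OIDC_ENABLED=True
OIDC_CLIENT_ID=<client-id-from-authentik>
OIDC_CLIENT_SECRET=<client-secret-from-authentik>
OIDC_SERVER_METADATA_URL=https://authentik.example.org/application/o/dats/.well-known/openid-configuration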

Monitoring

DATS is a complex application that consists of various Docker containers managed with Docker Compose. A monitoring system that watches the containers' state and health is important for running DATS reliably without relying on users to report outages. We use Uptime Kuma, a simple, self-hosted, UI-focused monitoring tool. It is open-source and runs as another Docker container. Kuma uses MariaDB to store its data.

1. Configure Uptime Kuma

Kuma is configured like every other Docker container in DATS via the /docker/.env file. Modify the corresponding variables: KUMA_*, MARIA_*, and DOCKER_GROUP_ID. A sketch follows below.
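
For example, the Docker group ID (presumably needed so Kuma can read the Docker socket to watch containers) can be looked up like this, and the result entered into .env (the values below are illustrative; the full list of KUMA_* and MARIA_* variables is in .env.example):

getent group docker | cut -d: -f3

DOCKER_GROUP_ID=999
KUMA_EXPOSED=3001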

2. Start Uptime Kuma

docker compose -f compose.kuma.yml up --wait

3. Set up monitors

Now, the monitoring has to be set up manually:

  • Open http://localhost:<KUMA_EXPOSED> in your browser
  • Set up the monitors for the DATS containers

More info can be found in Kuma's Documentation.