Admin Guide
Docker Compose is used to orchestrate the frontend, API, Celery workers, databases, and other services that make up the Discourse Analysis Tool Suite (DATS).
0. Requirements
- Machine with NVIDIA GPU
- Docker with NVIDIA Container Toolkit
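Before cloning the repository, it can help to confirm that Docker can actually reach the GPU. The following is a minimal sketch, not part of the DATS scripts: the function name check_gpu_docker is hypothetical, and the ubuntu image is only an assumption (the NVIDIA Container Toolkit normally makes nvidia-smi available inside any container started with --gpus).

```shell
# Sketch: check that Docker is installed and that containers can see the GPU.
# check_gpu_docker is a hypothetical helper, not a script shipped with DATS.
check_gpu_docker() {
  # Fail early if the docker CLI is not available.
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found" >&2
    return 1
  fi
  # Run nvidia-smi in a throwaway container. This is expected to succeed
  # only when the NVIDIA Container Toolkit is installed and configured.
  docker run --rm --gpus all ubuntu nvidia-smi
}
```

Calling check_gpu_docker on a correctly configured machine should print the familiar nvidia-smi table; a non-zero exit status suggests that Docker or the toolkit still needs attention.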
1. Clone the repository
git clone https://github.com/uhh-lt/dats.git
2. Run setup scripts
./bin/setup-envs.sh --project_name dats --port_prefix 101
./bin/setup-folders.sh
3. Start docker containers
docker compose -f compose.ollama.yml up -d
docker compose -f compose.ray.yml up -d
docker compose -f compose.yml -f compose.production.yml up --wait
4. Open DATS
Open https://localhost:10100/ in your browser
To update DATS, first locate the DATS directory on the machine and navigate to the docker directory. Then, get the newest code from git:
git switch main
git pull
1. Stop all containers
docker compose -f compose.yml -f compose.production.yml down
2. Update the /docker/.env file
You have to update the /docker/.env file manually. Compare it with the .env.example file to find all differences, then use nano to change the .env file. Most likely, you need to update the DATS_BACKEND_DOCKER_VERSION and DATS_FRONTEND_DOCKER_VERSION variables to the newest version.
git diff --no-index .env.example .env
nano .env
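When the diff is long, a list of variable names that exist in .env.example but not yet in .env can be easier to read. The sketch below is one possible way to produce it; the helper names env_keys and missing_env_keys are hypothetical and not part of the DATS repository, and the script assumes simple KEY=VALUE lines.

```shell
# Print the variable names defined in an .env-style file (KEY=VALUE lines only;
# comments and blank lines are ignored).
env_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | cut -d= -f1 | sort -u
}

# Print every variable name that is defined in $1 but not in $2.
missing_env_keys() {
  keys_tmp="$(mktemp)"
  env_keys "$2" > "$keys_tmp"
  # -v inverts the match, -x matches whole lines, -F treats patterns as
  # fixed strings, -f reads the patterns (the known keys) from a file.
  env_keys "$1" | grep -vxF -f "$keys_tmp" || true
  rm -f "$keys_tmp"
}
```

Run from the docker directory as missing_env_keys .env.example .env; an empty output means every variable from the example file is already present in .env. Changed default values are not detected, so the git diff above remains the more complete check.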
3. Pull the newest docker containers
docker compose -f compose.yml -f compose.production.yml pull
4. Start all containers
docker compose -f compose.yml -f compose.production.yml up --wait
Now, DATS is updated to the new version. Note that you may also need to update the Ray and Ollama containers!
Only one Ray instance needs to run per machine. It should always be up to date!
1. Stop the ray container
docker compose -f compose.ray.yml down
2. Update the /docker/.env file
You have to manually set the DATS_RAY_DOCKER_VERSION environment variable to the newest version, for example with nano:
nano .env
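If you prefer not to edit the file interactively, the same change can be scripted. The following is a minimal sketch under the assumption that .env contains plain KEY=VALUE lines; set_env_var is a hypothetical helper, not a DATS script, and it does not handle values that contain the characters | or &.

```shell
# Set KEY to VALUE in an .env-style file: replace the existing line if the
# key is present, otherwise append a new line.
# Usage: set_env_var FILE KEY VALUE
set_env_var() {
  env_file="$1"; env_key="$2"; env_value="$3"
  if grep -q "^${env_key}=" "$env_file"; then
    # Write to a temporary file and move it into place, which avoids the
    # differences between GNU and BSD "sed -i".
    sed "s|^${env_key}=.*|${env_key}=${env_value}|" "$env_file" > "${env_file}.tmp" \
      && mv "${env_file}.tmp" "$env_file"
  else
    printf '%s=%s\n' "$env_key" "$env_value" >> "$env_file"
  fi
}
```

For example, set_env_var .env DATS_RAY_DOCKER_VERSION <new-version> would update the Ray version, where <new-version> stands for the release you want to deploy.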
3. Pull the new docker container
docker compose -f compose.ray.yml pull
4. Start Ray
docker compose -f compose.ray.yml up --wait
Now, Ray is updated to the new version. Note that only one Ray instance needs to run per machine!
Only one Ollama instance needs to run per machine. It should always be up to date! However, Ollama is not developed by the DATS team, and its version number does not match the DATS version. Sometimes the Ollama version remains unchanged even when we deploy a new DATS version.
1. Stop the Ollama container
docker compose -f compose.ollama.yml down
2. Pull the new docker container
docker compose -f compose.ollama.yml pull
3. Start Ollama
docker compose -f compose.ollama.yml up --wait
Now, Ollama is updated to the new version. Note that only one Ollama instance needs to run per machine!
The script ./bin/setup-folders.sh creates multiple folders:
- /backend_repo - User data
- /models_cache - Cached ML models
- /ollama_cache - Cached LLMs
- /spacy_models - Cached spaCy models
- /numba_cache
There are two main files to configure DATS in production mode:
/docker/.env
/backend/src/configs/production.yaml
The .env file overrides frequently changing variables of the production.yaml config.
It is strongly recommended to change the following configs in .env:
SYSTEM_USER_EMAIL
SYSTEM_USER_PASSWORD
You can find some additional configurations here. However, we do not expect these to be changed:
- /docker/compose.yml
- /docker/compose.production.yml
- /docker/elasticsearch.yml - Special Elasticsearch configuration
- /docker/nginx.conf - Special Frontend / NGINX configuration
- /backend/src/app/preprocessing/ray_model_worker/config_gpu.yaml - Configure ML models
We provide several scripts to automatically create backups of all databases and uploaded user data. This is the recommended backup process:
1. Stop backend and frontend
Ensure that the backup process cannot be interrupted by users.
docker compose -f compose.yml -f compose.production.yml stop dats-frontend dats-backend-api
2. Create backups
./bin/backup-postgres.sh
./bin/backup-repo.sh
./bin/backup-elasticsearch.sh
./bin/backup-weaviate.sh
3. Restart containers
docker compose -f compose.yml -f compose.production.yml up --wait
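The three steps above can be combined into one function so that the containers are restarted even if one of the backup scripts fails. This is only a sketch of one possible wrapper, not a script shipped with DATS: the name backup_dats is hypothetical, and it assumes you call it from the directory where the compose files and the ./bin scripts are reachable, as in the commands above.

```shell
# Sketch: stop the user-facing containers, run all backup scripts, and
# always start the stack again. Returns non-zero if any step failed.
backup_dats() {
  compose="docker compose -f compose.yml -f compose.production.yml"
  # Stop frontend and API so that users cannot change data during the backup.
  $compose stop dats-frontend dats-backend-api || return 1
  backup_status=0
  # Run every backup script; remember a failure but continue with the rest.
  for script in backup-postgres.sh backup-repo.sh backup-elasticsearch.sh backup-weaviate.sh; do
    "./bin/$script" || { echo "backup failed: $script" >&2; backup_status=1; }
  done
  # Bring the stack back up regardless of the backup result.
  $compose up --wait || backup_status=1
  return "$backup_status"
}
```

A wrapper like this could be called from a cron job; checking its exit status would then show whether every backup completed.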
DATS supports single sign-on (SSO) using OAuth2. We tested it with Authentik as the identity provider.
We include a compose.authentik.yml to start an Authentik instance, but you can use any service that supports OAuth2/OpenID.
This section explains the setup using Authentik.
1. Configure Authentik
- First, a new application has to be created in Authentik.
- Use dats as the name and slug, then choose OAuth2/OpenID Provider as the Provider Type.
- Note the Client ID and Client secret. You will need them in the next step.
- It is important to leave the private key field empty, as Authlib does not currently support token decryption.
- Do not specify any groups. DATS does not support roles or groups.
Next, we need to find the metadata/OpenID-config-URL. In Authentik, this can be found under Applications/Provider/dats.
2. Configure DATS
- Navigate to the docker directory and open the .env file
- Fill the corresponding variables: OIDC_CLIENT_ID, OIDC_CLIENT_SECRET, and OIDC_SERVER_METADATA_URL. Also, set OIDC_ENABLED=True.
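Before restarting DATS, a quick check of the four OIDC settings can catch an empty or forgotten variable. The sketch below assumes plain KEY=VALUE lines in .env; check_oidc_env is a hypothetical helper and not part of DATS, and it only tests that values are present, not that they are correct.

```shell
# Report OIDC variables that are missing or empty in an .env-style file.
# Returns 0 if everything is set, 1 otherwise.
check_oidc_env() {
  oidc_file="$1"
  oidc_problem=0
  for key in OIDC_CLIENT_ID OIDC_CLIENT_SECRET OIDC_SERVER_METADATA_URL; do
    # A variable counts as set when at least one character follows the '='.
    if ! grep -Eq "^${key}=.+" "$oidc_file"; then
      echo "missing or empty: $key"
      oidc_problem=1
    fi
  done
  # SSO is only active when OIDC_ENABLED is exactly True.
  if ! grep -Eq '^OIDC_ENABLED=True$' "$oidc_file"; then
    echo "OIDC_ENABLED is not set to True"
    oidc_problem=1
  fi
  return "$oidc_problem"
}
```

Running check_oidc_env .env in the docker directory prints one line per problem and stays silent when all four settings are present.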
DATS is a complex application that consists of various Docker containers that are managed with Docker Compose. A monitoring system that watches the Docker containers' state and health is important for running applications reliably in Docker and not relying on users to report outages. We use Uptime Kuma, a simple, self-hosted, UI-focused monitoring software. It is open-source and can be run as another Docker container. Kuma uses MariaDB to store its data.
1. Configure Uptime Kuma
Kuma is configured like every other Docker container in DATS, using the /docker/.env file.
Modify the corresponding variables: KUMA_*, MARIA_*, and DOCKER_GROUP_ID.
2. Start Uptime Kuma
docker compose -f compose.kuma.yml up --wait
3. Configuration
Now, it is necessary to set up the monitoring manually.
- Open http://localhost:<KUMA_EXPOSED> in your browser
- Set up monitoring
More info can be found in Kuma's Documentation.