Commit ec137e4

committed
add docker-compose.yml and update readme
1 parent bcdde78 commit ec137e4

File tree

2 files changed: +56 −30 lines

README.md

+35 −30
@@ -20,7 +20,7 @@ You can run all locally, you can use openai for chat and voice, you can mix betw
 - **Easy configuration through environment variables**: Customize the application to suit your preferences with minimal effort.
 - **WebUI or Terminal usage**: Run with your preferred method, but the UI is recommended as you can change characters, model providers, speech providers, voices, etc.
 - **HUGE selection of built in Characters**: Talk with the funniest and most insane AI characters!
-- **Docker Support**: Build your own image with or without Nvidia CUDA. Can run on CPU only.
+- **Docker Support**: Prebuilt image from Docker Hub, or build your own image with or without Nvidia CUDA. Can run on CPU only.
 
 
 https://github.com/user-attachments/assets/5581bd53-422b-4a92-9b97-7ee4ea37e09b
@@ -83,7 +83,7 @@ https://github.com/user-attachments/assets/5581bd53-422b-4a92-9b97-7ee4ea37e09b
 Local XTTS can run on CPU but is slow. If using a CUDA-enabled GPU you might also need cuDNN (https://developer.nvidia.com/cudnn); make sure `C:\Program Files\NVIDIA\CUDNN\v9.5\bin\12.6` (or whatever version you downloaded)
 is in the system PATH. You can also disable cuDNN by setting `"cudnn_enable": false` in `"C:\Users\Your-Name\AppData\Local\tts\tts_models--multilingual--multi-dataset--xtts_v2\config.json"` if you don't want to use it.
 
-### Optional - XTTS for local voices
+### XTTS for local voices - Optional
 
 If you are only using speech with Openai or Elevenlabs then you don't need this. To use the local TTS, the first time you select XTTS the model will download and be ready to use; if your device is CUDA enabled it will load into CUDA, otherwise it will fall back to CPU.
 
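The cuDNN toggle described in this hunk is a single key in the XTTS model's `config.json`. As a minimal illustration only (surrounding keys omitted; the exact structure may differ by model version), the relevant fragment looks like:

```json
{
  "cudnn_enable": false
}
```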
@@ -98,7 +98,7 @@ uvicorn app.main:app --host 0.0.0.0 --port 8000
 Find on http://localhost:8000/
 
 
-CLI Only
+CLI Only - `also works in docker`
 
 ```bash
 python cli.py
@@ -110,13 +110,24 @@ python cli.py
 
 ### 📄 Prerequisites
 1. Docker installed on your system.
-2. A `.env` file in the same folder as the `docker run` command. This file should contain all necessary environment variables for the application.
+2. A `.env` file in the same folder as the command. This file should contain all necessary environment variables for the application.
 
 ---
 
-### Build without Nvidia Cuda - 5 GB image - Recommended
+## 🐳 Docker compose
+
+Uncomment the lines needed in the docker-compose.yml depending on your host system; the image pulls the latest from Docker Hub.
+
+```bash
+docker-compose up -d
+```
+
+### Run or Build without Nvidia Cuda - CPU mode
 
-Cuda and cudnn not supported. No gpu is used and slower when using local xtts and faster-whisper. If only using Openai or Elevenlabs for voices is perfect.
+Cuda and cudnn are not supported; no GPU is used, so local xtts and faster-whisper are slower. If you only use Openai or Elevenlabs for voices this is perfect. It still works with xtts, just slower. On first run it downloads the faster-whisper model (1 GB) for transcription.
+
+```bash
+docker pull bigsk1/voice-chat-ai:latest
+```
+
+or
 
 ```bash
 docker build -t voice-chat-ai -f Dockerfile.cpu .
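For reference, a `.env` sketch is shown here. Only `XAI_BASE_URL` appears verbatim later in this README and `TTS_PROVIDER` is named as an example in the docker-compose.yml comments; the other key names are assumptions about typical provider credentials, so check the repository for the authoritative variable names:

```bash
# Illustrative .env sketch - key names other than XAI_BASE_URL and
# TTS_PROVIDER are assumptions, not confirmed by the project.
TTS_PROVIDER=openai
OPENAI_API_KEY=your-key-here
ELEVENLABS_API_KEY=your-key-here
XAI_BASE_URL=https://api.x.ai/v1
```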
@@ -129,9 +140,9 @@ docker run -d
 -e "PULSE_SERVER=/mnt/wslg/PulseServer"
 -v \\wsl$\Ubuntu\mnt\wslg:/mnt/wslg/
 --env-file .env
---name voice-chat-ai-cpu
+--name voice-chat-ai
 -p 8000:8000
-voice-chat-ai:latest
+voice-chat-ai:latest # for the prebuilt image use bigsk1/voice-chat-ai:latest
 ```
 
 In WSL2 Ubuntu
@@ -141,12 +152,12 @@ docker run -d \
 -e "PULSE_SERVER=/mnt/wslg/PulseServer" \
 -v /mnt/wslg/:/mnt/wslg/ \
 --env-file .env \
---name voice-chat-ai-cpu \
+--name voice-chat-ai \
 -p 8000:8000 \
-voice-chat-ai:latest
+voice-chat-ai:latest # for the prebuilt image use bigsk1/voice-chat-ai:latest
 ```
 
-### Docker - prebuilt large image 40 GB - Experimental!
+### Nvidia Cuda large image!
 
 > This is for running with an Nvidia GPU and you have the Nvidia toolkit and cudnn installed.
@@ -204,7 +215,7 @@ docker stop voice-chat-ai-cuda
 docker rm voice-chat-ai-cuda
 ```
 
-## Build it yourself with Nvidia Cuda:
+### Build it yourself using Nvidia Cuda:
 
 ```bash
 docker build -t voice-chat-ai:cuda .
@@ -304,7 +315,7 @@ XAI_BASE_URL=https://api.x.ai/v1
 
 ### ElevenLabs
 
-Add names and voice id's in `elevenlabs_voices.json` - in the webui you can select them in dropdown menu.
+Add names and voice IDs in `elevenlabs_voices.json` - in the webui you can select them in the dropdown menu. Add your own as shown below.
 
 ```json
 {
@@ -321,22 +332,6 @@ Add names and voice id's in `elevenlabs_voices.json` - in the webui you can sele
     "id": "b0uJ9TWzQss61d8f2OWX",
     "name": "Female - Lucy - Sweet and sensual"
   },
-  {
-    "id": "2pF3fJJNnWg1nDwUW5CW",
-    "name": "Male - Eustis - Fast speaking"
-  },
-  {
-    "id": "pgCnBQgKPGkIP8fJuita",
-    "name": "Male - Jarvis - Tony Stark AI"
-  },
-  {
-    "id": "kz8mB8WAwV9lZ0fuDqel",
-    "name": "Male - Nigel - Mysterious intriguing"
-  },
-  {
-    "id": "MMHtVLagjZxJ53v4Wj8o",
-    "name": "Male - Paddington - British narrator"
-  },
   {
     "id": "22FgtP4D63L7UXvnTmGf",
     "name": "Male - Wildebeest - Deep male voice"
@@ -441,7 +436,7 @@ Open a new terminal and run
 where cudnn_ops64_9.dll
 ```
 
-### Unanticipated host error
+### Unanticipated host error OSError 9999
 
 ```bash
 File "C:\Users\someguy\miniconda3\envs\voice-chat-ai\lib\site-packages\pyaudio\__init__.py", line 441, in __init__
@@ -451,6 +446,16 @@ OSError: [Errno -9999] Unanticipated host error
 
 Make sure ffmpeg is installed and added to PATH (on Windows terminal: winget install ffmpeg). Also make sure your microphone privacy settings on Windows are ok and that the microphone is set as the default device. I had this issue when using bluetooth Apple AirPods and this solved it.
 
+### OSError 9996
+
+```bash
+ALSA lib pulse.c:242:(pulse_connect) PulseAudio: Unable to connect: Connection refused
+Cannot connect to server socket err = No such file or directory
+OSError: [Errno -9996] Invalid input device (no default output device)
+```
+
+PulseAudio failure: the container's PulseAudio client can't connect to a server (Connection refused), meaning no host PulseAudio socket is accessible. If running in Docker, make sure your volume mapping to the audio device on your host is correct.
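To debug the PulseAudio mapping described in this hunk, a couple of diagnostic commands can help. These are a sketch: the socket path matches the volume mappings shown earlier in this README, but `pactl` being present inside the image is an assumption this README does not confirm:

```bash
# Check that the PulseAudio socket made it into the container
# (path matches the -v / volumes mappings shown above).
ls -l /mnt/wslg/PulseServer

# Assumption: if pactl is installed in the image, this should reach the server.
docker exec -it voice-chat-ai pactl info
```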
 ## Watch the Demos
 
 
docker-compose.yml

+21

@@ -0,0 +1,21 @@
+version: "3.8"
+
+services:
+  voice-chat-ai:
+    image: bigsk1/voice-chat-ai:latest
+    container_name: voice-chat-ai
+    environment:
+      - PULSE_SERVER=/mnt/wslg/PulseServer # Default: WSL2 PulseAudio server (Windows CMD or WSL2 Ubuntu)
+      # - PULSE_SERVER=unix:/tmp/pulse/native # Uncomment for native Ubuntu/Debian with PulseAudio
+    env_file:
+      - .env # Loads app config (e.g., TTS_PROVIDER, API keys)
+    volumes:
+      - \\wsl$\Ubuntu\mnt\wslg:/mnt/wslg/ # Default: WSL2 audio mount for Windows CMD with Docker Desktop
+      # - /mnt/wslg/:/mnt/wslg/ # Uncomment for WSL2 Ubuntu (running Docker inside WSL2 distro)
+      # - ~/.config/pulse/cookie:/root/.config/pulse/cookie:ro # Uncomment for native Ubuntu/Debian
+      # - /run/user/1000/pulse:/tmp/pulse:ro # Uncomment and adjust UID (e.g., 1000) for native Ubuntu/Debian
+    ports:
+      - "8000:8000" # Expose web UI or API port
+    restart: unless-stopped # Restart unless manually stopped
+    tty: true # Enable CLI interactivity (e.g., cli.py)
+    stdin_open: true # Keep STDIN open for interactive use
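Once the compose file above is in place, standard Docker Compose lifecycle commands can be used to verify the stack. These are generic Docker Compose commands, not project-specific:

```bash
docker-compose up -d                  # start in the background
docker-compose ps                     # the voice-chat-ai service should show "Up"
docker-compose logs -f voice-chat-ai  # follow application logs
docker-compose down                   # stop and remove the container
```

The `tty: true` and `stdin_open: true` settings in the file above are what make an interactive `docker-compose exec` session (e.g., for `cli.py`) possible.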
