README.md (+35 -30)
@@ -20,7 +20,7 @@ You can run all locally, you can use openai for chat and voice, you can mix betw
20
20
- **Easy configuration through environment variables**: Customize the application to suit your preferences with minimal effort.
21
21
- **WebUI or Terminal usage**: Run with your preferred method. The UI is recommended because you can change characters, model providers, speech providers, voices, etc.
22
22
- **HUGE selection of built in Characters**: Talk with the funniest and most insane AI characters!
23
-
-**Docker Support**: Build yor own image with or without nvidia cuda. Can run on CPU only.
23
+
- **Docker Support**: Use the prebuilt image from Docker Hub or build your own image, with or without NVIDIA CUDA. Can run on CPU only.
Local XTTS can run on CPU but is slow. If you are using a CUDA-enabled NVIDIA GPU, you might also need cuDNN (<https://developer.nvidia.com/cudnn>). In that case make sure `C:\Program Files\NVIDIA\CUDNN\v9.5\bin\12.6`
84
84
(or the equivalent path for whatever version you downloaded) is in the system PATH. If you don't want to use cuDNN, you can disable it by setting `"cudnn_enable": false` in `"C:\Users\Your-Name\AppData\Local\tts\tts_models--multilingual--multi-dataset--xtts_v2\config.json"`.
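
As a rough illustration of the config change described above, the sketch below flips the flag from a Bash shell (for example Git Bash or WSL on Windows). The function name `disable_cudnn` is hypothetical and not part of the project, and the sketch assumes the key already appears in the file as `"cudnn_enable": true`.

```bash
# Hypothetical helper (not part of voice-chat-ai): set "cudnn_enable" to false
# in an XTTS config.json. Assumes the key is already present in the file.
disable_cudnn() {
  config_path="$1"
  if [ ! -f "$config_path" ]; then
    echo "config not found: $config_path" >&2
    return 1
  fi
  tmp_file="$(mktemp)" || return 1
  # Write the edited copy to a temp file first so a failed sed never
  # truncates the real config.
  if sed 's/"cudnn_enable"[[:space:]]*:[[:space:]]*true/"cudnn_enable": false/' \
      "$config_path" > "$tmp_file"; then
    cat "$tmp_file" > "$config_path"
  fi
  rm -f "$tmp_file"
  # Succeed only if the flag now reads false.
  grep -q '"cudnn_enable": false' "$config_path"
}
```

Call it with the path to `config.json` as your shell sees it, for example `disable_cudnn "$CONFIG_PATH"`, where `CONFIG_PATH` is a variable you set yourself. Editing the file by hand works just as well.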
85
85
86
-
### Optional - XTTS for local voices
86
+
### XTTS for local voices - Optional
87
87
88
88
If you only use OpenAI or ElevenLabs for speech, you don't need this. The first time you select XTTS, the model downloads and is then ready to use. On a CUDA-enabled device it loads onto the GPU; otherwise it falls back to CPU.
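
To get a rough idea of which path applies on your machine, a sketch like the one below checks whether an NVIDIA driver is visible. The helper name `detect_device` is hypothetical, and it only tests that `nvidia-smi` exists and runs; the application's own device detection may behave differently.

```bash
# Hypothetical helper: guess whether a CUDA-capable NVIDIA driver is visible.
# It only checks that nvidia-smi is on PATH and exits successfully.
detect_device() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo "cuda"
  else
    echo "cpu"
  fi
}
```

Running `detect_device` prints `cuda` or `cpu`. A `cuda` result does not guarantee that XTTS will use the GPU, only that a driver responded.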
2. A `.env` file in the same folder as the `docker run`command. This file should contain all necessary environment variables for the application.
113
+
2. A `.env` file in the same folder as the command. This file should contain all necessary environment variables for the application.
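
Before starting the container, it can help to confirm that the `.env` file is where the command expects it. The following is a minimal sketch with a hypothetical helper name, `list_env_keys`. It prints only variable names, never values, and it ignores bare `KEY` lines without an `=`.

```bash
# Hypothetical pre-flight check before `docker run --env-file .env`:
# confirm the file exists and print the variable names it defines.
list_env_keys() {
  env_file="${1:-.env}"
  if [ ! -f "$env_file" ]; then
    echo "missing env file: $env_file" >&2
    return 1
  fi
  # Keep KEY=value lines, skip comments and blank lines, then drop the values.
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$env_file" | cut -d= -f1
}
```

Run `list_env_keys` in the folder where you launch the container and compare the printed names against the variables the application documents.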
114
114
115
115
---
116
116
117
-
### Build without Nvidia Cuda - 5 GB image - Recommended
117
+
## 🐳 Docker compose
118
+
Uncomment the lines you need in `docker-compose.yml` for your host system. The compose file pulls the latest image from Docker Hub.
119
+
```bash
120
+
docker-compose up -d
121
+
```
122
+
123
+
### Run or Build without Nvidia Cuda - CPU mode
118
124
119
-
Cuda and cudnn not supported. No gpu is used and slower when using local xtts and faster-whisper. If only using Openai or Elevenlabs for voices is perfect.
125
+
CUDA and cuDNN are not supported in this image. No GPU is used, so local XTTS and faster-whisper still work but run slower. It is a good fit if you only use OpenAI or ElevenLabs for voices. On first run it downloads the faster-whisper model (1 GB) for transcription.
126
+
127
+
```bash
128
+
docker pull bigsk1/voice-chat-ai:latest
129
+
```
130
+
or
120
131
121
132
```bash
122
133
docker build -t voice-chat-ai -f Dockerfile.cpu .
@@ -129,9 +140,9 @@ docker run -d
129
140
-e "PULSE_SERVER=/mnt/wslg/PulseServer"
130
141
-v \\wsl$\Ubuntu\mnt\wslg:/mnt/wslg/
131
142
--env-file .env
132
-
--name voice-chat-ai-cpu
143
+
--name voice-chat-ai
133
144
-p 8000:8000
134
-
voice-chat-ai:latest
145
+
voice-chat-ai:latest  # for the prebuilt image, use bigsk1/voice-chat-ai:latest
135
146
```
136
147
137
148
In WSL2 Ubuntu
@@ -141,12 +152,12 @@ docker run -d \
141
152
-e "PULSE_SERVER=/mnt/wslg/PulseServer" \
142
153
-v /mnt/wslg/:/mnt/wslg/ \
143
154
--env-file .env \
144
-
--name voice-chat-ai-cpu \
155
+
--name voice-chat-ai \
145
156
-p 8000:8000 \
146
-
voice-chat-ai:latest
157
+
voice-chat-ai:latest  # for the prebuilt image, use bigsk1/voice-chat-ai:latest
147
158
```
148
159
149
-
### Docker - prebuilt large image 40 GB - Experimental!
160
+
### Nvidia Cuda large image!
150
161
151
162
> This is for running with an NVIDIA GPU when you have the NVIDIA toolkit and cuDNN installed.
Make sure ffmpeg is installed and added to PATH (in Windows Terminal: `winget install ffmpeg`). Also make sure your microphone privacy settings on Windows allow access and that the microphone is set as the default device. I had this issue when using Bluetooth Apple AirPods, and this solved it.
453
448
449
+
### OSError 9996
450
+
451
+
```bash
452
+
ALSA lib pulse.c:242:(pulse_connect) PulseAudio: Unable to connect: Connection refused
453
+
Cannot connect to server socket err = No such file or directory
454
+
OSError: [Errno -9996] Invalid input device (no default output device)
455
+
```
456
+
457
+
PulseAudio Failure: The container’s PulseAudio client can’t connect to a server (Connection refused), meaning no host PulseAudio socket is accessible. If you are running Docker, make sure your volume mapping to the audio device on your host is correct.
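
One way to narrow this down is to check whether the socket path is actually a Unix socket. The sketch below uses a hypothetical helper, `check_pulse_socket`. It defaults to `$PULSE_SERVER` and then to the `/mnt/wslg/PulseServer` path from the `docker run` examples, which is an assumption about your setup.

```bash
# Hypothetical diagnostic: verify that the PulseAudio socket path exists and
# is a Unix socket. Defaults to $PULSE_SERVER, then to the WSLg path used in
# the docker run examples.
check_pulse_socket() {
  socket_path="${1:-${PULSE_SERVER:-/mnt/wslg/PulseServer}}"
  socket_path="${socket_path#unix:}"  # tolerate the unix:/path form
  if [ -S "$socket_path" ]; then
    echo "PulseAudio socket found: $socket_path"
  else
    echo "PulseAudio socket missing: $socket_path" >&2
    return 1
  fi
}
```

Run it on the host first. If it passes there but fails when pasted into a shell inside the container, the volume mapping is the likely cause.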