Commit 4effe26

build webui
1 parent 3ea068e commit 4effe26

12 files changed, +1104 -15 lines changed

.env.sample

+9 -5
@@ -1,13 +1,16 @@
 # Conditional API Usage: Depending on the value of MODEL_PROVIDER, that's what will be used when run.
 # You can mix and match; use local Ollama with OpenAI speech or use OpenAI model with local XTTS, etc.

-# Model Provider: openai or ollama
+# Model Provider: openai or ollama - once set if run webui can't change in ui until you stop server and restart
+# openai or ollama
 MODEL_PROVIDER=ollama

 # Character to use - Options: samantha, wizard, pirate, valleygirl, newscaster1920s, alien_scientist, cyberpunk, detective
 CHARACTER_NAME=wizard

-# Text-to-Speech Provider - Options: xtts (local uses the custom character .wav) or openai (uses OpenAI TTS voice)
+
+# Text-to-Speech Provider - Options: xtts (local uses the custom character .wav) or openai (uses OpenAI TTS voice) - once set if run webui can't change in ui until you stop server and restart
+# openai or xtts
 TTS_PROVIDER=xtts

 # OpenAI TTS Voice - When TTS_PROVIDER is set to openai above, it will use the chosen voice.
@@ -20,9 +23,6 @@ OPENAI_BASE_URL=https://api.openai.com/v1/chat/completions
 OPENAI_TTS_URL=https://api.openai.com/v1/audio/speech
 OLLAMA_BASE_URL=http://localhost:11434

-# OpenAI API Key for models and speech (replace with your actual API key)
-OPENAI_API_KEY=sk-proj-1111111111
-
 # Models to use - llama3 works well for local usage.
 # OPTIONAL: For screen analysis, if MODEL_PROVIDER is ollama, llava will be used by default.
 # Ensure you have llava downloaded with Ollama. If OpenAI is used, gpt-4o works well.
@@ -32,6 +32,10 @@ OLLAMA_MODEL=llama3
 # The voice speed for XTTS only (1.0 - 1.5, default is 1.1)
 XTTS_SPEED=1.2

+
+# OpenAI API Key for models and speech (replace with your actual API key)
+OPENAI_API_KEY=sk-proj-1111111
+
 # NOTES:
 # List of trigger phrases to have the model view your desktop (desktop, browser, images, etc.).
 # It will describe what it sees, and you can ask questions about it:
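
The new comments in `.env.sample` say the provider settings cannot be changed from the web UI once the server is running. One way that behavior could arise is if the settings are read a single time at startup. The sketch below illustrates that pattern only; `load_settings`, the default values, and the validation are hypothetical and are not taken from this commit.

```python
import os

# Allowed values, as listed in the .env.sample comments.
VALID_MODEL_PROVIDERS = {"openai", "ollama"}
VALID_TTS_PROVIDERS = {"openai", "xtts"}


def load_settings(env=None):
    """Read provider settings once (hypothetical helper, not from the commit).

    `env` is any mapping of variable names to strings; it defaults to the
    process environment. The fallback values below are assumptions.
    """
    env = os.environ if env is None else env
    model_provider = env.get("MODEL_PROVIDER", "ollama").strip().lower()
    tts_provider = env.get("TTS_PROVIDER", "xtts").strip().lower()

    if model_provider not in VALID_MODEL_PROVIDERS:
        raise ValueError(f"MODEL_PROVIDER must be one of {sorted(VALID_MODEL_PROVIDERS)}")
    if tts_provider not in VALID_TTS_PROVIDERS:
        raise ValueError(f"TTS_PROVIDER must be one of {sorted(VALID_TTS_PROVIDERS)}")

    return {
        "model_provider": model_provider,
        "tts_provider": tts_provider,
        "character_name": env.get("CHARACTER_NAME", "wizard"),
    }


# Read once at import time: later edits to the .env file are not seen
# until the process is restarted.
SETTINGS = load_settings({"MODEL_PROVIDER": "ollama", "TTS_PROVIDER": "xtts"})
```

Under this assumption, changing `MODEL_PROVIDER` or `TTS_PROVIDER` means stopping the server, editing the file, and starting it again.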

README.md

+22 -10
@@ -15,6 +15,7 @@ Voice Chat AI is a project that allows you to interact with different AI charact
 - **Analyzes user mood and adjusts AI responses accordingly**: Get personalized responses based on your mood.
 - **You can, just by speaking, have the AI analyze your screen and chat about it**: Seamlessly integrate visual context into your conversations.
 - **Easy configuration through environment variables**: Customize the application to suit your preferences with minimal effort.
+- **WebUI or Terminal usage**: Can be ran with either


 ## Installation
@@ -72,7 +73,7 @@ Voice Chat AI is a project that allows you to interact with different AI charact
 pip install -r cpu_requirements.txt
 ```

-Need to have Microsoft C++ Build Tools for TTS
+Need to have Microsoft C++ Build Tools on windows for TTS
 [Microsoft Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/)

 ### Download Checkpoints
@@ -166,8 +167,17 @@ XTTS_SPEED=1.2

 Run the application:

+Web UI
 ```bash
-python app.py
+uvicorn app.main:app --host 0.0.0.0 --port 8000
+```
+Find on http://localhost:8000/
+
+
+CLI Only
+
+```bash
+python cli.py
 ```

 ### Commands
@@ -211,28 +221,30 @@ You are a wise and ancient wizard who speaks with a mystical and enchanting tone
 }
 ```

-For XTTS find a .wav voice and add it to the wizard folder and name it as wizard.wav , the voice only needs to be 6 seconds long. Running the app will automaticly find the .wav when it has the characters name and use it.
+For XTTS find a .wav voice and add it to the wizard folder and name it as wizard.wav , the voice only needs to be 6 seconds long. Running the app will automatically find the .wav when it has the characters name and use it. If only using Openai Speech a .wav isn't needed


 ## Watch the Demos

-GPU - 100% local - ollama llama3, xtts-v2
+Webui - OpenAI and Ollama

-[![Watch the video](https://img.youtube.com/vi/WsWbYnITdCo/maxresdefault.jpg)](https://youtu.be/WsWbYnITdCo)
+[![Watch the video](https://img.youtube.com/vi/bgdQkzGltdk/maxresdefault.jpg)](https://youtu.be/bgdQkzGltdk)



-CPU Only mode
+CLI

-Alien conversation using openai gpt4o and openai speech for tts.
+GPU - 100% local - ollama llama3, xtts-v2

-[![Watch the video](https://img.youtube.com/vi/d5LbRLhWa5c/maxresdefault.jpg)](https://youtu.be/d5LbRLhWa5c)
+[![Watch the video](https://img.youtube.com/vi/WsWbYnITdCo/maxresdefault.jpg)](https://youtu.be/WsWbYnITdCo)


-Valley girl conversation using ollama llama3, openai tts

-[![Watch the video](https://img.youtube.com/vi/HSEFH0UnZEk/maxresdefault.jpg)](https://youtu.be/HSEFH0UnZEk)
+CPU Only mode CLI

+Alien conversation using openai gpt4o and openai speech for tts.
+
+[![Watch the video](https://img.youtube.com/vi/d5LbRLhWa5c/maxresdefault.jpg)](https://youtu.be/d5LbRLhWa5c)


 ## License
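
The README change in this hunk says the app finds a character's `.wav` automatically when the file carries the character's name, for example `wizard.wav` inside the `wizard` folder. A minimal sketch of such a lookup follows. The function name `find_character_wav` and the idea of a single parent directory holding one folder per character are assumptions for illustration, not code from this commit.

```python
from pathlib import Path
from typing import Optional


def find_character_wav(characters_dir: Path, character_name: str) -> Optional[Path]:
    """Return <characters_dir>/<name>/<name>.wav if it exists, else None.

    Hypothetical helper: the directory layout is assumed from the README
    wording ("add it to the wizard folder and name it as wizard.wav").
    """
    candidate = Path(characters_dir) / character_name / f"{character_name}.wav"
    return candidate if candidate.is_file() else None
```

A `None` result would only matter when `TTS_PROVIDER` is `xtts`; per the README, OpenAI speech does not need a `.wav`.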

app/__init__.py

Whitespace-only changes.
