Closed
Description
Pipecat splits LLM generations sent to TTS at punctuation to reduce latency. This creates problems when it feeds TTS a short utterance like "Ok.", which often causes the TTS to SCREAM randomly and, at best, gives it a completely different tonality. Short TTS messages give highly variable results and destroy immersion.
If the TTS service could optionally access the context, it could take the most recent message(s) it had generated and pass them in the previous_text parameter to enable far smoother TTS generations.
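To make the request concrete, here is a minimal sketch of the idea, not Pipecat's actual implementation. It assumes the TTS provider accepts a previous_text field alongside the text to synthesize. PreviousTextTracker, build_tts_request, and the max_chars budget are hypothetical names introduced only for illustration.

```python
from collections import deque


class PreviousTextTracker:
    """Hypothetical helper: remembers recently spoken chunks for TTS context."""

    def __init__(self, max_chars=200):
        # Assumed budget for how much prior text to send with each request.
        self.max_chars = max_chars
        self._chunks = deque()

    def previous_text(self):
        """Return the remembered chunks joined into one string."""
        return " ".join(self._chunks)

    def record(self, chunk):
        """Remember a chunk that was just sent to TTS."""
        chunk = chunk.strip()
        if not chunk:
            return
        self._chunks.append(chunk)
        # Drop the oldest chunks until we fit the budget; always keep the newest.
        while len(self._chunks) > 1 and len(self.previous_text()) > self.max_chars:
            self._chunks.popleft()

    def reset(self):
        """Forget everything, e.g. after an interruption or a new turn."""
        self._chunks.clear()


def build_tts_request(tracker, chunk):
    """Build a request payload (assumed shape) that carries prior context."""
    request = {"text": chunk}
    prev = tracker.previous_text()
    if prev:
        # Assumption: the provider reads prior context from "previous_text".
        request["previous_text"] = prev
    tracker.record(chunk)
    return request
```

With something like this, a short chunk such as "Ok." would be sent together with the sentence spoken just before it, rather than in isolation. Whether the tracker should be reset on interruptions or per turn is a design choice left open here.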