Pipecat chops up LLM generations sent to TTS by punctuation to reduce latency. This creates problems when it feeds TTS a short utterance like `Ok.`, which often causes the voice to SCREAM randomly or, at best, take on a completely different tonality. Short TTS messages give highly variable results and destroy immersion.
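For illustration, punctuation-based chunking of a streamed response looks roughly like this (a simplified sketch, not Pipecat's actual implementation; the function name is made up):

```python
import re

def chunk_by_punctuation(text: str) -> list[str]:
    """Split text at sentence-ending punctuation so each piece can be
    sent to TTS as soon as it completes. Keeps the delimiter with its
    sentence; drops empty leftovers."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

print(chunk_by_punctuation("Sure, I can help. Ok. Let's begin!"))
# → ['Sure, I can help.', 'Ok.', "Let's begin!"]
```

Note how `Ok.` comes out as its own chunk with no surrounding context, which is exactly the case that produces the erratic delivery.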
If the TTS service could optionally access the context, it could take the most recent message(s) it had generated and populate the `previous_text` parameter with them, enabling far smoother TTS generations.
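A minimal sketch of the idea: keep a running history of chunks already synthesized and pass the tail of it as `previous_text` on each request. Only the `text` and `previous_text` fields are from the ElevenLabs API; the class, method names, and context-size limit here are hypothetical:

```python
class ContextualTTSClient:
    """Illustrative wrapper that threads recent context into TTS requests."""

    def __init__(self, max_context_chars: int = 500):
        self._spoken: list[str] = []          # chunks already sent to TTS
        self._max_context_chars = max_context_chars

    def build_payload(self, chunk: str) -> dict:
        """Build a TTS request body carrying recent context as previous_text."""
        previous = " ".join(self._spoken)[-self._max_context_chars:]
        payload = {"text": chunk}
        if previous:
            payload["previous_text"] = previous
        self._spoken.append(chunk)
        return payload

client = ContextualTTSClient()
client.build_payload("The meeting is at three.")
print(client.build_payload("Ok."))
# → {'text': 'Ok.', 'previous_text': 'The meeting is at three.'}
```

With this, a bare `Ok.` arrives at the TTS endpoint accompanied by the sentence that preceded it, so the model has something to match its prosody against.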
I haven't had the TTS respond with a scream, but I have noticed that single words are over-emphasized. I think this is an issue with the model itself and is something the 11Labs team should improve.
If the first word of a response is short, then there is no contextual information to provide before it as part of `previous_text`. Providing other information from the context instead would just be a hack to override the over-emphasized response, and I don't think that's a good general solution for Pipecat.
I'm implementing `previous_text` as recommended by the 11Labs team: #1600.
I've also asked the 11Labs team for tips on how to make single-word inputs fit the sentence context.