Elevenlabs use previous_text to improve generation #1399

Closed
@danthegoodman1

Description

Pipecat splits LLM generations sent to TTS at punctuation boundaries to reduce latency. This creates problems when it feeds the TTS a short utterance like "Ok.", which often causes it to SCREAM randomly or, at best, produce a completely different tonality. Short TTS messages give highly variable results and destroy immersion.

If the TTS service could optionally access the conversation context, it could take the most recent message(s) it had generated and pass them in the previous_text parameter to enable far smoother TTS generations.
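A minimal sketch of the idea: keep a rolling window of recently synthesized chunks and attach it as previous_text on each request. The class name, the max_chars window size, and the builder structure are illustrative; only the text and previous_text fields mirror the ElevenLabs text-to-speech request body.

```python
from collections import deque


class ContextualTTSRequestBuilder:
    """Builds TTS request payloads that carry recently spoken text in
    previous_text, so short chunks like "Ok." keep a consistent tone.

    Hypothetical helper for illustration; not part of Pipecat.
    """

    def __init__(self, max_chars: int = 300):
        self.max_chars = max_chars
        self._history: deque[str] = deque()
        self._history_len = 0

    def build_payload(self, chunk: str) -> dict:
        payload = {"text": chunk}
        if self._history:
            # Give the TTS the preceding speech as prosody context.
            payload["previous_text"] = " ".join(self._history)
        # Record this chunk so the next request can reference it.
        self._history.append(chunk)
        self._history_len += len(chunk) + 1
        # Trim the oldest chunks once the rolling window grows too large.
        while self._history_len > self.max_chars and len(self._history) > 1:
            dropped = self._history.popleft()
            self._history_len -= len(dropped) + 1
        return payload
```

The first chunk of a turn goes out without previous_text; every later chunk, however short, carries the preceding speech, which is exactly the case where tonality currently breaks.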
