07 Apr 09:29

edwko

5695636

OuteTTS Lib v0.4

OuteTTS Lib v0.4 Release Notes

Interface Improvements

Consolidated all interface versions into a single interface.py file for centralized management
Implemented isolated model handling in separate version folders while maintaining core functionality for cross-compatibility
Added Interface Version 3 implementation to support OuteTTS v1.0 models

New Features

Smart text chunking for generating long audio clips from large text inputs
Added DAC interface code to handle OuteTTS 1.0 audio encoding and decoding
Added metadata for interface version compatibility in speaker files
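To illustrate the idea behind chunking long text before synthesis, the sketch below packs whole sentences into chunks under a character budget. It is not the library's actual algorithm; `chunk_text` and `max_chars` are hypothetical names chosen for this example.

```python
import re
from typing import List


def chunk_text(text: str, max_chars: int = 200) -> List[str]:
    """Split text at sentence boundaries and pack sentences into chunks.

    Hypothetical helper for illustration only, not part of outetts.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

A single sentence longer than `max_chars` is kept whole as its own chunk in this sketch; a real implementation would likely split it further.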

Transformers Backend Patch for OuteTTS 1.0

Implemented windowed repetition penalty processor (RepetitionPenaltyLogitsProcessorPatch) for improved text generation quality
- Applies penalties only to recent tokens (64-token window) rather than the full context
- Addresses key quality issues in speech synthesis applications
- Maintains backward compatibility with standard HuggingFace interfaces
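The windowing idea can be sketched in plain Python. This is a minimal illustration, not the code of RepetitionPenaltyLogitsProcessorPatch: the function name, the list-based logits, and the default `penalty` value are assumptions. The penalty rule itself (divide positive scores, multiply negative ones) follows the common repetition-penalty convention.

```python
from typing import List, Sequence


def windowed_repetition_penalty(
    logits: Sequence[float],
    generated_ids: Sequence[int],
    penalty: float = 1.1,   # assumed default, for illustration only
    window: int = 64,       # window size taken from the release notes
) -> List[float]:
    """Penalize only tokens that occur in the last `window` generated tokens."""
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    # Guard window == 0: a [-0:] slice would select the whole history.
    recent = set(generated_ids[-window:]) if window > 0 else set()
    out = list(logits)
    for token_id in recent:
        if 0 <= token_id < len(out):
            score = out[token_id]
            # Push the score toward "less likely" regardless of its sign.
            out[token_id] = score * penalty if score < 0 else score / penalty
    return out
```

Tokens that appeared earlier than the window are left untouched, so frequently recurring audio tokens are not suppressed over a long generation.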

Streamlined Usage

Simplified code usage with a more modular and compact implementation:

```
output = interface.generate(
    config=outetts.GenerationConfig(
        text="Hello, how are you doing?",
        generation_type=outetts.GenerationType.CHUNKED,
        speaker=speaker,
        sampler_config=outetts.SamplerConfig(
            temperature=0.4
            # Additional sampler parameters
        ),
    )
)
```

Automatic Configuration

Added support for automatic config and model loading for v1.0 models:

```
# Auto-configuration approach
interface = outetts.Interface(
    config=outetts.ModelConfig.auto_config(
        model=outetts.Models.VERSION_1_0_SIZE_1B,
        backend=outetts.Backend.LLAMACPP,
        quantization=outetts.LlamaCppQuantization.FP16
    )
)
```

Manual configuration remains available:

```
# Manual configuration approach
interface = outetts.Interface(
    config=outetts.ModelConfig(
        model_path="...",
        tokenizer_path="...",
        backend=outetts.Backend.LLAMACPP,
        interface_version=outetts.InterfaceVersion.V3
    )
)
```

Performance and Dependencies

Improved loading times by dynamically loading only required components
Removed unused dependencies (further optimizations pending, particularly for WavTokenizer implementation)
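One common way to load only the components that are actually needed is to defer imports until a backend is requested. The sketch below shows that pattern in general terms; the registry and `load_backend` are hypothetical, and standard-library modules stand in for real backends so the example runs anywhere. It does not describe how outetts itself is organized.

```python
import importlib
from functools import lru_cache
from types import ModuleType

# Hypothetical registry: backend name -> module path.
# Standard-library modules are used as stand-ins for heavy backends.
_BACKEND_MODULES = {
    "json_backend": "json",
    "csv_backend": "csv",
}


@lru_cache(maxsize=None)
def load_backend(name: str) -> ModuleType:
    """Import a backend module on first use and cache the result."""
    try:
        module_path = _BACKEND_MODULES[name]
    except KeyError:
        raise ValueError(f"Unknown backend: {name!r}") from None
    return importlib.import_module(module_path)
```

With this layout, importing the package stays cheap and the cost of a heavy dependency is paid only by users who select that backend.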

Documentation

Full usage documentation is available at:
🔗 interface_usage.md

17 Jan 15:20

edwko

0.3.2

ad5ba96

OuteTTS v0.3.2

Update 0.3

Implement v2 interface with simplified structure.
Split documentation for interface v1 and interface v2.
Add compatibility for OuteTTS-0.3 1B and 500M models.
Restructure codebase for better maintainability.

14 Dec 10:58

edwko

0.2.3

150693a

OuteTTS v0.2.3

Release Notes v0.2.3

Split WavTokenizer into encoder (82MB) and decoder (248MB) components
[WIP] Streaming support

30 Nov 14:55

edwko

0.2.1

c5a75f4

OuteTTS v0.2.1

Release Notes v0.2.1

New Features and Improvements:

Support for ExLlamaV2
- Integrated support for ExLlamaV2
- Pull request: #37
Whisper Integration for Speaker Generation
- Added Whisper-based transcription for creating speakers when no transcript is provided.
- Suggested in: #28
- If transcript is set to None, the audio is transcribed automatically using Whisper.
```
def create_speaker(
    self,
    audio_path: str,
    transcript: str = None,      # None triggers automatic Whisper transcription
    whisper_model: str = "turbo",
    whisper_device = None
):
    ...
```
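The fallback behavior described above can be sketched as a small helper. This is an illustration only: `resolve_transcript` and the injected `transcribe` callable are hypothetical and are not part of the outetts API. In practice the callable would wrap a Whisper model.

```python
from typing import Callable, Optional


def resolve_transcript(
    audio_path: str,
    transcript: Optional[str],
    transcribe: Callable[[str], str],
) -> str:
    """Return the given transcript, or transcribe the audio when it is None."""
    if transcript is not None:
        # A user-supplied transcript always wins; no model is invoked.
        return transcript
    text = transcribe(audio_path).strip()
    if not text:
        raise ValueError(f"Transcription of {audio_path!r} produced no text")
    return text
```

Passing the transcriber in as a function keeps the speech-recognition dependency optional and makes the fallback easy to test.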

25 Nov 12:08

edwko

0.2.0

c2d413b

OuteTTS v0.2.0 Release

OuteTTS v0.2.0 Release Notes

Major Changes

New Model Support: Added support for OuteTTS-0.2-500M model
Speaker Management: Introduced default speaker presets for each supported language
Breaking Changes:
- Speaker files from previous versions (<0.2.0) are not compatible
- Interface usage has been significantly revised (see README.md for new implementation)

New Features

Added voice cloning guidelines and interface usage recommendations in README.md
Implemented Gradio example playground for OuteTTS-0.2-500M
Multi-language alignment support
Enhanced speaker management:
- New methods: interface.print_default_speakers() and interface.load_default_speaker(name="male_1")
- Switched from pickle to JSON format for speaker saving
- Added speaker language information in saved files
Option to load WavTokenizer from custom path (resolves issue #24)
Multiple interface version initialization in a single function
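To show why JSON with a language field is convenient for speaker files, here is a minimal sketch of saving and loading such a file. The field names (`language`, `text`) and both helper functions are assumptions made for this example; the real speaker schema is defined by the library and may differ.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional


def save_speaker(speaker: Dict[str, Any], path: str) -> None:
    """Write a speaker profile as human-readable JSON (hypothetical schema)."""
    Path(path).write_text(
        json.dumps(speaker, ensure_ascii=False, indent=2), encoding="utf-8"
    )


def load_speaker(path: str, expected_language: Optional[str] = None) -> Dict[str, Any]:
    """Load a speaker profile and optionally check its language field."""
    speaker = json.loads(Path(path).read_text(encoding="utf-8"))
    language = speaker.get("language")
    if expected_language is not None and language != expected_language:
        raise ValueError(
            f"Speaker language {language!r} does not match {expected_language!r}"
        )
    return speaker
```

Unlike pickle, a JSON file can be inspected by hand and loading it does not execute arbitrary code.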

Improvements

Restructured library files for better organization
Implemented hash verification for WavTokenizer downloads (resolves issue #3)
Reworked interface for better usability
Made sounddevice optional with improved error handling for sound playback
Added data preparation examples for training
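Hash verification of a downloaded file can be sketched as follows. The notes do not say which hash algorithm the library uses, so SHA-256 is an assumption here, and `sha256_of_file` and `verify_download` are hypothetical helper names.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file without loading it all at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size blocks until read() returns an empty bytes object.
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_download(path: str, expected_sha256: str) -> None:
    """Raise ValueError if the file's digest differs from the expected one."""
    actual = sha256_of_file(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"Hash mismatch for {path}: expected {expected_sha256}, got {actual}"
        )
```

Reading in blocks keeps memory use constant, which matters for model files that run to hundreds of megabytes.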

Error Handling

Added validation for audio token detection
Improved error messages for long input text and early EOS cases
Enhanced overall library error handling and feedback

How to Upgrade

Update your library via pip:
```
pip install --upgrade outetts
```

Releases: edwko/OuteTTS