
Enhance TEN VAD Python and C interfaces with robustness, async support, and framework integration. #7


Open
wants to merge 5 commits into
base: main
from

Conversation

lincuan

@lincuan lincuan commented May 16, 2025

Overview

This PR enhances the TEN VAD framework's Python (ten_vad.py) and C header (ten_vad.h) interfaces to improve robustness, performance, usability, and integration with the broader TEN framework. The changes maintain backward compatibility while addressing limitations and adding features for real-time, cross-platform, and multimodal conversational AI applications. Files are renamed to ten_vad_enhanced.py and ten_vad_enhanced.h to reflect improvements.

Changes

Python Interface (ten_vad_enhanced.py)

  • Cross-Platform Support: Replaced hard-coded library paths with dynamic loading using platform module, supporting Linux, Windows, and macOS. Added environment variable (TEN_VAD_LIB_PATH) for custom paths.
  • Robust Error Handling: Replaced assert with exception handling and logging in create_and_init_handler and __del__ for production stability.
  • Enhanced Input Validation: Improved get_input_data to handle edge cases (empty arrays, non-NumPy arrays, non-contiguous memory) with clear error messages.
  • Memory Management: Added _audio_data_ref to retain audio data references, preventing garbage collection issues during C library calls.
  • Asynchronous Processing: Introduced process_async method using asyncio for low-latency, real-time applications.
  • Usability Improvements: Added comprehensive docstrings and type hints (using typing module) for better developer experience.
  • Dynamic Configuration: Added set_threshold method for runtime threshold adjustments.
  • Framework Integration: Added callback support in process and process_async for seamless integration with TEN framework components (e.g., DeepSeek, Gemini).
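
As an illustration of the cross-platform loading described above, here is a minimal sketch of resolving the library path. Only the TEN_VAD_LIB_PATH variable and the use of the platform module come from this PR; the function name `resolve_lib_path`, the `base_dir` parameter, and the library file names are assumptions made for the example and may not match ten_vad_enhanced.py.

```python
import os
import platform
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical file names; the real library names and layout are not stated in this PR.
_LIB_NAMES = {
    "Linux": "libten_vad.so",
    "Windows": "ten_vad.dll",
    "Darwin": "libten_vad.dylib",
}


def resolve_lib_path(
    base_dir: Path,
    env: Optional[Mapping[str, str]] = None,
    system: Optional[str] = None,
) -> Path:
    """Return the path of the shared library to load.

    TEN_VAD_LIB_PATH takes precedence when set; otherwise the file name
    is chosen from platform.system().
    """
    env = os.environ if env is None else env
    system = platform.system() if system is None else system

    override = env.get("TEN_VAD_LIB_PATH")
    if override:
        return Path(override)

    try:
        return base_dir / _LIB_NAMES[system]
    except KeyError:
        # Raise a clear error instead of failing later inside the loader.
        raise OSError(f"Unsupported platform: {system!r}") from None
```

The resolved path could then be handed to `ctypes.CDLL(str(path))`. The `env` and `system` parameters exist only so the lookup can be tested without changing the real environment.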

C Header Interface (ten_vad_enhanced.h)

  • Detailed Error Codes: Introduced ten_vad_error_t enum for specific error codes (e.g., TEN_VAD_ERROR_INVALID_PARAM, TEN_VAD_ERROR_OUT_OF_MEMORY).
  • Enhanced Documentation: Added detailed docstrings for all functions, specifying parameter constraints, error cases, and usage examples.
  • Dynamic Threshold Adjustment: Added ten_vad_set_threshold function to support runtime configuration, aligning with Python's set_threshold.
  • Callback Support: Introduced ten_vad_register_callback and ten_vad_callback_t for event-driven integration.
  • Structured Version Information: Added ten_vad_version_t and ten_vad_get_version_struct for programmatic version checking.
  • Handle Safety: Updated ten_vad_destroy documentation to clarify safe handling of NULL or repeated calls.
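
On the Python side, the detailed error codes could be surfaced as exceptions. The sketch below is an assumption about how that mapping might look: the PR names TEN_VAD_ERROR_INVALID_PARAM and TEN_VAD_ERROR_OUT_OF_MEMORY, but the numeric values, the `OK` member, and the names `TenVadError`, `TenVadException`, and `check` are all hypothetical.

```python
from enum import IntEnum


class TenVadError(IntEnum):
    """Hypothetical Python mirror of ten_vad_error_t; the values are assumed."""
    OK = 0
    INVALID_PARAM = -1
    OUT_OF_MEMORY = -2


class TenVadException(RuntimeError):
    """Raised when a C call returns a non-OK status code."""

    def __init__(self, code: int, func_name: str):
        try:
            label = TenVadError(code).name
        except ValueError:
            # Code not known to this mirror of the enum.
            label = "UNKNOWN"
        super().__init__(f"{func_name} failed: {label} ({code})")
        self.code = code


def check(code: int, func_name: str) -> None:
    """Raise TenVadException unless the status code signals success."""
    if code != TenVadError.OK:
        raise TenVadException(code, func_name)
```

A wrapper would call `check(status, "ten_vad_process")` after each C call, which keeps the status-to-exception logic in one place.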

Motivation

The original ten_vad.py and ten_vad.h had limitations:

  • Hard-coded paths and limited error reporting restricted cross-platform use and debugging.
  • Static configuration (e.g., fixed threshold) limited flexibility in real-time scenarios.
  • Lack of async and callback support hindered performance and integration in event-driven systems.
  • Sparse documentation reduced usability for developers.

These enhancements align with the TEN framework's goals of real-time, multimodal, and low-latency conversational AI, as highlighted in community discussions (e.g., X posts on VAD sensitivity and language support).

Testing

  • Python:
    • Validated dynamic library loading on Linux (x64) and simulated Windows/macOS environments.
    • Tested input validation with edge cases (empty arrays, wrong shapes, non-int16 data).
    • Verified process_async reduces latency in mock real-time scenarios.
    • Ensured callback integration works with sample TEN framework pipelines.
  • C Header:
    • Verified header compatibility with ten_vad_enhanced.py using mock C implementations.
    • Tested function signatures for correct parameter passing.
    • Validated documentation clarity with sample C and Python usage.
  • Language Support: Manual tests with Chinese and English audio inputs to confirm compatibility.
  • Pending: Full C implementation testing (requires updates to C source files in a follow-up PR).
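
The input-validation edge cases listed above (empty arrays, wrong shapes, non-int16 data, non-contiguous memory) can be sketched as follows. This is not the PR's get_input_data; `validate_frame` and its exact error messages are hypothetical, and the one-frame shape `(hop_size,)` is assumed.

```python
import numpy as np


def validate_frame(audio: object, hop_size: int) -> np.ndarray:
    """Return a C-contiguous int16 frame of length hop_size, or raise."""
    if not isinstance(audio, np.ndarray):
        raise TypeError(f"audio must be a numpy.ndarray, got {type(audio).__name__}")
    if audio.size == 0:
        raise ValueError("audio is empty")
    if audio.dtype != np.int16:
        raise TypeError(f"audio must have dtype int16, got {audio.dtype}")
    if audio.shape != (hop_size,):
        raise ValueError(f"audio must have shape ({hop_size},), got {audio.shape}")
    # Strided slices are not contiguous, so make the buffer C-contiguous
    # before it is passed to the C library.
    return np.ascontiguousarray(audio)
```

Checking type before shape lets a caller who passes a Python list get a type error rather than a confusing attribute failure.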

Additional Notes

  • Files renamed to ten_vad_enhanced.py and ten_vad_enhanced.h to avoid breaking existing integrations. They can replace the original files if preferred.
  • C implementation updates for new functions (ten_vad_set_threshold, ten_vad_register_callback, ten_vad_get_version_struct) are required in a follow-up PR.
  • Unit tests for both Python and C interfaces are recommended (to be added if needed).
  • Community feedback from GitHub issues and X posts was considered to prioritize robustness, real-time performance, and language support.

Checklist

  • Code follows project style guidelines (Python PEP 8, C conventions).
  • Documentation updated with detailed docstrings and examples.
  • Backward-compatible with existing codebase.
  • C implementation updates for new functions (to be addressed in follow-up PR).
  • Unit tests for new interfaces (to be added if required).
  • Tested Python interface on Linux; Windows/macOS testing pending.
  • Tested C header with mock implementations.

lincuan added 5 commits May 16, 2025 12:11
…improvements

- add dynamic library path loading for cross-platform support (linux, windows, macos).
- replace assert with proper error handling and logging for robustness.
- enhance input validation in get_input_data to handle edge cases (empty arrays, non-contiguous memory).
- improve memory management by retaining audio data reference to prevent garbage collection issues.
- introduce async processing with process_async method for real-time applications.
- add comprehensive docstrings and type hints for better usability.
- implement dynamic threshold adjustment with set_threshold method.
- add callback support for integration with ten framework.
- maintain original functionality while improving maintainability and extensibility.
- add ten_vad_error_t enum for detailed error codes (invalid param, out of memory, etc.).
- enhance documentation with detailed parameter constraints, error cases, and usage examples.
- add ten_vad_set_threshold function for dynamic threshold adjustment.
- introduce ten_vad_register_callback for event-driven integration with ten framework.
- add ten_vad_version_t and ten_vad_get_version_struct for structured version information.
- improve handle safety documentation for ten_vad_destroy.
- maintain compatibility with existing ten_vad.py interface.
- add test_ten_vad_enhanced.py with comprehensive unit tests:
  - test initialization with valid and invalid parameters (hop_size, threshold).
  - validate process and process_async methods with valid and invalid inputs (wrong shape, type, empty arrays).
  - test set_threshold method for valid and invalid threshold values.
  - verify callback functionality in process and process_async.
  - test error handling for invalid library paths using TEN_VAD_LIB_PATH.
- ensure tests cover robustness, edge cases, and TEN framework integration.
- use unittest framework and numpy for test data generation.
- add test_ten_vad.c with comprehensive unit tests using Check framework:
  - test ten_vad_create and ten_vad_destroy with valid and invalid parameters, including repeated destruction safety.
  - validate ten_vad_process with valid and invalid inputs (NULL pointers, incorrect audio_data_length).
  - test ten_vad_set_threshold with valid and invalid threshold values.
  - verify ten_vad_register_callback triggers correctly during processing.
  - test ten_vad_get_version and ten_vad_get_version_struct for version retrieval.
- ensure tests cover robustness, edge cases, and TEN framework integration requirements.
- tests designed to validate new interfaces and error handling introduced in ten_vad_enhanced.h.
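
For readers who want a concrete picture of the process_async and callback behavior these tests exercise, here is a rough sketch. It assumes the async method offloads the blocking call to a thread pool, which this PR does not confirm. `FakeVad` and its toy scoring are stand-ins so the example runs without the C library, and the `(probability, flag)` callback signature is an assumption.

```python
import asyncio
from typing import Callable, Optional, Sequence, Tuple

Result = Tuple[float, int]  # (speech probability, 0/1 speech flag)


class FakeVad:
    """Stand-in detector; it does not call the real TEN VAD library."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def process(
        self,
        frame: Sequence[int],
        callback: Optional[Callable[[float, int], None]] = None,
    ) -> Result:
        # Toy score: mean absolute amplitude scaled to [0, 1]. Not the real model.
        prob = min(1.0, sum(abs(s) for s in frame) / (len(frame) * 32768.0))
        flag = int(prob >= self.threshold)
        if callback is not None:
            callback(prob, flag)
        return prob, flag

    async def process_async(
        self,
        frame: Sequence[int],
        callback: Optional[Callable[[float, int], None]] = None,
    ) -> Result:
        # Run the blocking call in the default thread pool so the event loop stays free.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.process, frame, callback)
```

With this design the callback runs on the worker thread, not on the event loop thread, so a callback that touches asyncio objects would need `loop.call_soon_threadsafe`.
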
@lincuan
Author

lincuan commented May 16, 2025

Refer to text.md for comprehensive test coverage, execution instructions, and implementation notes.

@Ziyi6
Collaborator

Ziyi6 commented May 16, 2025

Thanks for your PR! We are currently validating the code :-)

@lincuan
Author

lincuan commented May 17, 2025

> Thanks for your PR! We are currently validating the code :-)

Good morning, and don’t hesitate to contact me if any issues arise during code review.

ysdede added a commit to ysdede/ten-vad that referenced this pull request May 19, 2025