    Repositories list

    • espeak-ng

      Public
      eSpeak NG is an open-source speech synthesizer that supports more than a hundred languages and accents.
      C
      GNU General Public License v3.0
      999300 Updated Mar 31, 2025
    • agents

      Public
      Build real-time multimodal AI applications 🤖🎙️📹
      Python
      Apache License 2.0
      792000 Updated Mar 20, 2025
    • agents-js

      Public
      Build realtime multimodal AI agents with Node.js
      TypeScript
      Apache License 2.0
      101000 Updated Mar 18, 2025
    • fairseq

      Public
      Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
      Python
      MIT License
      6.5k000 Updated Feb 27, 2025
    • A Python package for calculating the PESQ.
      Python
      MIT License
      71000 Updated Feb 22, 2025
    • PyTSMod

      Public
      An open-source Python library for audio time-scale modification.
      Python
      GNU General Public License v3.0
      27600 Updated Feb 13, 2025
    • peft

      Public
      🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
      Python
      Apache License 2.0
      1.8k100 Updated Jan 2, 2025
    • AI-powered speech denoising and enhancement
      Python
      MIT License
      2001.8k531 Updated Dec 3, 2024
    • resemble.ai API SDK
      TypeScript
      MIT License
      3921 Updated Dec 2, 2024
    • mup

      Public
      maximal update parametrization (µP)
      Jupyter Notebook
      MIT License
      99001 Updated Sep 5, 2024
    • Python
      MIT License
      0410 Updated Sep 5, 2024
    • Python
      1210 Updated May 8, 2024
    • aiortc

      Public
      WebRTC and ORTC implementation for Python using asyncio
      Python
      BSD 3-Clause "New" or "Revised" License
      802000 Updated Mar 27, 2024
    • aioice

      Public
      asyncio-based Interactive Connectivity Establishment (RFC 5245)
      Python
      BSD 3-Clause "New" or "Revised" License
      59000 Updated Feb 15, 2024
    • TypeScript
      2900 Updated Dec 16, 2023
    • Go
      1200 Updated Nov 13, 2023
    • Run OpenAI Whisper as a Cog model
      Python
      Apache License 2.0
      49200 Updated Nov 8, 2023
    • Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
      Python
      Apache License 2.0
      151000 Updated Oct 25, 2023
    • A Python package to analyze and compare voices with deep learning
      Python
      Apache License 2.0
      4412.9k422 Updated Oct 12, 2023
    • A Heroku buildpack for ffmpeg that always downloads the latest static build
      Shell
      MIT License
      728000 Updated Aug 21, 2023
    • g2pW

      Public
      Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)
      Python
      Apache License 2.0
      39000 Updated Jul 8, 2023
    • univnet

      Public
      Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)
      Python
      BSD 3-Clause "New" or "Revised" License
      46000 Updated May 19, 2023
    • NeMo

      Public
      NeMo: a toolkit for conversational AI
      Python
      Apache License 2.0
      2.8k900 Updated Jan 18, 2023
    • Simple text-to-phoneme converter for multiple languages
      Python
      GNU General Public License v3.0
      1842001 Updated Nov 21, 2022
    • whisper

      Public
      Robust Speech Recognition via Large-Scale Weak Supervision
      Jupyter Notebook
      MIT License
      9.7k100 Updated Oct 4, 2022
    • Monotonic Alignment Search
      Cython
      MIT License
      149120 Updated Sep 6, 2022
    • reLaugh

      Public
      Supplementary materials for "Synthesizing Personalized Non-speech Vocalization from Discrete Speech Representations"
      HTML
      0100 Updated Jun 25, 2022
    • Benchmark Arabic text diacritization dataset
      Python
      MIT License
      19400 Updated Oct 8, 2021
    • Dockerfile
      7000 Updated Sep 1, 2021
    • Automatically deploy your project to GitHub Pages using GitHub Actions. This action can be configured to push your production-ready code into any branch you'd like.
      TypeScript
      MIT License
      379000 Updated Aug 3, 2021