Although more than 7,000 languages are spoken worldwide, just 10 of them dominate internet usage, accounting for over 70% of global internet users according to Statista, one of the leading providers of market and consumer data.
This poses a significant challenge for the development of conversational AI tools for the remaining 6,900+ languages, which collectively account for less than 30% of internet usage and are considered low-resource languages.
Africa is home to more than 2,000 languages, a third of the world's spoken languages according to The African Language Program at Harvard.
This means that speakers of these languages may be even less likely to have access to accurate and relevant healthcare information online.
Objective of this project
This project aims to develop a Kiswahili ASR (Automatic Speech Recognition) model that contributes to solving the problem of documenting patient-doctor consultations (conversations).
- Africa has the highest burden of disease, according to a 2019 report.
- Healthcare systems in Africa are often overwhelmed and underfunded.
Conversational AI tools can be used for tasks such as:
- Symptom checking
- Disease diagnosis
- Treatment recommendations
- Robust documentation
The negative health outcomes of this lack of language representation in the digital space include:
- A lack of accurate diagnosis
- Inadequate treatment
- Missing out on important medical information
We used an open-source Swahili dataset from the Common Voice website that is available on the Hugging Face dataset hub.
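The snippet below is a minimal sketch, assuming the `datasets` library, of how a Common Voice Swahili split can be loaded from the hub; the dataset ID and configuration name are assumptions, so check the notebooks for the exact release used.

```python
# Minimal sketch: load a Common Voice Swahili split from the Hugging Face hub.
# The dataset ID and config below are assumptions; Common Voice releases also
# require accepting the terms on the dataset page before downloading.
from datasets import load_dataset, Audio

common_voice = load_dataset("mozilla-foundation/common_voice_11_0", "sw", split="train")

# Resample every clip to 16 kHz, the rate expected by Wav2Vec2/XLS-R models.
common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16_000))

print(common_voice[0]["sentence"])  # transcript of the first clip
```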
We hosted the model on the Hugging Face Hub. You can upload a Swahili clip from your files, or record one in the browser, to get a transcription. (You may experience some errors in the transcription; we are working to make the model more accurate.)
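For programmatic use, a transcription can also be obtained with the `transformers` ASR pipeline, as in the sketch below; the checkpoint name is a placeholder rather than our actual model ID.

```python
# Minimal sketch: transcribe a local Swahili clip with the transformers
# ASR pipeline. The model ID is a placeholder, not our actual checkpoint.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="your-username/wav2vec2-xls-r-swahili",  # placeholder model ID
)

result = asr("swahili_clip.wav")  # path to any local audio file
print(result["text"])
```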
- Notebooks in this repo:
  - Audio pre-processing and EDA: a series of functions for converting raw audio to MFCCs (a minimal sketch of this step appears after the lists below).
  - Fine_tuning_(a_pretrained_model)_for_Swahili_ASR: pre-processes the data from the Hugging Face hub.
  - Fine_tuning_XLS_R_Wav2Vec2_with_Swahili_corpus_v1: the initial attempt to train on Google Colab, unsuccessful due to limited computing resources.
  - Fine_tuning_XLS-R with swahili corpus (version 1): fine-tuning with checkpoints stored on Google Drive.
  - Fine_tuning_XLS-R with swahili corpus (version 2): fine-tuning with model checkpoints pushed to the Hugging Face Hub (see the fine-tuning sketch after the lists).
  - Real_Time_Speech_Recognition_on_Gradio: the first attempt to host our model on Gradio (see the Gradio sketch after the lists).
- Write-ups that accompany this work:
  - A narrative on literature review
  - A second narrative on data preparation
  - A third narrative on model development
  - The overall technical report
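The audio pre-processing and EDA notebook converts raw audio into MFCCs. The sketch below shows one common way to do this with `librosa`; the file name and frame parameters are illustrative, not necessarily the values used in the notebook.

```python
# Minimal sketch: raw audio -> MFCC features with librosa.
# File name and parameter values are illustrative only.
import librosa

# Load the clip and resample it to 16 kHz.
signal, sr = librosa.load("swahili_clip.wav", sr=16_000)

# 13 MFCCs per frame; n_fft/hop_length give a 25 ms window with a 10 ms stride at 16 kHz.
mfccs = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13, n_fft=400, hop_length=160)

print(mfccs.shape)  # (13, number_of_frames)
```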
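The fine-tuning notebooks follow the usual Wav2Vec2/XLS-R CTC recipe: build a character-level tokenizer from the Swahili transcripts, load a pretrained XLS-R checkpoint with a fresh CTC head, and train with checkpoints saved to Drive (version 1) or pushed to the Hugging Face Hub (version 2). The condensed sketch below illustrates that setup; the `vocab.json` file, the 300m checkpoint, the output repo name, and every hyperparameter are assumptions rather than the notebooks' actual values.

```python
# Condensed sketch of the XLS-R CTC fine-tuning setup. vocab.json is a
# character vocabulary built from the Swahili transcripts; all checkpoint
# names and hyperparameters here are illustrative assumptions.
from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2Processor,
    Wav2Vec2ForCTC,
    TrainingArguments,
)

tokenizer = Wav2Vec2CTCTokenizer(
    "vocab.json", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|"
)
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1, sampling_rate=16_000, padding_value=0.0, return_attention_mask=True
)
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

# Start from a multilingual XLS-R checkpoint and add a CTC head sized to the
# Swahili character vocabulary.
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-xls-r-300m",  # assumed checkpoint size
    ctc_loss_reduction="mean",
    pad_token_id=processor.tokenizer.pad_token_id,
    vocab_size=len(processor.tokenizer),
)
model.freeze_feature_encoder()  # keep the convolutional feature encoder frozen

# push_to_hub=True is what sends checkpoints to the Hugging Face Hub
# (version 2); version 1 instead saved them to Google Drive.
training_args = TrainingArguments(
    output_dir="wav2vec2-xls-r-swahili",  # placeholder repo name
    per_device_train_batch_size=8,
    num_train_epochs=30,
    learning_rate=3e-4,
    save_steps=400,
    push_to_hub=True,
)
# A Trainer would then be built with a CTC data collator and the prepared
# Common Voice splits (omitted here for brevity).
```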
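Finally, the Gradio notebook puts the model behind a simple web interface that accepts an uploaded or recorded clip. A minimal sketch, again with a placeholder model ID:

```python
# Minimal sketch of the Gradio demo: upload or record a Swahili clip and
# get a transcription back. The model ID is a placeholder; note that older
# Gradio releases use source= (singular) instead of sources=.
import gradio as gr
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="your-username/wav2vec2-xls-r-swahili",  # placeholder model ID
)

def transcribe(audio_path):
    # Gradio hands us the path of the uploaded/recorded file; the pipeline
    # takes care of decoding and resampling.
    return asr(audio_path)["text"]

demo = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(sources=["upload", "microphone"], type="filepath"),
    outputs="text",
    title="Swahili Speech Recognition",
)

demo.launch()
```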