
This is the documentation of the Swahili ASR model for the healthcare domain.


Marconi-Lab/Swahili_ASR_Model


Swahili ASR Model Documentation.


Introduction

Although more than 7,000 languages exist in the world, a small group of just 10 languages dominates internet usage, accounting for over 70% of global internet users, according to Statista, one of the world's leading providers of market and consumer data.

This poses a significant challenge for developing conversational AI tools for the remaining 6,900-plus languages, which collectively account for less than 30% of internet usage and are considered low-resource languages.

Africa is home to more than 2,000 languages, a third of the world's spoken languages, according to the African Language Program at Harvard.

This means that individuals in African countries may be even less likely to have access to accurate and relevant healthcare information online.

Objective of this project

This project aims to develop a Kiswahili ASR (Automatic Speech Recognition) model to help document patient-doctor consultations (conversations).


Why it matters

  1. According to a 2019 report, Africa has the highest burden of disease.

  2. Healthcare systems in Africa are often overwhelmed and underfunded.

Conversational AI tools can be used for tasks such as:

  • Symptom checking
  • Disease diagnosis
  • Treatment recommendations
  • Robust documentation

When languages are not represented in the digital space, the negative health outcomes include:

  • A lack of accurate diagnosis
  • Inadequate treatment
  • Missing out on important medical information

Data we used to fine-tune the ASR

We used an open-source Swahili dataset from the Common Voice website, which is available on the Hugging Face dataset hub.


Fine-tuned Model

We hosted the model on the Hugging Face Hub. You can upload a Swahili clip from your files, or record one in the browser, to get a transcription. (You may see some errors in the transcription; we are working to make the model smarter.)
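
For readers who prefer to call a Hub-hosted ASR model from code rather than the browser demo, the following is a minimal sketch, not the project's own code. It assumes the `transformers` library is installed and uses its `pipeline("automatic-speech-recognition", ...)` helper. `MODEL_ID` is a hypothetical placeholder, because this page does not state the model's Hub identifier. `build_asr` and `transcribe` are illustrative names.

```python
from typing import Callable, Optional

# Hypothetical placeholder: this page does not name the model's Hub ID.
MODEL_ID = "your-org/your-swahili-asr-model"


def build_asr(model_id: str = MODEL_ID) -> Callable[[str], dict]:
    """Create a speech-recognition pipeline (downloads the model on first use)."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(audio_path: str, asr: Optional[Callable[[str], dict]] = None) -> str:
    """Return the transcription of one audio file as plain text."""
    if asr is None:
        asr = build_asr()
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"].strip()
```

With a real model ID in place, `transcribe("clip.wav")` would return the text for a local clip (the file name here is only an example). Decoding a file path typically requires ffmpeg to be installed. The optional `asr` argument lets one pipeline be reused across many clips instead of reloading the model for each call.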


Documentation

  1. Notebooks on this repo:
  2. Write-ups that accompany this work:
