Publication Type : Conference Proceedings
Publisher : Elsevier BV
Source : Procedia Computer Science
Url : https://doi.org/10.1016/j.procs.2025.04.513
Keywords : Automatic speech recognition, Indian accent, Whisper, Word Error Rate
Campus : Bengaluru
School : School of Engineering
Department : Electronics and Communication
Year : 2025
Abstract : Despite advancements in Automatic Speech Recognition (ASR) technology, accurately transcribing Indian-accented English remains a significant challenge. The main challenge associated with the transcription of Indian English is the lack of curated datasets covering a wide range of regional accents in the Indian sub-continent. Addressing this issue, this paper concentrates on building and testing a diverse dataset that captures the nuances of Indian-accented English, covering various regions and dialects across India. In Phase 1, data was collected from over 200 speakers, yielding 70 hours of speech data using custom-made healthcare transcripts in Telugu, Hindi, Kannada, Marathi, Tamil, and Malayalam. Phase 2 data collection included 100 hours of data from around 400 speakers, with transcripts derived from novels, newspapers, books, and online articles across Tamil Nadu, Karnataka, Andhra Pradesh, Maharashtra, Madhya Pradesh, Delhi, Assam, Manipur, and Rajasthan. The comprehensive dataset, spanning a total of 633 speakers and around 170 hours of data, was collected in two phases to understand the variations in pronunciation, intonation, and phonetic emphasis characteristic of the Indian accent. Further, the existing Whisper ASR model was fine-tuned to enhance its performance for Indian-accented English. Our results show that the fine-tuned Whisper-Tiny model achieved a Word Error Rate (WER) of 18.141%, Whisper-Small achieved 17.36%, and Whisper-Medium achieved 15.08%, demonstrating a significant improvement in recognizing and transcribing Indian-accented English.
Cite this Research Publication : Jaswanth Kunisetty, Pranav Ramachandrula, Sruthi S, Susmitha Vekkot, Deepa Gupta, Advancing ASR for Indian-Accented English: Dataset Creation and Whisper Fine-Tuning, Procedia Computer Science, Elsevier BV, 2025, https://doi.org/10.1016/j.procs.2025.04.513
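Note on the metric: the abstract reports results as Word Error Rate (WER). As a minimal sketch of how WER is commonly computed (word-level edit distance divided by the number of reference words), the following may help. It is not the authors' evaluation code; the function name, the lowercase-and-split normalization, and the sample sentences are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Assumption: text is normalized by lowercasing and whitespace splitting.
    Real evaluations often also strip punctuation and normalize numbers.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences, not taken from the paper's dataset.
reference = "the patient has a fever"
hypothesis = "the patient has fever"
print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

Because insertions count as errors while the denominator is the reference length, WER can exceed 100% when the hypothesis is much longer than the reference.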