
Towards Robust Speech Recognition Model Using Deep Learning

Publication Type : Conference Paper

Publisher : IEEE

Source : International Conference on Intelligent Systems for Communication, IoT and Security (ICISCoIS), Coimbatore, India, 2023, pp. 253-256, doi: 10.1109/ICISCoIS56541.2023.10100390. IEEE Xplore

Url : https://ieeexplore.ieee.org/document/10100390

Campus : Coimbatore

School : School of Computing

Year : 2023

Abstract : Automatic speech recognition (ASR) has recently expanded into more contexts due to the prevalence of smart devices. In noisy settings, visual speech recognition (VSR), often known as lip reading, can be an important component of ASR. Because it can operate without sound, VSR is an integral part of audio-visual speech recognition (AVSR) systems. VSR systems are used where there is a lot of background noise, for example while driving a car or using a cell phone. VSR systems draw on established statistical and machine learning methods as well as more recent deep learning techniques. By employing an encoder-decoder attention-based method, the proposed visual speech recognition system reduces the word error rate (WER) to around 2.8% on the benchmark GRID corpus and 40.1% on the LRS2 corpus. The results are compared against two state-of-the-art methods, LipType and LipNet.
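
The abstract reports results as word error rate (WER). As a reading aid, the sketch below shows the standard definition of WER: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a predicted transcript, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical and chosen for illustration; this is not the authors' evaluation code, and the paper's text normalization may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch only: splits on whitespace and applies no
    normalization (case, punctuation), which real evaluations often do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up six-word command with one substituted word.
    print(word_error_rate("bin blue at f two now", "bin blue at f too now"))
```

Because insertions count as errors while the denominator is the reference length, WER can exceed 100% when the prediction is much longer than the reference.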

Cite this Research Publication : A. Kumar, D. K. Renuka and M. C. S. Priya, "Towards Robust Speech Recognition Model Using Deep Learning," 2023 International Conference on Intelligent Systems for Communication, IoT and Security (ICISCoIS), Coimbatore, India, 2023, pp. 253-256, doi: 10.1109/ICISCoIS56541.2023.10100390. IEEE Xplore
