Spectral Smoothing by Variational Mode Decomposition and its Effect on Noise and Pitch Robustness of ASR System

Publication Type : Conference Proceedings

Publisher : Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2018, Calgary, Canada

Campus : Coimbatore

School : School of Engineering

Center : Computational Engineering and Networking

Verified : No

Year : 2018

This paper proposes a novel front-end speech parameterization technique that is robust to ambient noise and pitch variations. In the proposed technique, the short-time magnitude spectrum obtained by the discrete Fourier transform is first decomposed into several components using variational mode decomposition (VMD). To smooth the spectrum sufficiently, the higher-order components are discarded, and the smoothed spectrum is reconstructed from the first two modes only. Mel-frequency cepstral coefficients (MFCCs) computed from the VMD-smoothed spectra are observed to be less affected by ambient noise and pitch variations. To validate this, an automatic speech recognition (ASR) system is developed using clean speech from adult speakers and evaluated under noisy test conditions. Experimental evaluations are also performed on a second test set consisting of children's speech, which simulates large pitch differences. The experimental evaluations and the signal-domain analyses presented in the paper support these claims.
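
The smoothing step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`vmd`, `smooth_spectrum`), the number of modes (4), the penalty `alpha`, the step `tau`, and the toy spectrum are all assumptions made here, and "first two modes" is taken to mean the two modes with the lowest centre frequencies. The VMD routine is a compact version of the standard ADMM-style update, and the MFCC stage that would follow is omitted.

```python
import numpy as np


def vmd(signal, num_modes=4, alpha=2000.0, tau=0.0, tol=1e-7, max_iter=300):
    """Compact VMD sketch. Returns (modes, centre_freqs), sorted by centre frequency."""
    x = np.asarray(signal, dtype=float)
    n_orig = x.size
    x = np.pad(x, (0, n_orig % 2), mode="edge")  # work on an even length
    half = x.size // 2
    # Mirror both ends to soften boundary effects.
    mirrored = np.concatenate([x[:half][::-1], x, x[half:][::-1]])
    T = mirrored.size
    freqs = np.arange(T) / T - 0.5  # frequency axis in fftshift ordering
    pos = slice(T // 2, T)          # non-negative frequencies

    f_hat_plus = np.fft.fftshift(np.fft.fft(mirrored))
    f_hat_plus[: T // 2] = 0        # keep the one-sided spectrum only

    omega = 0.5 * np.arange(num_modes) / num_modes  # uniform initial centres
    u_hat = np.zeros((num_modes, T), dtype=complex)
    lam = np.zeros(T, dtype=complex)                # Lagrange multiplier

    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(num_modes):
            others = u_hat.sum(axis=0) - u_hat[k]
            # Wiener-filter style mode update around the current centre.
            u_hat[k] = (f_hat_plus - others - lam / 2) / (
                1 + alpha * (freqs - omega[k]) ** 2
            )
            # New centre: power-weighted mean frequency of the mode.
            power = np.abs(u_hat[k, pos]) ** 2
            omega[k] = np.dot(freqs[pos], power) / (power.sum() + 1e-12)
        lam = lam + tau * (u_hat.sum(axis=0) - f_hat_plus)
        if np.sum(np.abs(u_hat - u_prev) ** 2) / T < tol:
            break

    # Rebuild a conjugate-symmetric spectrum so each mode is real-valued.
    full = np.zeros_like(u_hat)
    full[:, 1 : T // 2 + 1] = np.conj(u_hat[:, pos][:, ::-1])
    full[:, pos] = u_hat[:, pos]
    modes = np.real(np.fft.ifft(np.fft.ifftshift(full, axes=1), axis=1))
    modes = modes[:, T // 4 : 3 * T // 4][:, :n_orig]  # drop mirror and padding

    order = np.argsort(omega)
    return modes[order], omega[order]


def smooth_spectrum(magnitude, num_modes=4, keep=2, alpha=2000.0):
    """Reconstruct the spectrum from the `keep` lowest-frequency modes."""
    modes, _ = vmd(magnitude, num_modes=num_modes, alpha=alpha)
    return modes[:keep].sum(axis=0)


# Toy magnitude spectrum (assumed): slow envelope times a fast harmonic ripple.
bins = np.arange(256)
envelope = 1.0 + 0.8 * np.exp(-(((bins - 40) / 15.0) ** 2))
spectrum = envelope * (1.0 + 0.3 * np.cos(2 * np.pi * bins / 6.0))
smoothed = smooth_spectrum(spectrum)
print(smoothed.shape)
```

How much ripple survives in `smoothed` depends on `num_modes` and `alpha`, so these values would need tuning on real speech frames. In a full front end, `smooth_spectrum` would be applied to each frame's magnitude spectrum before the Mel filterbank, logarithm, and DCT stages that produce the MFCCs.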

