Publication Type:

Journal Article

Source:

International Journal of Speech Technology, Volume 14, Number 3, p.147-155 (2011)

URL:

http://www.scopus.com/inward/record.url?eid=2-s2.0-80052962948&partnerID=40&md5=a77285e8eb73f824b4915ce3d7d5b3a9

Keywords:

Acoustic modeling, Gaussian Mixture Model, Hidden Markov model (HMM), Hidden Markov models, Linguistics, Multilingual, Neural networks, Phone recognition, Robust features, Speech recognition, Telephone sets

Abstract:

In this paper, we propose a technique to derive robust features for multilingual acoustic modeling using hidden Markov model-Gaussian mixture models (HMM-GMM). We achieve this by discriminatively combining the phonetic contexts of the target languages (languages in the multilingual system). Phonetic context is captured using wide temporal context of the features, and the dimensionality of the resulting feature set is reduced to suit the HMM-GMM implementation using a neural network with a bottle-neck in one of the hidden layers. The output before the non-linearity at the bottle-neck layer of the neural network is the new feature. Since the features are optimized for the target languages in the multilingual recognizer, they are referred to as Target Languages Oriented Features (TLOF). We perform our experiments for two of the most widely spoken Indian languages, Hindi and Tamil. TLOF offers significant performance improvements over both monolingual and multilingual phone recognizers using Mel frequency cepstral coefficients (MFCC). This emphasizes that TLOF can help share data across languages. It was also seen that TLOF can enhance the performance of monolingual acoustic models, compared to systems using MFCC. © 2011 Springer Science+Business Media, LLC.
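The feature-extraction idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the context width (4 frames on each side), the layer sizes, the tanh non-linearity, the random untrained weights, and the helper names `stack_context` and `bottleneck_features` are all assumptions made for illustration. In the paper the network is trained discriminatively on the target languages. The sketch only shows the two mechanical steps: stacking a wide temporal context, then reading the bottleneck layer's output before the non-linearity.

```python
import numpy as np

def stack_context(feats, left=4, right=4):
    """Concatenate each frame with its neighbours (edge frames are repeated)."""
    n_frames, _ = feats.shape
    padded = np.pad(feats, ((left, right), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + n_frames] for i in range(left + right + 1)])

def bottleneck_features(stacked, weights, biases, bottleneck_index):
    """Run the forward pass up to the bottleneck and return its pre-activation output."""
    h = stacked
    for i, (w, b) in enumerate(zip(weights, biases)):
        z = h @ w + b
        if i == bottleneck_index:
            return z  # linear output, taken before the non-linearity
        h = np.tanh(z)
    raise ValueError("bottleneck_index is beyond the last layer")

rng = np.random.default_rng(0)
mfcc = rng.standard_normal((100, 39))      # placeholder for 100 frames of 39-dim MFCC
stacked = stack_context(mfcc)              # shape (100, 39 * 9)

# Hypothetical layer sizes: input -> 500 -> 39 (bottleneck) -> 500.
sizes = [stacked.shape[1], 500, 39, 500]
weights = [rng.standard_normal((a, b)) * 0.01 for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

tlof_like = bottleneck_features(stacked, weights, biases, bottleneck_index=1)
print(stacked.shape, tlof_like.shape)
```

The resulting low-dimensional vectors would then replace MFCC as the input to an HMM-GMM recognizer. Choosing a bottleneck width close to the original MFCC dimensionality is a design assumption here, made so that the new features fit a conventional HMM-GMM setup.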

Notes:

Cited by (since 1996): 1

Cite this Research Publication

S. C. Kumar and V. P. Mohandas, “Robust features for multilingual acoustic modeling”, International Journal of Speech Technology, vol. 14, no. 3, pp. 147-155, 2011.