The proposed system demonstrates the effectiveness of a Deep Belief Network (DBN) over a Gaussian Mixture Model (GMM) for speech emotion recognition. In the proposed GMM-DBN system, a GMM is trained independently for each emotion using Mel-frequency cepstral coefficient (MFCC) features extracted from speech. For each utterance, the minimum distance between the distribution of its features and each emotion model is used to derive a bag of acoustic features (BoF), which is plotted as a histogram. Each count in the histogram is the number of feature distributions that are closest to the corresponding emotion model. The BoF is passed to the DBN to train the models. The effectiveness of DBN-based emotion recognition is observed empirically by increasing the number of Restricted Boltzmann Machine (RBM) layers and then tuning the available parameters. On the classical German speech emotion database (EmodB), the proposed GMM-DBN system improves the recognition rate by 5% over the conventional MFCC-GMM system. Further testing on a recently developed simulated speech emotion database for the Tamil language gives comparable emotion recognition results.
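The BoF histogram step described above can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: each emotion model is reduced to a list of hypothetical component means, the distance is plain Euclidean distance to the nearest mean (the abstract does not specify the measure), and the two-dimensional frames stand in for real MFCC vectors. The names `nearest_emotion`, `bag_of_features`, and `emotion_means` are illustrative only.

```python
import math
from collections import Counter

def nearest_emotion(frame, emotion_means):
    """Return the emotion whose closest component mean is nearest to the frame.

    Assumption: Euclidean distance to component means stands in for the
    distance between a feature distribution and an emotion model.
    """
    best_emotion, best_dist = None, math.inf
    for emotion, means in emotion_means.items():
        dist = min(math.dist(frame, mean) for mean in means)
        if dist < best_dist:
            best_emotion, best_dist = emotion, dist
    return best_emotion

def bag_of_features(frames, emotion_means):
    """Count, per emotion model, how many frames of an utterance fall closest to it."""
    counts = Counter(nearest_emotion(frame, emotion_means) for frame in frames)
    # One histogram bin per emotion, in the order the models were given.
    return [counts.get(emotion, 0) for emotion in emotion_means]

# Hypothetical toy models: component means per emotion (not real GMM parameters).
emotion_means = {
    "anger": [(0.0, 0.0), (1.0, 0.0)],
    "happiness": [(5.0, 5.0)],
    "sadness": [(-5.0, 5.0)],
}

# Hypothetical 2-D "feature frames" for one utterance.
frames = [(0.1, 0.2), (0.9, -0.1), (4.8, 5.1), (-4.0, 4.5), (5.5, 4.9)]

histogram = bag_of_features(frames, emotion_means)
print(histogram)  # [2, 2, 1]
```

In this sketch the resulting fixed-length histogram is what would be handed to the classifier in place of the raw frames, so utterances of different durations map to vectors of the same size.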
M. Srikanth, Pravena, D., and Dr. Govind D., “Tamil Speech Emotion Recognition Using Deep Belief Network(DBN)”, International Symposium on Signal Processing and Intelligent Recognition Systems. pp. 319-327, 2018.