MFCC Based Audio Classification Using Machine Learning

Publication Type : Conference Proceedings

Publisher : IEEE

Source : 2021 12th International Conference on Computing Communication and Networking Technologies

Campus : Amritapuri

School : School of Computing

Center : Computational Bioscience

Department : Computer Science

Year : 2021

Abstract : Humans readily detect emotion by noticing changes in another person's facial expression or tone of voice, but for a machine, understanding and decoding emotion is considerably more complex. The problem is relevant today because emotion recognition can be used to gather customer feedback on, for example, a product or a hotel. The proposed solution is a machine learning model that detects emotions from a speaker's voice. Its main objective is to recognize emotions in speech and classify them into 8 categories: unbiased, cool, ecstatic, poignant, furious, fearful, shock, and astonished. The approach relies on Mel Frequency Cepstral Coefficients (MFCC) and the energy of the speech signal as its core input features. For this purpose, we used the RAVDESS database of emotional speech. Once feature extraction is performed, the resulting feature vectors are used to train different machine learning classification algorithms: Decision Tree, Random Forest, and Support Vector Machine (SVM). In this study, the Random Forest algorithm achieved the highest accuracy of the three, 88.54%.
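
The feature step described in the abstract (MFCCs plus signal energy) can be illustrated for a single audio frame. The following is a minimal sketch using only the Python standard library, not the authors' implementation: the frame length, sample rate, number of mel filters, number of coefficients, and all function names are assumptions chosen for illustration, and the input is a synthetic tone rather than RAVDESS audio.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * n_bins
        for k in range(left, center):      # rising slope
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope, peak of 1.0 at center
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank

def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n_coeffs)]

def frame_features(frame, sample_rate, n_filters=20, n_mfcc=13):
    """Return n_mfcc MFCCs followed by the frame energy (assumed feature layout)."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Floor the filter outputs so the logarithm is always defined.
    log_mel = [math.log(max(sum(w * p for w, p in zip(filt, spec)), 1e-10))
               for filt in bank]
    mfcc = dct2(log_mel, n_mfcc)
    energy = sum(x * x for x in frame)
    return mfcc + [energy]

# Synthetic 440 Hz tone standing in for one frame of speech (assumption).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
features = frame_features(frame, sample_rate)
print(len(features))  # 13 MFCCs + 1 energy value = 14
```

In a full pipeline, per-frame vectors like this would typically be aggregated over each utterance (for example by averaging) before being passed to a classifier such as a decision tree, random forest, or SVM; the abstract does not state which aggregation was used.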

Cite this Research Publication : Vimal, B., Surya, M., Darshan, Sridhar, V.S., Ashok, A., “MFCC Based Audio Classification Using Machine Learning”, 2021 12th International Conference on Computing Communication and Networking Technologies, ICCCNT 2021, 2021
