Publication Type:

Conference Paper

Source:

2014 First International Conference on Computational Systems and Communications (ICCSC) (2014)

ISBN:

9781479960132

Accession Number:

14931708

URL:

http://ieeexplore.ieee.org/xpl/downloadCitations

Keywords:

accuracy, audio documents, cepstral analysis, cepstral domain features, Content analysis, decision tree classification algorithms, Decision trees, Feature extraction, Feature selection, feature selection algorithms, Random forest, Random forest algorithm, Smoothing methods, SND system, Speech, Speech Non-speech, Speech recognition, speech/nonspeech detection performance, spoken language identification, storage space reduction, Vectors, vegetation, video documents

Abstract:

Speech/non-speech detection (SND) distinguishes between speech and non-speech segments in recorded audio and video documents. SND systems can help reduce the storage space required when only the speech segments of audio documents are needed, for example in content analysis or spoken language identification. In this work, we experimented with time domain, frequency domain and cepstral domain features computed on short time frames of 20 ms, along with their mean and standard deviation over segments of 200 ms. We then analysed whether selecting a subset of the features can improve the performance of the SND system. To this end, we experimented with different feature selection algorithms and observed that correlation based feature selection gave the best results. Further, we experimented with different decision tree classification algorithms and found that the random forest algorithm outperformed the other decision tree algorithms. We further improved the SND system performance by smoothing the decisions over 5 segments of 200 ms each. Our baseline system has 272 features and a classification accuracy of 94.45%, while the final system with 8 features has a classification accuracy of 97.80%.
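
Two steps named in the abstract, the segment-level mean and standard deviation of frame features and the smoothing of decisions over 5 segments, can be illustrated with a minimal sketch. The abstract does not give the implementation details, so several choices here are assumptions: non-overlapping 20 ms frames (10 per 200 ms segment), population standard deviation, and majority voting as the smoothing rule. The function names `segment_statistics` and `smooth_decisions` are hypothetical and do not come from the paper.

```python
from statistics import fmean, pstdev

FRAMES_PER_SEGMENT = 10  # assumption: non-overlapping 20 ms frames per 200 ms segment
SMOOTHING_WINDOW = 5     # from the abstract: decisions are smoothed over 5 segments


def segment_statistics(frame_features):
    """Return one row per segment: feature means followed by standard deviations.

    frame_features is a list of per-frame feature lists. Trailing frames
    that do not fill a whole segment are dropped.
    """
    stats = []
    n_segments = len(frame_features) // FRAMES_PER_SEGMENT
    for s in range(n_segments):
        start = s * FRAMES_PER_SEGMENT
        frames = frame_features[start:start + FRAMES_PER_SEGMENT]
        columns = list(zip(*frames))  # one tuple of values per feature
        means = [fmean(col) for col in columns]
        stds = [pstdev(col) for col in columns]  # assumption: population std
        stats.append(means + stds)
    return stats


def smooth_decisions(decisions, window=SMOOTHING_WINDOW):
    """Majority-vote smoothing of binary decisions (1 = speech, 0 = non-speech).

    Assumption: a centred window with edge values repeated at the boundaries.
    """
    if not decisions:
        return []
    half = window // 2
    padded = [decisions[0]] * half + list(decisions) + [decisions[-1]] * half
    return [
        1 if 2 * sum(padded[i:i + window]) > window else 0
        for i in range(len(decisions))
    ]
```

In this sketch an isolated non-speech decision surrounded by speech decisions is flipped to speech, which is the kind of short-lived error that smoothing over neighbouring segments is meant to suppress. The classifier producing the per-segment decisions (a random forest in the paper) is outside the scope of the sketch.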

Cite this Research Publication

S. V. Thambi, Sreekumar, K. T., Dr. Santhosh Kumar C., and Raj, P. C. R., “Random forest algorithm for improving the performance of speech/non-speech detection”, in 2014 First International Conference on Computational Systems and Communications (ICCSC), 2014.
