As human listeners, we use similarities with people we already know to verify a speaker's identity more accurately. Motivated by this, we use cosine distance similarities with a set of reference speakers, termed cosine distance features (CDF), to improve the performance of speaker verification systems under clean and additive-noise test conditions. We derived CDF from mel frequency cepstral coefficients, power normalized cepstral coefficients, or delta spectral cepstral coefficients, and then input the CDF to a support vector machine (SVM) backend classifier (CDF-SVM). We compared the performance of CDF-SVM with that of an i-vector system with cosine distance scoring (i-CDS) and an i-vector system with a backend SVM classifier (i-SVM) for stationary and non-stationary noises at different signal-to-noise ratio (SNR) levels. The experimental results show that CDF-SVM outperforms all other systems in clean and high-SNR environments. However, in certain low-SNR cases, i-CDS performed better. Finally, we fused CDF-SVM with i-CDS; the results show that the noise robustness of the combined system is significantly better than that of the individual systems at both high and low SNR levels.
Index Terms: speaker verification, i-vectors, support vector machines, cosine distance features, noise robustness
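The idea of a cosine distance feature vector can be illustrated with a minimal sketch. It assumes, purely for illustration, that the test utterance and each reference speaker are already summarized as fixed-length vectors; the abstract does not say how these representations are built, and the function names and toy values below are hypothetical rather than taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distance_features(utterance_vec, reference_vecs):
    """Return one cosine similarity per reference speaker.

    The resulting list has as many entries as there are reference
    speakers and could serve as the input to a backend classifier.
    """
    return [cosine_similarity(utterance_vec, ref) for ref in reference_vecs]


# Toy data (assumed for illustration): three reference speakers in 2-D.
reference_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
utterance_vec = [1.0, 0.0]

cdf = cosine_distance_features(utterance_vec, reference_vecs)
```

In this toy case the utterance is identical to the first reference, orthogonal to the second, and at 45 degrees to the third, so the feature vector has one entry per reference speaker regardless of the dimension of the underlying representation.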
K. K. George, C. Santhosh Kumar, K. I. Ramachandran, and A. Panda, “Cosine Distance Features for Robust Speaker Verification”, in Interspeech 2015, Dresden, Germany, 2015.