Publication Type:

Conference Paper

Source:

Interspeech, Florence, Italy (2011)

URL:

http://www.amrita.edu/sites/default/files/IS110395.pdf

Abstract:

This work uses instantaneous pitch and strength of excitation, along with the duration of syllable-like units, as the parameters for emotion conversion. The instantaneous pitch and duration of the syllable-like units of neutral speech are modified through prosody modification of the linear prediction (LP) residual, using the instants of significant excitation. The strength of excitation is modified by scaling the Hilbert envelope (HE) of the LP residual. The target emotion speech is then synthesized from the prosody- and strength-modified LP residual. The pitch, duration and strength modification factors for emotion conversion are derived from the syllable-like units of the initial, middle and final regions, using an emotion speech database with different speakers, texts and emotions. Waveforms, spectrograms and subjective evaluations confirm that region-wise modification of source and suprasegmental features is more effective than gross-level modification.
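
The difference between region-wise and gross-level modification can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SyllableUnit` record, the rule that splits an utterance into thirds to label regions, and the function names are all assumptions made for illustration, and the sketch only scales per-unit summary values rather than modifying the LP residual itself.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical record for one syllable-like unit of neutral speech.
@dataclass
class SyllableUnit:
    mean_pitch_hz: float   # average instantaneous pitch over the unit
    duration_s: float      # duration of the unit in seconds
    strength: float        # relative strength of excitation

REGIONS = ("initial", "middle", "final")

def assign_regions(num_units: int) -> List[str]:
    """Label each unit by position, splitting the sequence into thirds.

    The thirds rule is an assumption; the abstract does not say how the
    initial, middle and final regions are delimited.
    """
    return [REGIONS[i * 3 // num_units] for i in range(num_units)]

def convert_units(
    units: List[SyllableUnit],
    factors: Dict[str, Dict[str, float]],
) -> List[SyllableUnit]:
    """Scale pitch, duration and strength with the factors of each unit's region."""
    converted = []
    for unit, region in zip(units, assign_regions(len(units))):
        f = factors[region]
        converted.append(
            SyllableUnit(
                mean_pitch_hz=unit.mean_pitch_hz * f["pitch"],
                duration_s=unit.duration_s * f["duration"],
                strength=unit.strength * f["strength"],
            )
        )
    return converted

def gross_level(factor_set: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Build a factor table that applies one factor set to every region."""
    return {region: dict(factor_set) for region in REGIONS}
```

Calling `convert_units(units, table)` with a table that holds different factor sets for `"initial"`, `"middle"` and `"final"` corresponds to region-wise modification, while `convert_units(units, gross_level(one_set))` applies a single factor set to the whole utterance. Any factor values passed in are placeholders supplied by the caller; the paper's derived factors are not reproduced here.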

Cite this Research Publication

Dr. Govind D., Prasanna, S. R. Mahadeva, and Yegnanarayana, B., “Neutral to Target Emotion Conversion Using Source and Suprasegmental Information”, in Interspeech, Florence, Italy, 2011.
