International Conference on Speech and Signal processing (ICSSP 2014), Kollam, Kerala (2014)


The objective of the present work is to demonstrate the need for dynamically incorporating segmental durations in emotion conversion. Emotion conversion is the task of converting speech in one emotion into another. Most existing techniques achieve emotion conversion by applying static variations to the prosodic parameters according to the target emotion. The present work analyzes the segmental durations of various phonemes in a large emotional speech corpus and demonstrates the dynamic variations in the durations of various phonetic segments across emotions. The CSTR emotional speech corpus, which contains two emotions (Angry and Happy) in addition to neutral, with 400 utterances per emotion from one speaker, is used as the database for the experimental studies. The segmental durations of the phonemes are obtained statistically by classification and regression tree (CART) modeling of each emotion in the database.
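
The contrast between static and dynamic duration modification can be illustrated with a minimal sketch. It is not the authors' method: the abstract describes CART-based duration modeling, whereas this sketch substitutes plain per-phoneme averages to keep the example short. The function names and the toy durations are hypothetical and are not taken from the CSTR corpus.

```python
from collections import defaultdict
from statistics import mean


def mean_durations(segments):
    """Average duration (seconds) per phoneme from (phoneme, duration) pairs."""
    by_phone = defaultdict(list)
    for phone, duration in segments:
        by_phone[phone].append(duration)
    return {phone: mean(values) for phone, values in by_phone.items()}


def static_factor(neutral, emotional):
    """One global scaling factor: total emotional time over total neutral time."""
    return sum(d for _, d in emotional) / sum(d for _, d in neutral)


def dynamic_factors(neutral, emotional):
    """One scaling factor per phoneme, for phonemes present in both sets."""
    neutral_means = mean_durations(neutral)
    emotional_means = mean_durations(emotional)
    return {
        phone: emotional_means[phone] / neutral_means[phone]
        for phone in neutral_means
        if phone in emotional_means
    }


# Toy, invented durations (assumption): not measurements from any corpus.
neutral_segments = [("a", 0.10), ("a", 0.12), ("s", 0.08), ("t", 0.05)]
angry_segments = [("a", 0.08), ("a", 0.09), ("s", 0.08), ("t", 0.06)]

global_factor = static_factor(neutral_segments, angry_segments)
per_phone_factors = dynamic_factors(neutral_segments, angry_segments)

print(f"static factor: {global_factor:.3f}")
for phone, factor in sorted(per_phone_factors.items()):
    print(f"dynamic factor for /{phone}/: {factor:.3f}")
```

In this toy data the single static factor shortens every segment by the same proportion, while the per-phoneme factors shorten one phoneme, leave another unchanged, and lengthen a third. Variation of that kind is what a static modification cannot represent.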

T. T. Joy and Govind, D., “Analysis of Segmental Durations and Significance of Dynamic Duration Modification for Emotion Conversion”, International Conference on Speech and Signal processing (ICSSP 2014). Kollam, Kerala, 2014.