Expressive Speech Analysis for Epoch Extraction Using Zero Frequency Filtering Approach
Publication Type: Conference Paper
Source: in Proc. IEEE Tech Symposium, IIT Kharagpur, 2016
The present work addresses epoch extraction from expressive speech signals. Epochs are the glottal closure instants in voiced speech, which in turn mark the instants of maximum excitation of the vocal tract. Although many existing methods provide near-perfect epoch estimation for clean or neutral speech, their performance drops significantly for expressive speech signals. The uncontrolled and rapid pitch variations that occur in expressive speech cause this degradation. The objective of the present work is to improve epoch extraction performance for speech signals whose expressions are perceptually distinct from neutral speech, using the zero frequency filtering (ZFF) approach. To capture the rapid and uncontrolled variations in expressive speech utterances, trend removal is performed on short segments (25 ms) of the output of a cascade of three zero frequency resonators (ZFRs). The epoch estimation performance of the proposed method is compared with the conventional ZFF method, an existing refined ZFF method proposed for expressive speech, and the recently proposed zero band filtering (ZBF) approach. The effectiveness of the approach is confirmed by an improved epoch identification rate and reduced miss and false alarm rates relative to the existing methods.
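The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `zff_epochs`, the differencing step, the use of a sliding local-mean subtraction for trend removal, the number of trend-removal passes (`n_passes`), and the choice of negative-to-positive zero crossings as epoch locations are all assumptions borrowed from the conventional ZFF formulation. Only the cascade of three resonators and the 25 ms trend-removal length come from the abstract.

```python
import numpy as np

def zff_epochs(speech, fs, win_ms=25.0, n_resonators=3, n_passes=3):
    """Sketch of ZFF-style epoch extraction (hypothetical helper)."""
    s = np.asarray(speech, dtype=np.float64)
    # Difference the signal to remove any DC offset (assumed preprocessing).
    x = np.diff(s, prepend=s[0])
    # Each zero frequency resonator, y[n] = 2y[n-1] - y[n-2] + x[n],
    # is a double integrator, i.e. two cumulative sums.
    y = x
    for _ in range(2 * n_resonators):
        y = np.cumsum(y)
    # Trend removal: subtract the local mean over a ~25 ms window.
    # Several passes are needed because the integrated output grows
    # polynomially (the number of passes is an assumption).
    half = int(round(win_ms * 1e-3 * fs / 2))
    kernel = np.ones(2 * half + 1)
    counts = np.convolve(np.ones_like(y), kernel, mode="same")
    for _ in range(n_passes):
        y = y - np.convolve(y, kernel, mode="same") / counts
    # Epoch candidates: negative-to-positive zero crossings.
    epochs = np.flatnonzero((y[:-1] < 0) & (y[1:] >= 0)) + 1
    return epochs, y

# Synthetic check: a 100 Hz impulse train sampled at 8 kHz (period 80 samples).
fs = 8000
s = np.zeros(4000)
s[40::80] = 1.0
epochs, zff_signal = zff_epochs(s, fs)
```

Away from the signal edges, where the local mean is computed over a truncated window, the detected epochs for this synthetic input should be spaced one pitch period (80 samples) apart. Real expressive speech would need voiced/unvoiced handling and evaluation against reference epochs, which this sketch omits.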
Related Research Publications
- Improved Method for Epoch Estimation in Telephonic Speech Signals Using Zero Frequency Filtering
- Epoch Extraction in High Pass Filtered Speech Using Hilbert Envelope
- Epoch extraction from emotional speech
- Improved method for epoch extraction in high pass filtered speech
- Effectiveness of polarity detection for improved epoch extraction from speech