Publication Type : Conference Paper
Publisher : Springer Nature Singapore
Source : Lecture Notes in Electrical Engineering
Url : https://doi.org/10.1007/978-981-96-9967-4_12
Campus : Bengaluru
School : School of Engineering
Department : Electronics and Communication
Year : 2025
Abstract : This work presents a pipeline that first separates mixed audio and then summarizes the conversation. The proposed model combines SepFormer, ConvTasNet, and adaptive noise reduction techniques to isolate speech from two-speaker mixed audio, reduce background noise, and amplify the primary speaker’s voice. This hybrid approach gives better results than either of the two models used on its own, without a significant increase in computational cost. Once trained, the system delivers rapid, accurate audio separation and transcription. Performance is evaluated using standard metrics: Signal-to-Distortion Ratio (SDR), Signal-to-Interference Ratio (SIR), Signal-to-Artefacts Ratio (SAR), and Scale-Invariant SNR (SI-SNR). The model yields an average SDR, SIR, SAR, and SI-SNR of 24.6, 24.5, 24.5, and 21.9935 respectively, which shows its capability to improve speech clarity while maintaining efficiency.
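Illustrative note : The abstract names SI-SNR as an evaluation metric but does not state its formula. The sketch below follows the commonly used definition (remove the mean from both signals, project the estimate onto the reference, and compare the energy of the projection with the energy of the residual). It is an assumption that the paper uses this same definition; the function name `si_snr` and the `eps` parameter are hypothetical and are not taken from the paper.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB (common definition, assumed here)."""
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)

    # Remove the mean so that a constant offset does not affect the score.
    mean_est = sum(estimate) / n
    mean_ref = sum(reference) / n
    est = [x - mean_est for x in estimate]
    ref = [x - mean_ref for x in reference]

    # Project the estimate onto the reference: this is the "target" part.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]

    # Everything not explained by the reference counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise) + eps

    return 10.0 * math.log10(target_energy / noise_energy + eps)


if __name__ == "__main__":
    # Synthetic example: a sine wave plus a small deterministic disturbance.
    reference = [math.sin(0.05 * i) for i in range(400)]
    estimate = [r + 0.1 * math.cos(0.7 * i) for i, r in enumerate(reference)]
    print(f"SI-SNR: {si_snr(estimate, reference):.2f} dB")
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate by a constant leaves the score essentially unchanged, which is the property that distinguishes SI-SNR from a plain SNR.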
Cite this Research Publication : Satvik Raghav, B. M. Vikhyath, Raja Karthikeya, S. Lalitha, Multi-speaker Speech Processing in Noisy Environments: A Hybrid Model for Source Separation and Summarization, Lecture Notes in Electrical Engineering, Springer Nature Singapore, 2025, https://doi.org/10.1007/978-981-96-9967-4_12