Publication Type : Journal Article
Publisher : Springer Science and Business Media LLC
Source : Circuits, Systems, and Signal Processing
Url : https://doi.org/10.1007/s00034-025-03111-y
Campus : Coimbatore
School : School of Artificial Intelligence - Coimbatore
Department : Center for Computational Engineering and Networking (CEN)
Year : 2025
Abstract : This study presents a novel approach to identifying the predominant instrument in polyphonic music. By combining the strengths of convolutional neural networks (CNN) and involutional neural networks (INN) through an ensemble method, our approach achieves state-of-the-art performance while reducing computational complexity. Unlike traditional methods that rely on sliding-window and aggregation strategies, our approach learns to recognize individual instruments directly from variable-length polyphonic audio. The proposed ensemble model, using soft voting, combines the global frequency patterns captured by the CNN with the dynamic, localized features extracted by the INN. Evaluations on the IRMAS dataset demonstrate that our proposed ensemble CI model achieves a 3.33% and 8% improvement in micro and macro F1 scores, respectively, over the state-of-the-art Han model. Furthermore, our CNN-based model requires only 641k trainable parameters, while the involution-based model reduces complexity to just 7k parameters, compared to the 1446k parameters required by the Han model.
Cite this Research Publication : C. R. Lekshmi, Jishnu Teja Dandamudi, Dynamic Feature Learning with Involution and Convolution for Predominant Instrument Recognition in Polyphonic Music, Circuits, Systems, and Signal Processing, Springer Science and Business Media LLC, 2025, https://doi.org/10.1007/s00034-025-03111-y
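The soft-voting scheme mentioned in the abstract averages the per-class probability vectors of the two models and picks the class with the highest averaged score. A minimal sketch is shown below; the function name and the example probabilities are illustrative only, not taken from the paper.

```python
import numpy as np

def soft_vote(prob_cnn: np.ndarray, prob_inn: np.ndarray) -> np.ndarray:
    """Soft voting: average the class-probability vectors of two models."""
    return (prob_cnn + prob_inn) / 2.0

# Illustrative 3-class probability outputs from each model
p_cnn = np.array([0.6, 0.3, 0.1])   # CNN leans toward class 0
p_inn = np.array([0.2, 0.7, 0.1])   # INN leans toward class 1

p_ens = soft_vote(p_cnn, p_inn)     # averaged probabilities: [0.4, 0.5, 0.1]
predicted = int(np.argmax(p_ens))   # ensemble prediction: class 1
```

In a multi-label setting such as predominant-instrument recognition, the same averaged scores would typically be thresholded per class rather than reduced with a single argmax.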