
Computational Approaches for Classifying Antimicrobial Peptides: A Comparative Analysis of BERT, Word2Vec, One-Hot Encoding, and Physicochemical Analysis

Publication Type : Journal Article

Publisher : Elsevier BV

Source : Procedia Computer Science

Url : https://doi.org/10.1016/j.procs.2025.04.560

Keywords : Antimicrobial Peptides (AMPs), Encoding, Physical Properties, Antibiotic Resistance, BERT, Word2Vec, Peptide Classification

Campus : Bengaluru

School : School of Artificial Intelligence

Year : 2025

Abstract : The emergence of antibiotic-resistant bacterial strains is a major global problem. Antimicrobial peptide (AMP) based drugs represent an effective approach to tackle this issue. Recently, machine learning (ML) and neural network (NN) models have emerged as a valuable alternative for identifying potential AMPs by analyzing extensive datasets. Because these models require numerical input, the choice of encoding method is crucial. This work compares several encodings of peptide sequences (one-hot encoding, BERT, Word2Vec, and conventional physicochemical properties) and explores their combinations for training different ML and NN models. The results show that the combination of one-hot encoding and physicochemical properties consistently outperforms the other methods, with Random Forest achieving an accuracy of 84% and an AUC-ROC of 0.91, which indicates that this combination classifies AMPs effectively.
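
The combined encoding described in the abstract can be illustrated with a minimal sketch. It concatenates a flattened one-hot encoding with a few simple physicochemical descriptors. The padding length (`MAX_LEN`), the choice of descriptors (length, a crude net charge, mean Kyte-Doolittle hydropathy), and all function names are assumptions made for illustration; the abstract does not state which descriptors or settings the authors used.

```python
# Illustrative sketch only: descriptor choices and MAX_LEN are assumptions,
# not the configuration reported in the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MAX_LEN = 50  # hypothetical fixed length for padding/truncation

# Kyte-Doolittle hydropathy scale
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def one_hot(seq, max_len=MAX_LEN):
    """Flattened one-hot encoding, truncated or zero-padded to max_len residues."""
    vec = [0.0] * (max_len * len(AMINO_ACIDS))
    for pos, aa in enumerate(seq[:max_len]):
        idx = AMINO_ACIDS.find(aa)
        if idx >= 0:  # unknown residues are left as all-zero columns
            vec[pos * len(AMINO_ACIDS) + idx] = 1.0
    return vec


def physicochemical(seq):
    """Return [length, crude net charge (K/R minus D/E), mean hydropathy]."""
    charge = sum(seq.count(a) for a in "KR") - sum(seq.count(a) for a in "DE")
    known = [KD_HYDROPATHY[a] for a in seq if a in KD_HYDROPATHY]
    hydropathy = sum(known) / len(known) if known else 0.0
    return [float(len(seq)), float(charge), hydropathy]


def encode(seq):
    """Concatenate the one-hot vector with the physicochemical descriptors."""
    seq = seq.upper()
    return one_hot(seq) + physicochemical(seq)


if __name__ == "__main__":
    features = encode("GIGKFLHSAKKFGKAFVGEIMNS")  # arbitrary example sequence
    print(len(features))
```

Feature vectors built this way, one per peptide, could then be passed to a classifier such as a Random Forest; the model hyperparameters are likewise not given in the abstract.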

Cite this Research Publication : Advik Narendran, Anantha Hothri Inuguri, Addanki Ranga Ravindra, Hemanth Saga, Vasavi C.S., Ritesh Raj, Karthikeyan B., Computational Approaches for Classifying Antimicrobial Peptides: A Comparative Analysis of BERT, Word2Vec, One-Hot Encoding, and Physicochemical Analysis, Procedia Computer Science, Elsevier BV, 2025, https://doi.org/10.1016/j.procs.2025.04.560
