
Gender Identification of Code-Mixed Malayalam–English Data from WhatsApp

Publication Type : Conference Paper

Publisher : Springer

Source : Lecture Notes in Networks and Systems, Springer, Volume 74, p.101-109 (2019)

Url : https://www2.scopus.com/inward/record.uri?eid=2-s2.0-85067664334&doi=10.1007%2f978-981-13-7082-3_13&partnerID=40&md5=36d72318221eb66d892e19df51611870

Campus : Coimbatore

School : School of Engineering

Center : Computational Engineering and Networking

Department : Electronics and Communication

Year : 2019

Abstract : The boom in social media has been a topic of discussion among all generations of this era. It certainly has its positives, such as real-time communication and a platform for all to voice their opinions. It also has drawbacks, such as the anonymity of those communicating. Such anonymity, especially on messaging platforms such as WhatsApp, can turn out to be dangerous. This is where author profiling plays a crucial role. This paper describes the analysis of code-mixed Malayalam–English data, collected from WhatsApp, and its classification by a basic demographic attribute of the author: gender. The text has been represented both as a Term Frequency–Inverse Document Frequency (TF-IDF) matrix and as a Term Document Matrix (TDM). The classifiers used are Support Vector Machine (SVM), Naive Bayes, Logistic Regression, Decision Tree, and Random Forest. © Springer Nature Singapore Pte Ltd. 2019.
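
To make the two text representations named in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The toy messages, the whitespace tokenizer, and the helper names `tokenize`, `term_document_matrix`, and `tfidf_matrix` are all assumptions made for illustration. The abstract does not state which TF-IDF variant was used, so the sketch assumes length-normalized term frequency and idf = ln(N / df).

```python
import math
from collections import Counter

# Hypothetical toy messages (romanized Malayalam mixed with English).
# These are illustrative only and are NOT taken from the paper's dataset.
docs = [
    "njan innu office il late aayi",
    "innu match kandu super game",
    "office il meeting late aayi",
]

def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()

def term_document_matrix(docs):
    """Return (vocab, rows), where rows[i][j] is the count of term j in document i.

    Rows are documents here; a strict term-by-document layout is the transpose.
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[t] for t in vocab])
    return vocab, rows

def tfidf_matrix(rows):
    """Weight raw counts as tf * idf, with tf = count / doc length, idf = ln(N / df)."""
    n_docs = len(rows)
    n_terms = len(rows[0])
    # Document frequency: number of documents containing each term.
    df = [sum(1 for r in rows if r[j] > 0) for j in range(n_terms)]
    weighted = []
    for r in rows:
        total = sum(r)
        weighted.append(
            [(r[j] / total) * math.log(n_docs / df[j]) for j in range(n_terms)]
        )
    return weighted

vocab, tdm = term_document_matrix(docs)
tfidf = tfidf_matrix(tdm)
```

Either matrix could then be passed, row by row, as feature vectors to a classifier such as those listed in the abstract. Under this weighting, terms that occur in fewer messages receive a higher weight than terms shared across messages.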

Cite this Research Publication : V. R. Chacko, M. Kumar, A., and Dr. Soman K. P., “Gender Identification of Code-Mixed Malayalam–English Data from WhatsApp”, in Lecture Notes in Networks and Systems, 2019, vol. 74, pp. 101-109.
