Publication Type : Journal Article
Publisher : SAGE Publications
Source : Journal of Intelligent & Fuzzy Systems
Url : https://doi.org/10.3233/jifs-220448
Campus : Nagercoil
School : School of Computing
Year : 2022
Abstract : Many websites are attempting to offer a platform for users or customers to leave their reviews and comments about the products or services in their native languages. The cross-domain adaptation (CDA) analyses sentiment across domains. The sentiment lexicon falls short resulting in issues like feature mismatch, sparsity, polarity mismatch and polysemy. In this research, an augmented sentiment dictionary is developed in our native regional language (Tamil) that intends to construct the contextual links between terms in multi-domain datasets to reduce problems like polarity mismatch, feature mismatch, and polysemy. Data from the source domain and target domain both labeled and unlabeled are used in the proposed dictionary. To be more specific, the initial dictionary uses normalised pointwise mutual information (nPMI) to derive contextual weight, whereas the final dictionary uses the value of terms across all reviews to compute the accurate rank score. Here, a deep learning model called BERT is used for sentiment classification. For cross-domain adaptation, a modified multi-layer fuzzy-based convolutional neural network (M-FCNN) is deployed. This work aims to build a single dictionary using large number of vocabularies for classifying the reviews in Tamil for several target domains. This extendible dictionary enhances the accuracy of CDA greatly when compared to existing baseline techniques and easily handles a large number of terms in different domains.
Cite this Research Publication : K. Suresh Kumar, C. Helen Sulochana, A.S. Radhamani, T. Ananth Kumar, Sentiment lexicon for cross-domain adaptation with multi-domain dataset in Indian languages enhanced with BERT classification model, Journal of Intelligent & Fuzzy Systems, SAGE Publications, 2022, https://doi.org/10.3233/jifs-220448