
Paraphrase Identification in Telugu Using Machine Learning

Publication Type : Conference Paper

Publisher : Advances in Intelligent Systems and Computing

Source : Advances in Intelligent Systems and Computing, Springer Verlag, Volume 750, p.499-508 (2019)

Url : https://www.scopus.com/inward/record.uri?eid=2-s2.0-85059099918&doi=10.1007%2f978-981-13-1882-5_43&partnerID=40&md5=a9eaaa6b58fbaa5fa7a780b38e4e2367

ISBN : 9789811318818

Keywords : Artificial intelligence, Big data, Classification algorithm, Cloud computing, Corpus, Count-based methods, Data mining, Factorization, Inverse problems, Iterative methods, Learning systems, Paraphrase identifications, Representation method, Singular value decomposition, Syntactics, Text processing

Campus : Bengaluru, Coimbatore

School : School of Engineering

Center : Computational Engineering and Networking

Department : Electronics and Communication

Year : 2019

Abstract : Paraphrase identification is the task of determining whether two sentences convey similar meaning. Here, we have chosen count-based text representation methods, such as the term-document matrix and the term frequency-inverse document frequency matrix, along with the distributional representation methods of singular value decomposition and non-negative matrix factorization, applied iteratively with different word share and minimum document frequency values. With these methods, the system learns features from the representations. The learned features are then used to measure phrase-wise similarity between two sentences. The features are given to various machine learning classification algorithms, and cross-validation accuracy is obtained. The corpus for this task was created manually from different news domains. Because a parser was unavailable, only a subset of the collected data in the corpus was used for this task. This is a first attempt at paraphrase identification in the Telugu language using this approach. © 2019, Springer Nature Singapore Pte Ltd.
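
As a rough illustration of the count-based representation step mentioned in the abstract, the sketch below builds TF-IDF vectors for a few sentences and compares them with cosine similarity. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the function names, and the English example sentences are all assumptions made for illustration, and the SVD, NMF, and classifier stages are omitted.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: simple lowercase whitespace tokenization (no Telugu-specific processing).
    return sentence.lower().split()


def tfidf_vectors(sentences):
    """Return (vocabulary, list of TF-IDF vectors), one vector per sentence."""
    docs = [Counter(tokenize(s)) for s in sentences]
    n_docs = len(docs)

    # Document frequency: number of sentences in which each term occurs.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())

    # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1.0
           for term, count in df.items()}

    vocab = sorted(df)
    vectors = [[doc[term] * idf[term] for term in vocab] for doc in docs]
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

In a fuller pipeline, similarity scores like these (or reduced-dimension vectors from a factorization step) would serve as features for a classifier evaluated with cross-validation, as the abstract describes.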

Cite this Research Publication : A. D. Reddy, M. Kumar, A., Dr. Soman K. P., A.H., A., and B., J., “Paraphrase Identification in Telugu Using Machine Learning”, in Advances in Intelligent Systems and Computing, 2019, vol. 750, pp. 499-508.
