
Distributional semantic representation for text classification and information retrieval

Publication Type : Conference Paper

Publisher : CEUR Workshop Proceedings, CEUR-WS.

Source : CEUR Workshop Proceedings, CEUR-WS, Volume 1737, p.126-130 (2016)

Url : https://www.scopus.com/inward/record.uri?eid=2-s2.0-85006154188&partnerID=40&md5=b8726958f440de5f56c52733d2ff09ed

Keywords : Classification (of information), Distributional semantics, Document matrices, Factorization, Information Retrieval, Information retrieval systems, Matrix algebra, Nonnegative matrix factorization, Question classification, Semantic representation, Semantics, Test corpus, Test sets, Text classification, Text processing

Campus : Coimbatore

School : School of Engineering

Center : Computational Engineering and Networking

Department : Electronics and Communication

Year : 2016

Abstract : The objective of this experiment is to validate the performance of the distributional semantic representation of text in a classification task (Question Classification) and an Information Retrieval task. Following the distributional representation, first-level classification of the questions is performed, and tweets relevant to the given queries are retrieved. The distributional representation of text is obtained by performing Non-negative Matrix Factorization on the Document-Term Matrix of the training and test corpus. To improve the semantic representation of the text, phrases are considered along with the words. The proposed approach achieved an F1 measure of 80% and a mean average precision of 0.0377 on the Mixed Script Information Retrieval Task 1 and Task 2 test sets, respectively.
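
As a rough illustration of the pipeline the abstract describes, the sketch below builds a document-term matrix over words and phrases and factorizes it with non-negative matrix factorization. It is not the authors' implementation: treating word bigrams as "phrases", the multiplicative update rule, the rank, the iteration count, the helper names, and the toy documents are all assumptions made for illustration.

```python
import random
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return unigrams plus contiguous n-grams up to n_max (assumed stand-in for 'phrases')."""
    out = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def document_term_matrix(docs, n_max=2):
    """Build a raw-count document-term matrix (rows: documents, columns: terms)."""
    counts = [Counter(ngrams(d.lower().split(), n_max)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[float(c[t]) for t in vocab] for c in counts]
    return matrix, vocab


def transpose(m):
    return [list(r) for r in zip(*m)]


def matmul(a, b):
    bt = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH, used here only to monitor the fit."""
    approx = matmul(w, h)
    return sum((a - b) ** 2
               for rv, ra in zip(v, approx)
               for a, b in zip(rv, ra)) ** 0.5


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) as W (docs x rank) times H (rank x terms), all non-negative.

    Uses Lee-Seung multiplicative updates; this choice is an assumption,
    since the abstract does not say which NMF solver was used.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy documents, invented for this example only.
docs = [
    "who wrote the first matrix factorization paper",
    "when was matrix factorization introduced",
    "retrieve tweets about the cricket match",
    "tweets about the cricket world cup",
]
v, vocab = document_term_matrix(docs)
w, h = nmf(v, rank=2)
print(len(w), "documents x", len(w[0]), "latent dimensions")
print("reconstruction error:", frobenius_error(v, w, h))
```

Each row of `w` is a low-dimensional representation of one document that could be passed to a classifier or compared against a query representation; the abstract does not state how the paper performs either step.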

Cite this Research Publication : H. B. Barathi Ganesh, Dr. M. Anand Kumar, and Dr. Soman K. P., “Distributional semantic representation for text classification and information retrieval”, in CEUR Workshop Proceedings, 2016, vol. 1737, pp. 126-130.
