The objective of this experiment is to validate the performance of a distributional semantic representation of text in a classification task (Question Classification) and an Information Retrieval task. Based on this representation, first-level classification of the questions is performed, and tweets relevant to the given queries are retrieved. The distributional representation is obtained by applying Non-negative Matrix Factorization to the Document-Term Matrix of the training and test corpus. To improve the semantic representation of the text, phrases are considered along with words. The proposed approach achieved an F1 measure of 80% and a mean average precision of 0.0377 on the respective test sets of Mixed Script Information Retrieval Task 1 and Task 2.
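The factorization step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the NMF algorithm, the rank, the weighting scheme, or how phrases are extracted. The sketch assumes raw counts, word bigrams as a stand-in for phrases, a rank of 2, and the standard multiplicative-update rule. The example sentences and the helper names (`build_dtm`, `nmf`, `frob_error`) are hypothetical.

```python
import random

def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_dtm(docs):
    """Count matrix over words plus bigrams (assumed stand-in for 'phrases')."""
    features = []
    for text in docs:
        tokens = text.lower().split()
        features.append(tokens + ngrams(tokens, 2))
    vocab = sorted({f for feats in features for f in feats})
    index = {f: j for j, f in enumerate(vocab)}
    dtm = [[0.0] * len(vocab) for _ in docs]
    for i, feats in enumerate(features):
        for f in feats:
            dtm[i][index[f]] += 1.0
    return dtm, vocab

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factorize V (docs x terms) into non-negative W (docs x rank), H (rank x terms)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

# Hypothetical mini-corpus; training and test texts are factorized together.
docs = [
    "what is the capital of india",
    "what is the distance to delhi",
    "who won the cricket match",
    "who won the football match today",
]
dtm, vocab = build_dtm(docs)
W, H = nmf(dtm, rank=2)
```

Each row of `W` is then a dense, low-dimensional vector for one document, which could be passed to a classifier or compared against a query vector for retrieval. Both of those downstream steps are outside this sketch.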
Conference: 2016 Forum for Information Retrieval Evaluation (FIRE 2016); Conference date: 7 to 10 December 2016; Conference code: 125007
H. B. Barathi Ganesh, Dr. M. Anand Kumar, and Dr. Soman K. P., “Distributional semantic representation for text classification and information retrieval”, in CEUR Workshop Proceedings, 2016, vol. 1737, pp. 126-130.