Intrinsic Evaluation for English–Tamil Bilingual Word Embeddings: Proceedings of ISTA 2018

Publication Type : Conference Paper

Publisher : Advances in Intelligent Systems and Computing

Source : Advances in Intelligent Systems and Computing (2020)

ISBN : 9789811360947

Campus : Coimbatore

School : School of Engineering

Center : Computational Engineering and Networking

Department : Electronics and Communication

Year : 2020

Abstract : Despite the growth of bilingual word embeddings, no work so far has directly evaluated them for the English–Tamil language pair. In this paper, we present a dataset and an evaluation paradigm for the English–Tamil language pair. The dataset contains words that cover a range of concepts occurring in natural language. It is scored on similarity rather than association or relatedness; hence, word pairs that are associated but not literally similar receive a low rating. The measures are quantified further to ensure consistency in the dataset, mimicking the cognitive phenomena. As a result, the dataset can be used by non-native speakers with minimal effort. We also present some inferences and insights into the semantics captured by word vectors and human cognition.

Cite this Research Publication : S. Jp, Menon, V., Sankaravelayuthan, R., Dr. Soman K. P., and Kumar, M., “Intrinsic Evaluation for English–Tamil Bilingual Word Embeddings: Proceedings of ISTA 2018”, in Advances in Intelligent Systems and Computing, 2020.
