This paper presents an overview of the shared task on “Detecting Paraphrases in Indian Languages” (DPIL) conducted at FIRE 2016. Given a pair of sentences in the same language, participants were asked to determine whether the sentences are semantically equivalent. The shared task covered four Indian languages: Tamil, Malayalam, Hindi, and Punjabi. It comprised two subtasks. Subtask 1 was to classify a given sentence pair as paraphrase or not paraphrase. Subtask 2 was to classify the pair as paraphrase, semi-paraphrase, or not paraphrase. The dataset created for the shared task has been made available online, and it is the first open-source paraphrase detection corpus for Indian languages. In this overview paper, we describe both subtasks, the datasets, the evaluation methods, and the submitted systems, along with the performance of their runs. © Springer International Publishing AG 2018.
Cited by 0; Conference: International Workshop on Text Processing, FIRE 2016; Conference date: 7 December 2016 through 10 December 2016; Conference code: 210099
A. M. Kumar, Singh, S., Kavirajan, B., and Soman, K. P., “Shared Task on Detecting Paraphrases in Indian Languages (DPIL): An Overview”, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10478 LNCS, pp. 128-140, 2018.