Publication Type:

Conference Paper

Source:

Proceedings of the Forum for Information Retrieval Evaluation, ACM, New York, NY, USA (2015)

ISBN:

9781450337557

URL:

http://doi.acm.org/10.1145/2824864.2824882

Keywords:

Conditional Random Fields (CRF), Named Entity Recognition (NER), Natural Language Processing (NLP), Support Vector Machine (SVM)

Abstract:

This paper implements Named Entity Recognition (NER) for four languages: English, Tamil, Hindi and Malayalam. The results of this work were submitted to the research evaluation workshop Forum for Information Retrieval Evaluation (FIRE 2014). The system detects three levels of named entity tags, referred to as nested named entities. This is a multi-label problem, solved here using the chain classifier method. Conditional Random Fields (CRF) and Support Vector Machines (SVM) are used to implement the NER system. For FIRE 2014, we developed an English NER system using CRF, while the NER systems for Tamil, Hindi and Malayalam were based on SVM. FIRE estimated the average precision across all four languages as 41.93 for the outermost level and 33.25 for the inner level. To improve performance on the Indian languages, we implemented a CRF-based NER system for the same corpus in Tamil, Hindi and Malayalam. The average precision for these languages is 42.87 for the outer level and 36.31 for the inner level. The overall performance of the NER system improved by 2.24% for the outer level and 9.20% for the inner level.
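The improvement percentages quoted in the abstract appear to be relative changes in average precision rather than absolute differences in points. The short Python check below is only a sketch of that arithmetic; the function name `relative_improvement` is a hypothetical helper introduced here, not something taken from the paper.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Percentage change of `improved` relative to `baseline`."""
    return (improved - baseline) / baseline * 100.0


# Average precision figures quoted in the abstract.
# Baseline: FIRE 2014 evaluation; improved: CRF-based systems.
outer = relative_improvement(41.93, 42.87)
inner = relative_improvement(33.25, 36.31)

print(f"outer level: {outer:.2f}%")
print(f"inner level: {inner:.2f}%")
```

Rounded to two decimal places, these values agree with the 2.24% and 9.20% reported for the outer and inner levels.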

Cite this Research Publication

N. Abinaya, John, N., Ganesh, B. H. B., Dr. M. Anand Kumar, and Soman, K. P., “AMRITA_CEN@FIRE-2014: Named Entity Recognition for Indian Languages Using Rich Features”, in Proceedings of the Forum for Information Retrieval Evaluation, New York, NY, USA, 2015.
