Experimental analysis of malayalam pos tagger using epic framework in scala
Publication Type:Journal Article
Source:ARPN Journal of Engineering and Applied Sciences, Asian Research Publishing Network, Volume 11, Number 13, p.8017-8023 (2016)
In Natural Language Processing (NLP), one of the well-studiedproblems under constant exploration is part-ofspeech tagging or POS tagging or grammatical tagging. The task is to assign labels or syntactic categories such as noun, verb, adjective, adverb, preposition etc. to the words in a sentence or in an un-annotated corpus. This paper presents a simple machine learning based experimental study for POS tagging using a new structured prediction framework known as EPIC, developed in scale programming language. This paper is first of its kind to perform POS tagging in Indian Language using EPIC framework. In this framework, the corpus contains labelled Malayalam sentences in domains like health, tourism and general (news, stories). The EPIC framework uses conditional random field (CRF) for building tagged models. The framework provides several parameters to adjust and arrive at improved accuracy and thereby a better POS tagger model. The overall accuracy were calculated separately for each domains and obtained a maximum accuracy of 85.48%, 85.39%, and 87.35% for small tagged data in health, tourism and general domain.
cited By 0