Recognizing Named Entities in Agriculture Documents using LDA based Topic Modelling Techniques

Publication Type : Conference Proceedings

Publisher : Elsevier Procedia Computer Science

Source : Elsevier Procedia Computer Science

Url :

Campus : Amritapuri, Bengaluru

School : School of Computing

Center : Computational Linguistics and Indic Studies

Year : 2020

Abstract : Named Entity Recognition (NER) is one of the fundamental processes in Natural Language Processing applications. In this paper, we propose an algorithm for Agriculture Named Entity Recognition using Topic Modelling techniques (the AERTM algorithm). In the agriculture domain, we have identified Names of Crops, Soil Types, Names of Pathogens, Crop Diseases and Fertilizers as the key entities. Our work presents a hybrid approach using the agriculture vocabulary AGROVOC and the AERTM algorithm. We used AGROVOC to identify crop names, but it failed to identify Soil Types, Crop Diseases and Fertilizers. Hence, for those entities we propose a Latent Dirichlet Allocation (LDA) based topic modelling algorithm. These named entities can be used to create a knowledge base, which can in turn be used mainly in Relation Extraction systems, forums supported by various Government distinguished repositories, etc. Because no benchmark agriculture data is available, we tested our model on 3000 sentences extracted from reputed agriculture sites. Human evaluation of the method confirms that our approach gives an accuracy of 80%.
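
Illustrative sketch : The two-stage idea in the abstract (vocabulary lookup for crop names, LDA topic modelling for the remaining entity types) can be illustrated roughly as follows. This is not the AERTM algorithm itself, whose details are not given in the abstract. The small `CROP_GAZETTEER` set is a hypothetical stand-in for AGROVOC, the function names `tag_crops` and `lda_gibbs` are invented for illustration, and collapsed Gibbs sampling is simply one common way to fit LDA, assumed here rather than taken from the paper. How topic words are mapped to entity types is also not stated, so the sketch stops at per-topic word counts.

```python
import random
from collections import Counter

# Hypothetical stand-in for AGROVOC crop terms (the real vocabulary is far larger).
CROP_GAZETTEER = {"rice", "wheat", "maize", "cotton"}


def tag_crops(tokens, gazetteer=CROP_GAZETTEER):
    """Stage 1: dictionary lookup, tagging tokens found in the vocabulary."""
    return [(tok, "CROP" if tok.lower() in gazetteer else "O") for tok in tokens]


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Stage 2: fit LDA by collapsed Gibbs sampling over tokenized documents.

    Returns one Counter of word counts per topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # n(d, k)
    topic_word = [Counter() for _ in range(num_topics)]   # n(k, w)
    topic_total = [0] * num_topics                        # n(k)
    assignments = []

    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n(d,t) + alpha) * (n(t,w) + beta) / (n(t) + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word


# Toy sentences invented for this example; they are not the paper's data.
sentences = [
    "rice grows well in clay soil",
    "sandy soil and loamy soil drain quickly",
    "urea and potash are common fertilizers",
    "apply urea fertilizers before sowing wheat",
]
docs = [s.split() for s in sentences]
print(tag_crops(docs[0]))
for k, counts in enumerate(lda_gibbs(docs, num_topics=2)):
    print(k, [w for w, _ in counts.most_common(3)])
```

On a corpus this small the topics are not meaningful. The point is only the shape of the pipeline: a lookup pass for entities the vocabulary covers, followed by a topic model whose high-probability words become candidates for the entity types the vocabulary misses.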

Cite this Research Publication : Veena Gangadharan, Deepa Gupta, Recognizing Named Entities in Agriculture Documents using LDA based Topic Modelling Techniques, Procedia Computer Science, Volume 171, 2020, Pages 1337-1345, ISSN 1877-0509,
