Publication Type : Journal Article
Source : Expert Systems with Applications (IF 8.665). Page 120440. ISSN 0957-4174
Url : https://www.sciencedirect.com/science/article/abs/pii/S0957417423009429
Campus : Amritapuri
School : School of Computing
Center : Computational Linguistics and Indic Studies
Year : 2023
Abstract : In this work, we propose a novel weighted distributional semantic model for unsupervised Named Entity Recognition (NER) in domain-specific texts, focusing on the agricultural domain. Developing accurate agriculture NER models requires overcoming several challenges, including the lack of annotated data, domain-specific vocabulary, entity ambiguity, and contextual variation. The proposed approach is completely unsupervised and utilizes an extended BERT model with LDA topic modeling for NER. The proposed Agricultural Named Entity Recognition (AGRONER) model focuses on identifying six major entities: disease, soil, pathogen, pesticide, crops, and place. The first four entities are recognized using the proposed algorithm, while we utilize the AGROVOC dictionary for crops and Geocoding APIs for place entities. Due to the absence of a benchmark dataset in the agriculture domain, we created a corpus of 30,000 sentences extracted from recognized agriculture sites. For the evaluation, we used a test corpus of 700 sentences that include 1690 entity names. The labeled entities were then manually checked to evaluate the prediction accuracy. The proposed approach achieves a macro-average F-measure of 80.43%, which is promising for unsupervised domain-specific entity labeling. We performed ablation studies, in which the proposed model exhibited relative improvements in F-measure of 31.56% and 26.11% when compared to the BERT without LDA and extended BERT without LDA models, respectively. Experimental results show the efficacy of the proposed approach in labeling named entities in an unsupervised set-up for the agricultural domain. Further, the approach can be easily extended to recognize more domain-specific entities.
Cite this Research Publication : Veena G, Vani Kanjirangat, Deepa Gupta (2023), “AGRONER: An Unsupervised Agriculture Named Entity Recognition using Weighted Distributional Semantic Model”, published in Expert Systems with Applications (IF 8.665). Page 120440. ISSN 0957-4174