
Word Segmentation and Sandhi Resolution on Ayurveda Classical Scriptures

Publication Type : Conference Paper

Publisher : IEEE

Source : 2023 International Conference on Inventive Computation Technologies (ICICT)

Url : https://doi.org/10.1109/icict57646.2023.10134024

Campus : Amritapuri

School : School of Computing

Department : Computer Science and Applications

Year : 2023

Abstract : Sanskrit is one of the oldest Indo-Aryan languages and has an abundant body of literature. One of the main sources of Sanskrit texts is the literature of Ayurveda, the ancient Indian medical system. However, proper computational systems for processing this language are scarce. A very challenging part of processing Sanskrit is Sandhi resolution, the splitting of phonetically merged words. This study focuses on the complex task of joint compound splitting and Sandhi resolution in Sanskrit, specific to Ayurveda texts. The paper also surveys and compares the existing research on Sandhi resolution in Sanskrit. In this work, an Ayurvedic dataset is compiled from some popular Ayurvedic texts, and the performance of the state-of-the-art word segmentation model is analyzed on this dataset.

Cite this Research Publication : Amrita Varshini E R, Jayashree Nair, Word Segmentation and Sandhi Resolution on Ayurveda Classical Scriptures, 2023 International Conference on Inventive Computation Technologies (ICICT), IEEE, 2023, https://doi.org/10.1109/icict57646.2023.10134024
