An Optical Character Recognition Technique for Devanagari Script Using Convolutional Neural Network and Unicode Encoding

Publication Type : Conference Paper

Publisher : Springer Nature Singapore

Source : Lecture Notes in Networks and Systems

Url : https://doi.org/10.1007/978-981-33-4305-4_14

Campus : Amritapuri

School : School of Computing

Department : Computer Science and Applications

Year : 2021

Abstract : This paper describes an optical character recognition technique that converts scanned images of Sanskrit text written in Devanagari into digital documents. The segmentation mechanism, adapted from existing literature, identifies and separates the upper and lower modifiers of a character and also recognizes fused Devanagari letters. The segmented characters are fed to a convolutional neural network classifier trained on a dataset of about 1.2 lakh (120,000) images belonging to 85 classes for the core part of a character. Each character from the segmentation phase is classified and mapped to its Unicode representation, and these Unicode values are combined to reconstruct the word. By keeping track of the spaces between words and lines, the document can be reconstructed in an editable format.
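The mapping from predicted classes to Unicode and the reconstruction of words and lines could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the class labels (`"ra"`, `"aa"`, and so on), the two lookup tables, and the functions `build_word` and `build_document` are hypothetical names, and only a handful of the 85 classes mentioned in the abstract are shown. The code points themselves are standard Devanagari Unicode values.

```python
# Hypothetical class-label -> Unicode tables. The paper's actual label set
# is not given in the abstract; these few entries are for illustration only.
CORE_TO_UNICODE = {
    "ka": "\u0915",  # DEVANAGARI LETTER KA
    "ma": "\u092E",  # DEVANAGARI LETTER MA
    "ra": "\u0930",  # DEVANAGARI LETTER RA
    "la": "\u0932",  # DEVANAGARI LETTER LA
}
MODIFIER_TO_UNICODE = {
    "aa": "\u093E",        # DEVANAGARI VOWEL SIGN AA
    "i": "\u093F",         # DEVANAGARI VOWEL SIGN I
    "e": "\u0947",         # DEVANAGARI VOWEL SIGN E (upper modifier)
    "anusvara": "\u0902",  # DEVANAGARI SIGN ANUSVARA (upper modifier)
}


def build_word(segments):
    """Join one word from (core_label, [modifier_labels]) pairs.

    Each pair stands for one segmented character: the classifier's label
    for the core part plus the labels of any attached modifiers (assumed
    to be already in logical order).
    """
    parts = []
    for core, modifiers in segments:
        parts.append(CORE_TO_UNICODE[core])
        parts.extend(MODIFIER_TO_UNICODE[m] for m in modifiers)
    return "".join(parts)


def build_document(lines):
    """Rebuild text from a list of lines, each a list of words."""
    return "\n".join(
        " ".join(build_word(word) for word in line) for line in lines
    )


# Usage: two lines with one word each.
document = build_document([
    [[("ra", ["aa"]), ("ma", [])]],
    [[("ka", []), ("ma", []), ("la", [])]],
])
print(document)
```

One design point worth noting: Unicode stores dependent vowel signs after their consonant in logical order, including the sign I (U+093F), which is rendered to the left of the consonant. A segmentation stage that works in visual order would therefore need to reorder such modifiers before joining the code points.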

Cite this Research Publication : Vamsi Krishna Kikkuri, Pavan Vemuri, Srikar Talagani, Yashwanth Thota, Jayashree Nair, An Optical Character Recognition Technique for Devanagari Script Using Convolutional Neural Network and Unicode Encoding, Lecture Notes in Networks and Systems, Springer Nature Singapore, 2021, https://doi.org/10.1007/978-981-33-4305-4_14
