An Optical Character Recognition (OCR) system aims to convert an optically scanned text image into machine-editable text. Multiple approaches to pre-processing and segmentation exist for various scripts. However, only a restricted set of combinations of these approaches has been tried on Devanagari script. This paper presents a study that explores an alternative, efficient strategy for pre-processing and segmentation in OCR for Devanagari script. The efficiency of the proposed alternative is evaluated on documents with varying degrees of noise severity and border artifacts. The experimental results confirm that the proposed approach is superior to other conventional methodologies for implementing OCR systems for Devanagari script. The paper also describes in detail the conventional pre-processing involved in the initial stage of OCR, including noise removal techniques, along with other conventional approaches to segmentation. The proposed alternative has been carried through to the character and top character segmentation levels.
C. A. O. Yonghui, Liyun, X., Zhiyi, F., Hongyu, S., ZHONG, S. H. A. N. G. Q. I. N., Dr. Deepa Gupta, NAIR, L. E. E. M. A. M. A. D. H. U., Duraisamy, G., Atan, R., Naeimizaghiani, M., “Study of SPI framework for CMMI continuous model based on QFD”, Journal of Theoretical and Applied Information Technology, vol. 52, 2013.