<p>Pre-processing is the stage of document image analysis that varies most from one type of document image to another. Some categories of document images require more intensive pre-processing than others; pre-printed form images are one such category. Pre-processing these documents differs from pre-processing images that contain only simple text and no graphical components. This paper proposes a generic pre-processing algorithm adaptable to pre-printed application form images. The work specifically addresses the detection and removal of scratched-out words in the text, since such elements can be interpreted neither by humans nor by machines. To detect scratched-out text blocks, the algorithm uses features such as the Euler number, the number of connected components, and the area covered by holes within a text block. The algorithm yields reasonably good results, with an overall efficacy of around 96.5%. © 2005 - 2016 JATIT & LLS. All rights reserved.</p>
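<p>The three features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy bitmaps, and the connectivity convention (8-connected ink, 4-connected background) are all assumptions made here for illustration, and the abstract does not say how the paper combines the features into a decision. The Euler number is taken as the number of connected components minus the number of holes.</p>

```python
from collections import deque

# Assumed connectivity convention: 8-connected foreground, 4-connected background.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def label_regions(grid, target, neighbours):
    """Flood-fill every region of `target` pixels; return a list of pixel lists."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target or seen[r][c]:
                continue
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in neighbours:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def block_features(grid):
    """Compute component count, hole count, hole area and Euler number
    for a binary text block (1 = ink, 0 = background)."""
    rows, cols = len(grid), len(grid[0])
    components = label_regions(grid, 1, N8)
    background = label_regions(grid, 0, N4)
    # A hole is a background region that never touches the block border.
    holes = [
        region for region in background
        if not any(y in (0, rows - 1) or x in (0, cols - 1) for y, x in region)
    ]
    return {
        "components": len(components),
        "holes": len(holes),
        "hole_area": sum(len(h) for h in holes),
        "euler_number": len(components) - len(holes),
    }


# Hypothetical toy blocks: a clean ring and the same ring with a strike-through.
clean = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
scratched = [row[:] for row in clean]
scratched[3] = [1, 1, 1, 1, 1, 1, 1]  # horizontal stroke through the ring

print(block_features(clean))
print(block_features(scratched))
```

<p>In this toy case the strike-through splits one hole into two, so the hole count rises and the Euler number drops. Real scratched words may behave differently, and any decision rule built on these values would need to be derived from data.</p>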
N. Shobha Rani, Vasudev, T., Vineeth, P., and Ajith, D., “An unsupervised classification technique for recognition of scratched and non-scratched words in pre-printed documents”, Journal of Theoretical and Applied Information Technology, vol. 86, pp. 223-231, 2016.