Robust text extraction for automated processing of multi-lingual personal identity documents

Publication Type : Journal Article

Publisher : International Journal of Engineering and Technology

Source : International Journal of Engineering and Technology, Volume 8, Issue 2, Pages 1086-1094, 2016.

Url : https://www.scopus.com/inward/record.uri?eid=2-s2.0-84971554443&partnerID=40&md5=3b0e8922990ba91f41d1f9f99785adc7

Campus : Mysuru

School : School of Arts and Sciences

Department : Computer Science

Year : 2016

Abstract : Text extraction is the task of separating textual content from a non-textual background, such as an image. It plays an important role in recovering valuable information from images. Variations in text size, font, orientation, alignment, and contrast make the task challenging. Existing text extraction methods focus on certain regions of interest and address characteristics such as noise, blur, distortion, and variations in font, all of which make text extraction difficult. This paper proposes a technique to extract textual characters from scanned images of personal identity documents. Current procedures track user records manually, which is inefficient and demands considerable time and human resources. The proposed methodology digitizes personal identity documents and eliminates a large portion of the manual work involved in existing data entry and verification procedures. The proposed method has been tested extensively on large datasets of varying sizes and image qualities. The results indicate high accuracy in extracting important textual features from the document images.
