Text extraction is a crucial stage in analyzing journal papers. Journal papers are generally distributed in PDF format, which is semi-structured data. They are organized into sections such as Introduction, Methodology, Experimental Setup, and Result and Analysis, so that readers can easily access the information in whichever section interests them. The main purpose of section extraction is to find a representative subset of the data that contains the information of the entire set. Approaches to extracting sections from research papers include statistical methods, NLP, and machine learning. In this paper we present a review of various techniques for extracting information from PDF documents. © 2017 IEEE.
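As an illustration of the simplest kind of section extraction, the sketch below splits text that has already been extracted from a PDF into sections by matching heading lines. It is a minimal rule-based example and not a method taken from the reviewed paper. The heading list `KNOWN_HEADINGS`, the function name `split_sections`, and the `front_matter` key are assumptions chosen for illustration, and real papers will need a broader heading vocabulary.

```python
import re

# Hypothetical heading vocabulary (an assumption); real papers name sections in many ways.
KNOWN_HEADINGS = (
    "abstract", "introduction", "related work", "methodology",
    "experimental setup", "result and analysis", "conclusion", "references",
)

# A heading line: optional Roman or Arabic numbering, then a known heading, alone on the line.
HEADING_RE = re.compile(
    r"^\s*(?:[IVXLC]+\.|\d+\.?)?\s*("
    + "|".join(re.escape(h) for h in KNOWN_HEADINGS)
    + r")\s*$",
    re.IGNORECASE,
)


def split_sections(text):
    """Map each recognized heading (lowercased) to the text that follows it."""
    sections = {}
    current = "front_matter"  # text before the first recognized heading
    buffer = []
    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            # Close the previous section and start a new one.
            sections[current] = "\n".join(buffer).strip()
            current = match.group(1).lower()
            buffer = []
        else:
            buffer.append(line)
    sections[current] = "\n".join(buffer).strip()
    return sections
```

A rule-based splitter like this only works when headings sit on their own lines and use expected names, which is one reason statistical and machine learning approaches are also used for this task.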
Conference: 2017 IEEE International Conference on Innovative Mechanisms for Industry Applications, ICIMIA 2017; Conference Date: 21 February 2017 through 23 February 2017; Conference Code: 129110
K. Jayaram and Sangeeta, K., “A review: Information extraction techniques from research papers”, in IEEE International Conference on Innovative Mechanisms for Industry Applications, ICIMIA 2017 - Proceedings, 2017, pp. 56-59.