Publication Type:

Conference Paper

Source:

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) (2017)

URL:

http://ieeexplore.ieee.org/document/8126051/

Keywords:

Crawlers, Focused Web crawler, Google, Large scale integration, Page rank, Search engines, Semantic similarity, Semantics, Uniform resource locators, Vertical Search Engines, Web pages

Abstract:

The main goal of a focused web crawler is to retrieve as many relevant pages as possible. However, most crawlers use the PageRank algorithm to order the pages in the crawl frontier. Because PageRank suffers from the “rich get richer” phenomenon, focused crawlers often fail to retrieve hidden relevant pages. This paper presents a novel approach for retrieving hidden relevant pages by combining rank and semantic similarity information. The model is validated by crawling the real web on different topics, and the results are promising.
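
The abstract does not say how rank and semantic similarity are combined, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a linear blend controlled by a hypothetical weight `alpha`, a rank value already normalized to [0, 1], and a simple bag-of-words cosine similarity standing in for whatever semantic measure the paper uses. The names `Frontier`, `combined_score`, and `cosine_similarity` are invented for this example.

```python
import heapq
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine similarity (a stand-in for a real semantic measure)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def combined_score(rank, similarity, alpha=0.5):
    """Assumed linear blend: alpha weights rank, (1 - alpha) weights similarity."""
    return alpha * rank + (1 - alpha) * similarity


class Frontier:
    """Crawl frontier ordered by the blended score, highest first."""

    def __init__(self, topic, alpha=0.5):
        self.topic = topic
        self.alpha = alpha
        self._heap = []

    def push(self, url, rank, context_text):
        # rank is assumed to be normalized to [0, 1] by the caller.
        similarity = cosine_similarity(self.topic, context_text)
        score = combined_score(rank, similarity, self.alpha)
        # heapq is a min-heap, so store the negated score.
        heapq.heappush(self._heap, (-score, url))

    def pop(self):
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score


frontier = Frontier(topic="solar energy panels")
frontier.push("http://popular.example/news", 0.9, "celebrity news gossip")
frontier.push("http://hidden.example/solar", 0.1, "solar energy panels installation")
first_url, first_score = frontier.pop()
print(first_url)
```

With these made-up inputs, the low-rank but on-topic page is dequeued before the high-rank off-topic one, which is the kind of behavior a rank-only ordering would miss.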

Cite this Research Publication

K. Pavani and Dr. Sajeev G. P., “A Novel Web Crawling Method for Vertical Search Engines”, in 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2017.
