
Link-Based Clustering Algorithm for Clustering Web Documents

Publication Type : Journal Article

Publisher : Journal of Testing and Evaluation

Source : Journal of Testing and Evaluation, Volume 47, Issue 6 (2019)

Campus : Amritapuri

School : Department of Computer Science and Engineering, School of Engineering

Center : Computer Vision and Robotics

Department : Computer Science

Year : 2019

Abstract : Clustering web documents requires feeding a large number of words into clustering algorithms such as K-Means, Cosine Similarity, Latent Dirichlet Allocation, and so on. As a result, the clustering process takes longer as the number of words in each document increases. In many web documents, web links are available along with the content, and the text of these links may carry a great deal of information useful for clustering. In our work, we show that using the web link text alone gives better clustering efficiency than considering the whole document text. We implemented our algorithm on two benchmark datasets, and the results show that it achieves higher clustering efficiency than existing methods.
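
The idea of using link text in place of the whole document can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which this page does not describe. It only shows, using the standard library, how link text might be separated from the full text and compared across documents. The class `LinkTextExtractor`, the helper functions, the cosine comparison, and the three sample documents are all hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class LinkTextExtractor(HTMLParser):
    """Collects text found inside <a> elements and, separately, all text."""

    def __init__(self):
        super().__init__()
        self._depth = 0        # greater than 0 while inside an <a> element
        self.link_text = []    # text fragments that appear inside links
        self.all_text = []     # every text fragment in the document

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        self.all_text.append(data)
        if self._depth > 0:
            self.link_text.append(data)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def link_and_full_tokens(html):
    """Return (tokens from link text only, tokens from the whole document)."""
    parser = LinkTextExtractor()
    parser.feed(html)
    parser.close()
    return tokenize(" ".join(parser.link_text)), tokenize(" ".join(parser.all_text))


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words token lists."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical sample documents (not taken from the paper's datasets).
DOC_SPORT_1 = ('<p>Long article body about last night.</p>'
               '<a href="/f">football scores</a> <a href="/l">league table</a>')
DOC_SPORT_2 = ('<p>Another long piece of text.</p>'
               '<a href="/f">football fixtures</a> <a href="/l">league news</a>')
DOC_COOKING = ('<p>Yet more body text here.</p>'
               '<a href="/r">pasta recipes</a> <a href="/b">baking tips</a>')

links_1, full_1 = link_and_full_tokens(DOC_SPORT_1)
links_2, _ = link_and_full_tokens(DOC_SPORT_2)
links_3, _ = link_and_full_tokens(DOC_COOKING)

print(len(links_1), "link tokens versus", len(full_1), "tokens in total")
print("sport vs sport:  ", cosine(links_1, links_2))
print("sport vs cooking:", cosine(links_1, links_3))
```

In this toy setup the link-only representation is shorter than the full text, and the two sports pages are closer to each other than to the cooking page. Feeding such link-text vectors into a clustering algorithm like K-Means would be the next step, under the same assumptions.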

Cite this Research Publication : P. Ashokkumar and Dr. Don S., “Link-Based Clustering Algorithm for Clustering Web Documents”, Journal of Testing and Evaluation, vol. 47, no. 6, 2019.
