Publication Type : Conference Paper
Publisher : IEEE
Source : 2019 International Conference on Intelligent Computing and Control Systems (ICCS), 2019, pp. 332-336
Url : https://ieeexplore.ieee.org/document/9065630
Campus : Amritapuri
School : School of Computing
Center : Computational Linguistics and Indic Studies
Year : 2019
Abstract : Modelling multiple documents for different applications is a major field of research due to the tremendous growth in Web data. Finding document similarity requires clustering to determine the grouping of unlabelled data. Graph models can capture the structural information in texts. They organize high-dimensional data in such a way that the user can effortlessly access the desired information. In this paper, we use a hypergraph, built with the help of association rule mining, to model a collection of text documents and find the similarity between them using a hypergraph partitioning algorithm. We use the FP-Growth algorithm, a recursive elimination scheme, to find the association relationships. We then use a spectral clustering algorithm, which finds similar documents using eigenvalues and eigenvectors computed from the matrices. Experiments show that this algorithm gives better clusters than other methods, which commonly take higher eigenvectors.
Cite this Research Publication : N. Ramakrishnan, M. Nair J., D. Jayaprakash, H. Anantha Krishnan and S. Rani S., "Hyper graph based clustering for document similarity using FP growth algorithm," 2019 International Conference on Intelligent Computing and Control Systems (ICCS), 2019, pp. 332-336, doi: 10.1109/ICCS45141.2019.9065630