Publication Type:

Conference Paper

Source:

Proceedings - 2014 International Conference on Data Science and Engineering, ICDSE 2014, Institute of Electrical and Electronics Engineers Inc., Cochin University of Science and Technology, p.118-123 (2014)

ISBN:

9781479968701

URL:

http://www.scopus.com/inward/record.url?eid=2-s2.0-84936804502&partnerID=40&md5=3f764f3792acc3d8ad19b26107261b85

Keywords:

Algorithms, Clustering algorithms, Concept mining, Concept-based, Document similarity, Engineering research, Graph model, Probabilistic network, Semantics, Sentence level

Abstract:

A lot of research has been done in the area of concept mining and document similarity in the past few years, but all of this work has been based on the statistical analysis of keywords. The major challenge in this area is preserving the semantics of terms and phrases. Our paper proposes a graph model to represent concepts at the sentence level. Each concept follows a triplet representation. A modified DBSCAN algorithm is used to cluster the extracted concepts. The clusters form a belief network, or probabilistic network. We use this network to extract the most probable concepts in the document. In this paper we also propose a new algorithm for document similarity. © 2014 IEEE.

Notes:

Cited by 0; Conference: 2014 International Conference on Data Science and Engineering, ICDSE 2014; Conference Date: 26 August 2014 through 28 August 2014; Conference Code: 112595

Cite this Research Publication

G. Veena and Lekha, N. K., “A concept based clustering model for document similarity”, in Proceedings - 2014 International Conference on Data Science and Engineering, ICDSE 2014, Cochin University of Science and Technology, 2014, pp. 118-123.