Publication Type : Conference Paper
Publisher : International Conference on Data Science and Engineering, ICDSE 2014
Source : International Conference on Data Science and Engineering, ICDSE 2014, Institute of Electrical and Electronics Engineers Inc., p.18-23 (2014)
ISBN : 9781479968701
Keywords : cluster analysis, Cluster computing, Clustering algorithms, Clustering approach, Data mining, decision making, Distributed computer systems, Hadoop, High dimensional data, Hive, Map-reduce, Multiprocessing systems, Parallel processing systems, Pre-processing step, Projected clustering, Time and space complexity
Campus : Amritapuri
School : Department of Computer Science and Engineering, School of Engineering
Center : AI (Artificial Intelligence) and Distributed Systems
Department : Computer Science
Year : 2014
Abstract : Clustering high dimensional data is a major challenge in data mining due to the inherent complexity and sparsity of the data. Projected clustering is one of the clustering approaches that determine the clusters in the subspaces of such high dimensional data. However, projected clustering within a DBMS is computationally expensive in both time and space when the volume of records reaches terabytes, petabytes, or more. This expense becomes a hurdle, especially when clustering of transactional data is used as a preprocessing step for other tasks such as frequent decision making, efficient indexing, compression, etc. Hence, parallelizing and distributing expensive data clustering tasks becomes attractive in terms of the speed-up of computation and the increased amount of memory available in a computing cluster. In order to achieve this, we propose a SQL-MapReduce hybrid approach for scalable projected clustering. © 2014 IEEE.
Cite this Research Publication : Sandhya Harikumar, Shyju, M., and Dr. Kaimal, M. R., “SQL-MapReduce hybrid approach towards distributed projected clustering”, in International Conference on Data Science and Engineering, ICDSE 2014, 2014, pp. 18-23