Publication Type:

Conference Paper

Source:

Proceedings of the 2016 International Conference on Data Science and Engineering, ICDSE 2016, Institute of Electrical and Electronics Engineers Inc. (2016)

ISBN:

9781509012800

URL:

https://www.scopus.com/inward/record.uri?eid=2-s2.0-85015152766&doi=10.1109%2fICDSE.2016.7823940&partnerID=40&md5=f7708f36a2d16f004bccf0d669df721c

Keywords:

Distributed and parallel computing, Distributed computer systems, Experimental analysis, K medoid clustering, Map-reduce, Mapreduce frameworks, Multiprocessing systems, Overall execution, Scalable clustering, Transmission costs

Abstract:

Distributed and parallel computing are the best alternatives for scalable clustering of large volumes of data with moderate to high dimensionality, and they also improve speedup. In this paper we address the problem of k-medoid clustering using the MapReduce framework for distributed computing on commodity machines, in order to evaluate its efficacy. There are two main issues to tackle: first, how to distribute the data for efficient clustering; second, how to minimize the I/O and network cost among the machines. The main contributions of this paper are: (a) a MapReduce methodology for distributed k-medoid clustering; (b) a reduction in the overall execution time and in the overhead of moving data from one site to another, leading to sublinear scaleup and speedup. The approach proves efficient because the local clusterings can be carried out independently of one another. Experimental analysis on millions of data points using just 10 cores in parallel shows that clustering data of size 1M × 17 requires only 4 minutes. This low transmission cost and low bandwidth requirement lead to improved speedup and scaleup on the distributed data. © 2016 IEEE.
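
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: each partition is clustered locally and independently (the map step), and only the local medoids are sent onward to be merged (the reduce step), which keeps data movement small. It assumes a simple alternating k-medoid routine rather than the paper's actual procedure, and every name here (`k_medoids`, `map_phase`, `reduce_phase`, `distributed_k_medoids`) is hypothetical. It runs in a single process and does not use a real MapReduce framework.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dist(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def k_medoids(points: Sequence[Point], k: int, iters: int = 20, seed: int = 0) -> List[Point]:
    """Alternating k-medoids (assumed variant, not necessarily the paper's)."""
    rng = random.Random(seed)
    medoids = rng.sample(list(points), k)  # requires len(points) >= k
    for _ in range(iters):
        # Assign every point to its nearest current medoid.
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, medoids[i]))
            clusters[nearest].append(p)
        # In each cluster, pick the member with the smallest total distance
        # to the other members; keep the old medoid if the cluster is empty.
        new_medoids = []
        for old, members in zip(medoids, clusters):
            if not members:
                new_medoids.append(old)
                continue
            best = min(members, key=lambda c: sum(dist(c, q) for q in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


def map_phase(partition: Sequence[Point], k: int) -> List[Point]:
    """Local clustering on one partition; independent of all other partitions."""
    return k_medoids(partition, k)


def reduce_phase(local_medoids: Sequence[Sequence[Point]], k: int) -> List[Point]:
    """Merge step: cluster the collected local medoids into k global medoids."""
    merged = [m for medoids in local_medoids for m in medoids]
    return k_medoids(merged, k)


def distributed_k_medoids(partitions: Sequence[Sequence[Point]], k: int) -> List[Point]:
    """Run the map step per partition, then a single reduce step."""
    return reduce_phase([map_phase(part, k) for part in partitions], k)
```

In this sketch only `k` points per partition cross the map/reduce boundary, which illustrates why the transmission cost can stay low; whether the paper merges local results this way is an assumption.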

Notes:

Cited by 0; Conference: 2016 3rd International Conference on Data Science and Engineering, ICDSE 2016; Conference Date: 23 August 2016 through 25 August 2016; Conference Code: 126030

Cite this Research Publication

S. Harikumar and Thaha, S. S., “MapReduce model for k-medoid clustering”, in Proceedings of the 2016 International Conference on Data Science and Engineering, ICDSE 2016, 2016.
