Publication Type : Journal Article
Publisher : IJCTA, (International Conference Soft Computing Systems, ICSCS-2016)
Source : IJCTA, (International Conference Soft Computing Systems, ICSCS-2016), International Science Press, Volume 8, Number 5, p.1911-1916 (2016)
Url : http://www.serialsjournals.com/serialjournalmanager/pdf/1460973361.pdf
Keywords : Cluster computing, Hadoop, MapReduce, RDD, Spark
Campus : Coimbatore
School : School of Engineering
Center : Computational Engineering and Networking
Department : Center for Computational Engineering and Networking (CEN), Computer Science
Verified : Yes
Year : 2016
Abstract : Cluster computing is an approach for storing and processing the huge amounts of data being generated. Hadoop and Spark are the two prominent cluster computing platforms today. Hadoop incorporates the MapReduce concept and is scalable as well as fault-tolerant. However, the limitations of Hadoop paved the way for another cluster computing framework, Spark, which is faster and can also manage multiple workloads owing to its in-memory processing. In this paper, we discuss the underlying concepts of Hadoop and the limitations that led to the development of Spark. We then give a detailed description of the Spark framework and its advantages. Finally, we demonstrate a word count problem in both Hadoop and Spark and present a comparative study.
Cite this Research Publication : A. N., Vijay Krishna Menon, and Dr. (Col.) Kumar P. N., “Cluster Computing Paradigms – A Comparative study of Evolving Frameworks”, IJCTA, (International Conference Soft Computing Systems, ICSCS-2016), vol. 8, pp. 1911-1916, 2016.