Providing relevant product recommendations plays an important role in retaining customers and improving their shopping experience. Recommender systems can be applied in industries such as e-commerce, music, online radio, television, hospitality, finance, and many more. Experience over the years suggests that a simple algorithm with a lot of data can often provide better results than a complex algorithm with an inadequate amount of data. To provide better product recommendations, retail businesses therefore have to analyze huge amounts of data, which makes a recommender system a data-intensive application. The Hadoop distributed cluster platform was developed by the Apache Software Foundation to address the issues involved in designing data-intensive applications. In this paper, improved MapReduce-based data preprocessing and content-based recommendation algorithms are proposed and implemented using the Hadoop framework. Graphical user interfaces are also developed for interacting with the recommender system. Experimental results on the Amazon product co-purchasing network metadata show that the Hadoop distributed cluster environment is an efficient and scalable platform for implementing a large-scale recommender system.
S. Saravanan, “Design of large-scale Content-based Recommender System Using Hadoop MapReduce Framework”, in 2015 Eighth International Conference on Contemporary Computing (IC3), 2015.