Unit 1
Introduction to Big Data: Types of Digital Data – Characteristics of Data – Evolution of Big Data – Definition of Big Data – Challenges with Big Data – 3Vs of Big Data – Non-Definitional Traits of Big Data – Business Intelligence vs. Big Data – Data Warehouse and Hadoop Environment – Coexistence. Big Data Analytics: Classification of Analytics – Data Science – Terminologies in Big Data – CAP Theorem – BASE Concept. NoSQL: Types of Databases – Advantages – NewSQL – SQL vs. NoSQL vs. NewSQL. Introduction to Hadoop: Features – Advantages – Versions – Overview of the Hadoop Ecosystem – Hadoop Distributions – Hadoop vs. SQL – RDBMS vs. Hadoop – Hadoop Components – Architecture – HDFS – MapReduce: Mapper – Reducer – Combiner – Partitioner – Searching – Sorting – Compression. Hadoop 2 (YARN): Architecture – Interacting with the Hadoop Ecosystem.