
Course Detail

Course Name: Big Data Analytics
Course Code: 15CSE334
Program: B. Tech. in Computer Science and Engineering
Year Taught: 2019

Syllabus

Unit 1

Introduction to Big Data: Types of Digital Data – Characteristics of Data – Evolution of Big Data – Definition of Big Data – Challenges with Big Data – 3Vs of Big Data – Non-definitional traits of Big Data – Business Intelligence vs. Big Data – Data warehouse and Hadoop environment – Coexistence. Big Data Analytics: Classification of analytics – Data Science – Terminologies in Big Data – CAP Theorem – BASE Concept. NoSQL: Types of Databases – Advantages – NewSQL – SQL vs. NoSQL vs. NewSQL. Introduction to Hadoop: Features – Advantages – Versions – Overview of Hadoop Ecosystems – Hadoop distributions – Hadoop vs. SQL – RDBMS vs. Hadoop – Hadoop Components – Architecture – HDFS – MapReduce: Mapper – Reducer – Combiner – Partitioner – Searching – Sorting – Compression. Hadoop 2 (YARN): Architecture – Interacting with Hadoop Ecosystems.

Unit 2

NoSQL Databases: MongoDB: Introduction – Features – Data types – MongoDB Query Language – CRUD operations – Arrays – Functions: Count – Sort – Limit – Skip – Aggregate – MapReduce – Cursors – Indexes – mongoimport – mongoexport. Cassandra: Introduction – Features – Data types – CQLSH – Keyspaces – CRUD operations – Collections – Counter – TTL – Alter commands – Import and Export – Querying System tables.

Unit 3

Hadoop Ecosystems: Hive: Architecture – Data types – File formats – HQL – SerDe – User-defined functions. Pig: Features – Anatomy – Pig on Hadoop – Pig Philosophy – Pig Latin overview – Data types – Running Pig – Execution modes of Pig – HDFS commands – Relational operators – Eval functions – Complex data types – Piggy Bank – User-defined functions – Parameter substitution – Diagnostic operators. JasperReports: Introduction – Connecting to MongoDB – Connecting to Cassandra. Introduction to Machine Learning: Linear Regression – Clustering – Collaborative filtering – Association rule mining – Decision tree.

Text Books

  • Seema Acharya, Subhashini Chellappan, “Big Data and Analytics”, Wiley Publication, 2015.

Resources

  • Judith Hurwitz, Alan Nugent, Dr. Fern Halper, Marcia Kaufman, “Big Data for Dummies”, John Wiley & Sons, Inc., 2013.
  • Tom White, “Hadoop: The Definitive Guide”, O’Reilly Publications, 2011.
  • Kyle Banker, “MongoDB in Action”, Manning Publications Company, 2012.
  • Russell Bradberry, Eric Blow, “Practical Cassandra: A Developer’s Approach”, Pearson Education, 2014.

