COURSE SUMMARY
Course Title: 
Parallel and Distributed Data Management
Course Code: 
18CS627
Year Taught: 
2018
Degree: 
Postgraduate (PG)
School: 
Department of Education

'Parallel and Distributed Data Management' is a Soft Core course offered in the M. Tech. in Computer Science and Engineering program at the School of Engineering, Amrita Vishwa Vidyapeetham.

Introduction: Parallel and Distributed architectures, models, complexity measures, Communication aspects, A Taxonomy of Distributed Systems - Models of computation: shared memory and message passing systems, synchronous and asynchronous systems, Global state and snapshot algorithms.

Distributed and Parallel Databases: Centralized versus Distributed Systems, Parallel versus Distributed Systems, Distributed Database architectures (Shared disk, Shared nothing), Distributed Database Design – Fragmentation and Allocation, Optimization.
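
As an illustration of the fragmentation and shared-nothing topics in this unit, the following is a minimal sketch of horizontal fragmentation, where each row of a relation is assigned to one node by its key. The function name `hash_fragment`, the sample rows, and the use of integer keys with a modulo rule are assumptions made for this example and are not taken from the course material.

```python
from collections import defaultdict


def hash_fragment(rows, key, num_nodes):
    """Split rows into horizontal fragments, one per node.

    Assumes the key column holds non-negative integers, so that
    `value % num_nodes` gives a deterministic node number.
    """
    fragments = defaultdict(list)
    for row in rows:
        node = row[key] % num_nodes  # allocation rule (illustrative only)
        fragments[node].append(row)
    return dict(fragments)


# Hypothetical sample relation with an integer primary key.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 3, "name": "c"},
    {"id": 4, "name": "d"},
]

fragments = hash_fragment(rows, "id", num_nodes=3)
for node, fragment in sorted(fragments.items()):
    print(node, [r["id"] for r in fragment])
```

Every row lands in exactly one fragment, so the fragments are disjoint and their union reconstructs the original relation, which is the completeness and disjointness property usually required of a horizontal fragmentation.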

Query Processing and Optimization – Parallel/Distributed Sorting, Parallel/Distributed Join, Parallel/Distributed Aggregates, Network Partitions, Replication, Publish/Subscribe systems – Case study on Apache Kafka distributed publish/subscribe messaging.

Hadoop and MapReduce – Data storage and analysis, Design and concepts of HDFS, YARN, MapReduce workflows and features, Setting up a Hadoop cluster.
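
To make the MapReduce workflow in this unit concrete, the sketch below simulates the map, shuffle, and reduce phases of a word count in plain Python on a single machine. It does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


# Hypothetical input split into two "documents".
counts = word_count(["hadoop stores data", "hadoop processes data"])
print(counts)
```

On a real cluster the map calls would run in parallel on separate input splits and the shuffle would move data across the network; here the phases run one after another only to show the data flow.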

TEXTBOOKS/REFERENCES

  1. M. Tamer Ozsu and Patrick Valduriez, “Principles of Distributed Database Systems”, 3rd edition, Springer, 2011.
  2. Silberschatz, Korth and Sudarshan, “Database System Concepts”, 5th edition.
  3. Dimitri P. Bertsekas and John N. Tsitsiklis, “Parallel and Distributed Computation: Numerical Methods”.
  4. Andrew S. Tanenbaum and Maarten van Steen, “Distributed Systems: Principles and Paradigms”, 2nd edition, Prentice Hall, October 2006.
  5. Ajay D. Kshemkalyani and Mukesh Singhal, “Distributed Computing: Principles, Algorithms, and Systems”, Cambridge University Press, 2011.
  6. Vijay K. Garg, “Elements of Distributed Computing”, Wiley-IEEE Press, May 2002.
  7. David J. DeWitt and Jim Gray, “Parallel Database Systems: The Future of High Performance Database Systems”, Communications of the ACM, 1992.
  8. Tom White, “Hadoop: The Definitive Guide”, 4th edition, O’Reilly.

Evaluation Pattern:

  • Periodical 1 – 15
  • Periodical 2 – 15
  • Lab – 20
  • Project – 10
  • End Semester – 40

Upon completion of the course, the student will be able to:

CO | Course Outcome | Bloom’s Taxonomy Level
CO 1 | Describe clearly the various distributed and parallel architectures, distributed and parallel databases, and the concepts of MapReduce in the Hadoop architecture. | Knowledge
CO 2 | Implement distributed and parallel algorithms for query processing in databases. | Application
CO 3 | Set up a distributed system, execute algorithms in a distributed environment, and compare the results with the centralized version. | Analysis
CO 4 | Set up a Hadoop distributed system, develop a MapReduce version of a serial algorithm, and evaluate its performance. | Synthesis