Course Title: Data Intensive Computing
Course Code: 
Year Taught: 
Postgraduate (PG)
School of Engineering

'Data Intensive Computing' is an elective course offered in the M. Tech. in Computer Science and Engineering program at the School of Engineering, Amrita Vishwa Vidyapeetham.

Data-intensive computing paradigms: types, need, and use (supercomputing, grid computing, cloud computing, many-core computing). Parallel programming systems: MapReduce (Hadoop), workflows (Swift), MPI (MPICH), OpenMP, multi-threading (PThreads). Job management systems: batch scheduling, light-weight task scheduling. Storage systems: file systems (EXT3), shared file systems (NFS), distributed file systems (HDFS, FusionFS), parallel file systems (GPFS, PVFS, Lustre), distributed NoSQL key/value stores (Cassandra, MongoDB, ZHT), relational databases (MySQL).

Data-intensive computing with GPUs and databases; the many-core computing era and its new challenges; case studies on open research questions in data-intensive computing.


Readings will be drawn from published research and online material.

At the end of the course, students will be able to:

CO | Course Outcome | Bloom’s Taxonomy Level
CO 1 | Explain the architecture and properties of the computer systems needed to process and store large volumes of data | L2
CO 2 | Describe the different computational models for processing large data sets at rest (batch processing) | L2
CO 3 | Identify the data parallelism that can be exploited in large-scale data processing problems | L2
CO 4 | Compare and contrast the advantages and disadvantages of the modern data-centric paradigm over the compute-centric one | L4
CO 5 | Design experimental studies to assess the performance of data-intensive systems | L6
CO 6 | Implement high-performance solutions to a real-world problem and provide sufficient rationale for the design decisions and case studies | L3