COURSE NAME: Machine Learning & Data Mining
PROGRAM: MSc Bioinformatics


What is Data Mining?; Motivating Challenges; The Origins of Data Mining; Data Mining Tasks. Types of Data; Data Quality; Data Preprocessing; Measures of Similarity and Dissimilarity. Machine Learning; Hypothesis; Version Space; MAP; Maximum Likelihood.

Classification: Preliminaries; General approach to solving a classification problem; Decision tree induction; Rule-based classifier; Nearest-neighbor classifier; SVM; Artificial Neural Networks.

Association Analysis: Problem definition; Frequent itemset generation; Rule generation; Compact representation of frequent itemsets; Alternative methods for generating frequent itemsets; Neural Networks.

Cluster Analysis: Overview; K-means; Agglomerative hierarchical clustering; DBSCAN; Overview of cluster evaluation.

Further Topics in Data Mining: Multidimensional analysis and descriptive mining of complex data objects; Spatial data mining; Multimedia data mining; Text mining; Mining the WWW; Outlier analysis; Data mining applications; Additional themes on data mining; Social impact of data mining; Trends in data mining.

Data Warehouse: Difference between operational databases and data warehouses; Multidimensional data model; Data warehouse architecture; Data warehouse implementation; OLAP Techniques: Concepts & Disadvantages.

Data Mining: Introduction to Data Mining; Knowledge Discovery from Databases (KDD) process; Data processing for data mining; Data cleaning, integration, transformation, and reduction; Data mining primitives; Data Mining Query Language.


  1. Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems) - Jiawei Han, Micheline Kamber.
  2. Insight into Data Mining: Theory and Practice - K. P. Soman, Shyam Diwakar, V. Ajay, PHI, 2006.