Guidelines-Datamining-I-UGCF-DSE-CS Hons-Sem 4-Jan2024

The document outlines the syllabus for the Data Mining-I elective course for BSc(H) Computer Science, detailing five units covering topics such as data mining introduction, data pre-processing, cluster analysis, association rule mining, and classification techniques. It includes recommended textbooks and additional references, practical exercises for hands-on experience, and a project requirement for students to apply learned skills on a dataset. The course aims to provide students with a comprehensive understanding of data mining concepts and techniques.


BSc(H) Computer Science

Discipline Specific Elective: Data Mining-I (Guidelines)


Sem IV (Jan 2024)

| Sr. No. | Units | Chapter | No. of Hours |
|---|---|---|---|
| 1 | Unit 1: Introduction to Data Mining: Motivation and challenges for data mining, types of data mining tasks, applications of data mining, data measurements, data quality, supervised vs. unsupervised techniques | 1.1-1.4, 2.1-2.2 | 8 |
| 2 | Unit 2: Data Pre-Processing: Data aggregation, sampling, dimensionality reduction, feature subset selection, feature creation, variable transformation | 2.3.1, 2.3.2, 2.3.3 (introduction), 2.3.4 (introduction), 2.3.5 (introduction), 2.3.6 (Binarization and Discretization of Continuous attributes), 2.3.7, 2.4.2, 2.4.3 (excluding properties) | 9 |
| 3 | Unit 3: Cluster Analysis: Basic concepts of clustering, measure of similarity, types of clusters and clustering methods, K-means algorithm, measures for cluster validation, determining the optimal number of clusters | 7.1.1, 7.1.2, 7.1.3 (well-separated and density-based), 7.2 (up to Data in Euclidean Space), 7.5.1, 7.5.5 | 11 |
| 4 | Unit 4: Association Rule Mining: Transaction dataset, frequent itemset, support measure, rule generation, confidence of association rule, Apriori algorithm, Apriori principle | 5 (up to 5.2.2) | 8 |
| 5 | Unit 5: Classification: Naive Bayes classifier, Nearest Neighbour classifier, decision tree, overfitting, confusion matrix, evaluation metrics and model evaluation | 3 (up to 3.3.3), 3.4 (introduction), 3.6, 4.3, 4.4 | 9 |
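
The support measure, the confidence of a rule, and the Apriori principle listed under Unit 4 can be illustrated with a small sketch. This is not taken from the prescribed textbook: the transactions are invented toy data, and the function names (`support`, `confidence`, `apriori`) are assumptions made for this example. It is written in plain Python for readability rather than efficiency.

```python
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent union consequent) / support(antecedent)."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {}
        for candidate in current:
            s = support(candidate, transactions)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori principle): a candidate survives only if
        # every one of its (k-1)-item subsets is itself frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent
```

Calling `apriori(TRANSACTIONS, 0.5)` returns the frequent itemsets with their support values, and `confidence({"bread"}, {"milk"}, TRANSACTIONS)` gives the confidence of the rule bread -> milk. Because support and confidence are computed as floating-point ratios, comparisons against a threshold such as 75% are best made with a small tolerance.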

Text Book:
1. Tan P.N., Steinbach M., Karpatne A. and Kumar V. Introduction to Data Mining, Second
edition, Pearson, 2021.

Additional References:
1. Han J., Kamber M. and Pei J. Data Mining: Concepts and Techniques, 3rd edition,
Morgan Kaufmann Publishers, 2011.
2. Zaki M. J. and Meira W. Jr. Data Mining and Machine Learning: Fundamental Concepts
and Algorithms, 2nd edition, Cambridge University Press, 2020.
3. Aggarwal C. C. Data Mining: The Textbook, Springer, 2015.
4. Soman K. P., Diwakar S. and Ajay V. Insight into Data Mining: Theory and Practice,
PHI, 2006.

Datasets may be downloaded from:
1. https://archive.ics.uci.edu/datasets
2. https://www.kaggle.com/datasets?fileType=csv
3. https://data.gov.in/
4. https://ieee-dataport.org/datasets
Suggested Practical Exercises
1. Apply data cleaning techniques on any dataset (e.g., the wine dataset). Techniques may include handling
missing values, outliers, and inconsistent values. A set of validation rules can be prepared based on the
dataset, and the data can then be validated against these rules.
2. Apply data pre-processing techniques such as standardization/normalization, transformation,
aggregation, discretization/binarization, sampling, etc. on any dataset.
3. Run the Apriori algorithm to find frequent itemsets and association rules on two real datasets, and use
appropriate evaluation measures to assess the correctness of the obtained patterns:
a) Use a minimum support of 50% and a minimum confidence of 75%.
b) Use a minimum support of 60% and a minimum confidence of 60%.
4. Use the Naive Bayes, K-nearest neighbour, and decision tree classification algorithms to build classifiers on
any two datasets. Divide each dataset into a training set and a test set. Compare the accuracy of the
different classifiers under the following situations:
I. a) Training set = 75%, Test set = 25% b) Training set = 66.6% (2/3rd of total), Test set = 33.3%
II. Training set is chosen by i) hold-out method ii) random subsampling iii) cross-validation.
Compare the accuracy of the classifiers obtained. The data needs to be scaled to a standard format.

5. Use the simple K-means algorithm for clustering on any dataset. Compare the performance of the
clusters obtained by changing the parameters involved in the algorithm. Plot the MSE computed after each
iteration as a line plot for any one set of parameters.
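
For the K-means exercise, the following is a minimal sketch of how the MSE could be recorded after each iteration. It assumes points given as tuples of numbers, squared Euclidean distance, and initial centroids drawn at random from the data. The function names, the `seed` parameter, and the sample points are hypothetical and not prescribed by these guidelines.

```python
import random

# Hypothetical 2-D sample points, invented for illustration only.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=20, seed=0):
    """Simple K-means; returns (centroids, labels, mse_per_iteration)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k random data points
    mse_history = []
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # MSE for this iteration: mean squared distance to the assigned centroid.
        mse = sum(
            squared_distance(p, centroids[c]) for p, c in zip(points, labels)
        ) / len(points)
        mse_history.append(mse)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, c in zip(points, labels) if c == j]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels, mse_history
```

The returned `mse_history` list holds one value per iteration and can be passed to any plotting library to draw the line plot. Re-running with a different `k`, `seed`, or `max_iter` gives the runs to compare.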
Project: Students should be encouraged to take up one project using a dataset downloaded from
any of the websites given above, with the dataset verified by the teacher. Preprocessing steps and
at least one data mining technique should be shown on the selected dataset. This will give
students practical knowledge of how to apply the various skills learnt in the subject to
a single problem/project.
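
As a rough illustration of the kind of preprocessing steps a project might show (and that practical exercises 1 and 2 ask for), the sketch below applies mean imputation, z-score standardization, min-max normalization, and equal-width discretization to a single numeric attribute. The function names and the sample values are assumptions made for this example. Real datasets will usually need per-column handling and a library such as pandas.

```python
from statistics import mean, pstdev


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mu = mean(observed)
    return [mu if v is None else v for v in values]


def standardize(values):
    """Z-score standardization: subtract the mean, divide by the std. deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant attribute: nothing to scale
    return [(v - mu) / sigma for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Discretize into n_bins equal-width intervals, labelled 0 .. n_bins - 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # min(...) places the maximum value in the last bin rather than bin n_bins.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical attribute with one missing value, invented for illustration.
ages = [20, None, 30, 40, 50]
filled = fill_missing_with_mean(ages)
z_scores = standardize(filled)
scaled = min_max_normalize(filled)
bins = equal_width_bins(filled, 3)
```

Mean imputation is only one of several ways to handle missing values. Which technique is appropriate depends on the dataset and should be justified in the project.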

Prepared by:
1. Dr Anamika Gupta (Shaheed Sukhdev College of Business Studies)
2. Dr Manju Bhardwaj (Maitreyi College)
3. Dr Sarabjeet Kaur (Indraprastha College For Women)
4. Prof. Sharanjit Kaur (Acharya Narendra Dev College)
