Sir C.R.Reddy College of Engineering, Eluru Department of Information Technology Course Handout
S.No Description
1 College Vision & Mission
2 Department Vision & Mission
3 Program Educational Objectives (PEOs)
4 Program Outcomes (POs)
5 Program Specific Outcomes (PSOs)
6 JNTUK Academic Calendar
7 Department Academic Calendar
8 Course Description
9 Course Objectives
10 Course Outcomes
11 Lesson Plan
12 Evaluation Pattern
13 Timetable
14 Unit wise Questions
COLLEGE VISION
To emerge as a premier institution in the field of technical education and research in the
state and as a home for holistic development of the students and contribute to the advancement of
society and the region.
COLLEGE MISSION
To provide high quality technical education through a creative balance of academic and
industry oriented learning; to create an inspiring environment of scholarship and research; to
instill high levels of academic and professional discipline; and to establish standards that
inculcate ethical and moral values that contribute to growth in career and development of society
in general.
DEPARTMENT VISION
DEPARTMENT MISSION
To produce high-quality professionals with moral and ethical values suitable for industry and
society.
Course Description
Data Mining Techniques studies algorithms and computational paradigms that allow
computers to find patterns and regularities in databases, perform prediction and forecasting, and
generally improve their performance through interaction with data. It is currently regarded as the
key element of a more general process called Knowledge Discovery that deals with extracting
useful knowledge from raw data. The knowledge discovery process includes data selection,
cleaning, coding, using different statistical and machine learning techniques, and visualization of
the generated structures. The course will cover all these issues and will illustrate the whole
process by examples. Special emphasis will be given to Machine Learning methods, as they
provide the real knowledge discovery tools. Important related technologies, such as data warehousing
and on-line analytical processing (OLAP), will also be discussed.
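The knowledge discovery steps named above (selection, cleaning, coding, mining, presentation) can be sketched on a toy dataset. This is only an illustrative sketch, not part of the prescribed syllabus: the records in `RAW`, the function names, and the choice of pair counting as the mining step are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records, invented for illustration only.
RAW = [
    {"id": 1, "age": 23, "items": ["bread", "milk"]},
    {"id": 2, "age": None, "items": ["bread", "butter", "milk"]},
    {"id": 3, "age": 41, "items": ["butter"]},
    {"id": 4, "age": 35, "items": ["bread", "milk", "butter"]},
    {"id": 5, "age": 29, "items": []},
]


def select(records):
    """Data selection: keep only the attribute relevant to the task."""
    return [r["items"] for r in records]


def clean(baskets):
    """Cleaning: drop empty baskets and repeated items within a basket."""
    return [sorted(set(b)) for b in baskets if b]


def encode(baskets):
    """Coding: represent each basket as a set of items."""
    return [frozenset(b) for b in baskets]


def mine_pairs(baskets, min_support):
    """Mining: keep item pairs that occur in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


def report(patterns):
    """Presentation: one readable line per discovered pattern."""
    return [f"{a} + {b}: {n} baskets" for (a, b), n in sorted(patterns.items())]


def run_pipeline(records, min_support=2):
    """Chain the steps: selection -> cleaning -> coding -> mining."""
    return mine_pairs(encode(clean(select(records))), min_support)


if __name__ == "__main__":
    for line in report(run_pipeline(RAW)):
        print(line)
```

Raising `min_support` to 3 leaves only the bread and milk pair, which shows how a support threshold separates frequent patterns from incidental ones.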
Course Objectives
The main objectives of this course are given below:
1. Introduce basic concepts and techniques of data warehousing and data mining.
2. Examine the types of data to be mined and apply pre-processing methods on raw data.
3. Discover interesting patterns, analyze supervised and unsupervised models and estimate the
accuracy of the algorithms.
Course Outcomes
Students are able to
CO No’s COs Level
S.No Unit Description Teaching Aids CO
1 I Definition of data warehouse: subject-oriented, integrated, time-variant, nonvolatile collection of data BB CO1
2 I Data modeling: data cube, what are OLAP and OLTP BB CO1
3 I Multidimensional data modeling: differences BB CO1
4 I Schemas, snowflake schema BB CO1
5 I Fact constellation schema BB CO1
6 I Star schema BB CO1
7 I OLAP operations: slice, dice BB CO1
8 I Pivot, roll-up, drill-down based on concept hierarchy BB CO1
9 I Architectural framework of data warehouse: 3-tier architecture BB CO1
10 I Introduction to data mining: what is data mining BB CO1
11 I Kinds of data that can be mined and kinds of patterns that can be mined BB CO4
12 I Technologies used and kinds of applications targeted PPT CO1
13 I Data mining functionalities BB CO1
14 I Data mining issues and applications BB CO1
15 II Data preprocessing: why we need preprocessing BB CO1
16 II Quality data BB, PPT CO1
17 II Data cleaning: missing values, noisy data BB CO1
18 II Binning, clustering, combined computer and human inspection, regression BB CO1
19 II Data integration: redundancy of data, correlation analysis BB CO4
20 II Data transformation: smoothing BB CO1
21 II Aggregation BB CO1
22 II Generalization BB CO1
23 II Normalization: min-max BB CO1
24 II Z-score BB CO1
25 II Decimal scaling BB CO1
26 II Data reduction: strategies, data cube aggregation BB CO1
27 II Dimensionality reduction BB CO1
28 II Numerosity reduction BB CO1
29 II Histogram, clustering, sampling BB CO1
30 II Data discretization, concept hierarchy BB CO1
31 II Entropy-based discretization BB CO1
32 II Segmentation by natural partitioning BB CO1
33 III Basic concepts of classification BB CO1
34 III General approach to solving a classification problem BB CO3
35 III Different types of classification algorithms BB CO1
36 III Attribute selection measures, tree pruning BB CO1
37 III Scalability and decision tree induction BB CO1
38 III Visual mining for decision tree induction BB CO1
39 III Bayes classification with example BB CO3
40 III Basic concepts and definitions of classification and prediction BB CO1
41 IV Frequent itemset generation BB CO3
42 IV Rule generation BB CO1
43 IV Confidence-based pruning BB CO1
44 IV Apriori algorithm BB CO1
45 IV Apriori algorithm with example problem explanation BB CO4
46 IV Improving efficiency of Apriori BB CO1
47 IV Rule generation in Apriori algorithm BB CO1
48 IV Compact representation of frequent itemsets BB CO1
49 IV FP-growth algorithm BB CO1
50 V Overview, basics and importance of cluster analysis BB CO1
51 V Clustering techniques, different types of clusters PPT CO1
52 V Partition-based clustering BB CO1
53 V Hierarchy-based clustering BB CO1
54 V Density-based clustering BB CO1
55 V Grid-based clustering BB CO1
56 V K-means: the basic K-means algorithm BB CO1
57 V K-means: additional issues BB CO3
58 V Bisecting K-means algorithm BB CO1
Total Classes 58
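The three normalization methods listed in the lesson plan (min-max, z-score, decimal scaling) can be illustrated with a short sketch. It follows the usual textbook definitions; the function names and sample values are assumptions chosen for this example, and the z-score here uses the population standard deviation.

```python
from statistics import mean, pstdev


def min_max(values, new_min=0.0, new_max=1.0):
    """Min-max normalization: map [min, max] linearly onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max normalization needs at least two distinct values")
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Z-score normalization: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption of this sketch)
    if sigma == 0:
        raise ValueError("z-score normalization needs non-constant values")
    return [(v - mu) / sigma for v in values]


def decimal_scaling(values):
    """Decimal scaling: divide by 10**j, with j the smallest integer
    such that every scaled absolute value is below 1."""
    largest = max(abs(v) for v in values)
    j = 0
    while largest / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


if __name__ == "__main__":
    sample = [200, 300, 400, 600, 1000]  # hypothetical attribute values
    print(min_max(sample))
    print(z_score(sample))
    print(decimal_scaling(sample))
```

Min-max preserves the relative spacing of the values but is sensitive to outliers, whereas z-score is the usual choice when the true minimum and maximum are unknown.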
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2
CO1
CO2 3
CO3 3 2 2 1 2
CO4 3 2 2 1 2
Evaluation Pattern
Day/Time 09.00-09.50 09.50-10.40 11.00-11.50 11.50-12.40 01.40-02.30 02.30-03.20 03.20-04.10 04.10-05.00
Mon B-Sec. A-Sec. B-Sec. – Lab.
Tue B-Sec. A-Sec.
Wed A-Sec. – Lab. B-Sec.
Thu A-Sec. B-Sec.
Fri A-Sec.
Sat A-Sec. B-Sec.
UNIT-1
1. Write the differences between designing a data warehouse and OLTP systems.
2. Explain the OLAP operations in a multidimensional data model.
3. Discuss the motivation behind OLAP mining.
4. Explain the architectural framework approach of a data warehouse.
5. Describe the multitiered architecture of a data warehouse.
6. Describe the star schema for a multidimensional data model.
7. Differentiate between operational database systems and data warehousing.
8. Discuss Star, Snowflake and Fact Constellation Schemas with suitable examples.
9. Describe the star cubing algorithm for computing iceberg cubes.
10. What is a multidimensional data model?
11. Discuss the database architectures for parallel processing.
12. With a neat sketch, explain the 3-tier architecture of a data warehouse.
13. Discuss the various basic statistical measures used in data analysis.
14. Describe why concept hierarchies are useful in data mining.
15. What is multidimensional analysis? Explain with an example.
16. Explain generalization-based mining of plan databases by divide and conquer.
UNIT-2
1. Explain the need for preprocessing techniques and briefly explain the different forms of data
preprocessing.
2. Explain the basic methods for data cleaning.
3. Explain various methods of data cleaning in detail.
4. Explain data integration and data transformation.
5. Explain data transformation by normalization.
6. Explain dimensionality reduction.
7. Briefly define the four major types of concept hierarchies.
8. Explain the numerosity reduction techniques.
9. Explain concept hierarchy generation for categorical data.
10. Explain discretization and concept hierarchy generation.
UNIT-3
1. Illustrate the decision tree induction algorithm using appropriate attribute selection measures.
2. Discuss i. Naïve Bayesian classification
ii. Multilayer feed-forward neural networks.
3. Compare classification and prediction methods.
4. What is decision tree induction? Discuss the different attribute selection measures
considered during decision tree construction.
5. Explain the algorithm for constructing a decision tree from training samples.
6. Compare the advantages and disadvantages of eager classification versus lazy
classification.
7. Why is naïve Bayesian classification called naïve? Briefly discuss the major ideas of naïve
Bayesian classification.
8. Describe the backpropagation algorithm for neural-network-based classification of data.
9. Write short notes on Bayesian classification.
10. Explain Bayes' theorem.
11. What are Bayesian classifiers? Explain (a) naïve Bayesian classification
(b) linear and multiple regression.
12. Discuss bagging and boosting techniques for improving classifier and predictor
accuracy.
13. Explain classification by case-based reasoning.
14. Define regression.
15. Write about classifier accuracy measures in detail and also explain predictor error
measures.
UNIT-4
1. What is cluster analysis? What are some typical applications of clustering? What are
some typical requirements of clustering in data mining?
2. Explain the following clustering algorithms:
i. K-means ii. K-medoids iii. DBSCAN
3. Explain cluster analysis. Explain the different types of data in cluster analysis.
4. Explain the major categories of clustering methods.
5. Explain the clustering methods
a) K-means
b) DBSCAN
6. a) Explain what cluster analysis is.
b) Discuss model-based clustering methods.
7. Explain multiphase hierarchical clustering using dynamic modeling.