DM QB

This document contains a question bank with modules covering various topics in data mining such as data mining components, architectures, processes, classification techniques, and issues. Specific questions involve data normalization methods, attribute selection, data reduction, concept hierarchies, and handling missing data. Other questions involve descriptive statistics, data partitioning, market basket analysis, association rule mining using the Apriori algorithm, and the frequent pattern growth algorithm.

Uploaded by

asdjfkk

Question bank

Module-1
1. Discuss the components of data mining.
2. Draw and explain the Data Mining Architecture.
3. Write a short note on the KDD process.
4. Why is it called data mining rather than knowledge mining?
5. Explain the classification of Data Mining.
6. What are the issues in Data Mining? Explain in detail.
7. Briefly explain the four schemes for integrating Data Mining with a
Data Warehouse.

Module-2
1. Consider the following data for age: 13, 15, 16, 16, 19, 20, 20, 21, 22,
22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
(a) Use min-max normalization to transform the value.
(b) Use z-score normalization to transform the value.
2. What is noise? Explain data smoothing as a noise removal
technique. Partition the given data into equal-frequency bins of
size 3, then smooth by bin means, by bin medians, and by bin
boundaries. Consider the data: 10, 2, 19, 18, 20, 18, 25, 28, 22.
3. Explain various data normalization techniques with suitable
example.
4. Describe data reduction methods and explain any one method.
5. What is a concept hierarchy? Explain the types of concept hierarchy
generation methods.
6. Explain Attribute Selection methods with a suitable example.
7. Define the terms: 1) mode 2) variance 3) standard deviation
4) quartile.
8. Suppose that the data for analysis includes the attribute age. The age
values for the data tuples are (in increasing order) 13, 15, 16, 16, 19, 20,
20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52,
70.
(a) What is the mean of the data? What is the median?
(b) What is the mode of the data? Comment on the data's modality (i.e.,
bimodal, trimodal, etc.).
(c) What is the midrange of the data?
(d) Can you find (roughly) the first quartile (Q1) and the third quartile
(Q3) of the data?
(e) Give the five-number summary of the data.
(f) Show a boxplot of the data.
(g) Use min-max normalization to transform the value 35 for age onto the
range [0.0, 1.0].
(h) Use z-score normalization to transform the value 35 for age, where
the standard deviation of age is 12.94 years.
(i) Use normalization by decimal scaling to transform the value 35 for
age.
9. In real-world data, tuples with missing values for some attributes are a
common occurrence. Describe various methods for handling this problem.
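
For the age data in question 8, the descriptive statistics and the three normalizations of the value 35 can be checked with a short Python sketch. This is only one possible way to verify a hand calculation, not a model answer. The helper names (min_max, z_score, decimal_scaling) are made up for illustration, and the standard deviation 12.94 is taken as given in part (h). Quartiles are left out because textbooks and libraries use different conventions for them.

```python
import statistics

# Age values from question 8 (already sorted).
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

def min_max(v, data, new_min=0.0, new_max=1.0):
    """Min-max normalization of v onto [new_min, new_max]."""
    lo, hi = min(data), max(data)
    return (v - lo) / (hi - lo) * (new_max - new_min) + new_min

def z_score(v, mean, std):
    """Z-score normalization: distance from the mean in standard deviations."""
    return (v - mean) / std

def decimal_scaling(v, data):
    """Divide by 10**j, where j is the smallest integer with max(|x|) / 10**j < 1."""
    peak = max(abs(x) for x in data)
    j = 0
    while peak / 10 ** j >= 1:
        j += 1
    return v / 10 ** j

mean = statistics.mean(ages)
median = statistics.median(ages)
modes = statistics.multimode(ages)      # requires Python 3.8+
midrange = (min(ages) + max(ages)) / 2

print("mean     :", round(mean, 2))
print("median   :", median)
print("modes    :", modes)
print("midrange :", midrange)
print("min-max(35)         :", round(min_max(35, ages), 3))
print("z-score(35)         :", round(z_score(35, mean, 12.94), 3))  # 12.94 given in (h)
print("decimal scaling(35) :", decimal_scaling(35, ages))
```

The figure 12.94 quoted in part (h) agrees with the sample standard deviation of this data (statistics.stdev), not the population one (statistics.pstdev).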

10. Suppose a group of 12 sales price records has been sorted as follows:
5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215
Partition them into three bins by each of the following methods:
(a) equal-frequency (equidepth) partitioning
(b) equal-width partitioning
(c) clustering
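
The partitioning in question 10 (a) and (b), and the bin smoothing asked for in question 2, can be sketched in Python as follows. The function names are hypothetical, and two assumptions are made: the equal-frequency split assumes the number of values is divisible by the number of bins, and the equal-width split places the maximum value in the last bin. Clustering (part c) is not covered. Smoothing by bin boundaries is also left out, because in the question 2 data every middle value is equidistant from both boundaries of its bin, so the result depends on a tie-breaking rule the question does not state.

```python
import statistics

def equal_frequency_bins(values, k):
    """Split sorted values into k bins of equal size (len(values) % k == 0 assumed)."""
    ordered = sorted(values)
    size = len(ordered) // k
    return [ordered[i * size:(i + 1) * size] for i in range(k)]

def equal_width_bins(values, k):
    """Split values into k intervals of equal width over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    bins = [[] for _ in range(k)]
    for v in sorted(values):
        idx = min(int((v - lo) / width), k - 1)   # the maximum goes in the last bin
        bins[idx].append(v)
    return bins

def smooth_by_means(bins):
    """Replace every value in a bin by the bin mean."""
    return [[sum(b) / len(b)] * len(b) for b in bins]

def smooth_by_medians(bins):
    """Replace every value in a bin by the bin median."""
    return [[statistics.median(b)] * len(b) for b in bins]

# Question 10: sales price records, three bins.
prices = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
price_freq = equal_frequency_bins(prices, 3)
price_width = equal_width_bins(prices, 3)
print("equal-frequency:", price_freq)
print("equal-width    :", price_width)

# Question 2: nine values, bins of size 3 means three bins.
noisy = [10, 2, 19, 18, 20, 18, 25, 28, 22]
noisy_bins = equal_frequency_bins(noisy, 3)
print("bins           :", noisy_bins)
print("by bin means   :", smooth_by_means(noisy_bins))
print("by bin medians :", smooth_by_medians(noisy_bins))
```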

Module-3
1. Briefly explain Market Basket Analysis.
2. Generate frequent itemsets and association rules based on them using the Apriori
algorithm. Minimum support is 2 and minimum confidence is 50%.
3. Write a note on Association Rule Mining.
4. Explain the two measures of rule interestingness: support and
confidence.
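
For reference, the two measures in question 4 are commonly written as below. The notation is an assumption, not part of the question bank: A and B are disjoint itemsets, N is the total number of transactions, and \sigma(X) is the number of transactions that contain every item of X.

```latex
\[
\operatorname{support}(A \Rightarrow B) = P(A \cup B) = \frac{\sigma(A \cup B)}{N},
\qquad
\operatorname{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\sigma(A \cup B)}{\sigma(A)}
\]
```

Here P(A \cup B) is the probability that a transaction contains all items of both A and B.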

5. State the Apriori property. Generate large itemsets and association
rules using the Apriori algorithm on the following data set, with minimum
support and minimum confidence set to 50% and 75%
respectively.

TID     Items Purchased
T101    Cheese, Milk, Cookies
T102    Butter, Milk, Bread
T103    Cheese, Butter, Milk, Bread
T104    Butter, Bread
6. Explain the Frequent Pattern Growth (FP-growth) algorithm using the above example.
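
A hand-worked answer to question 5 can be cross-checked with a compact Apriori sketch in Python. This is an illustrative, unoptimized version rather than a reference implementation, and the function names are hypothetical. The 50% minimum support is converted to an absolute count of 2 out of the 4 transactions, and 75% is used as the minimum confidence.

```python
import math
from itertools import combinations

# Transactions T101..T104 from question 5.
transactions = [
    frozenset({"Cheese", "Milk", "Cookies"}),
    frozenset({"Butter", "Milk", "Bread"}),
    frozenset({"Cheese", "Butter", "Milk", "Bread"}),
    frozenset({"Butter", "Bread"}),
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def apriori(transactions, min_count):
    """Return {frequent itemset: support count}, built level by level."""
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while current:
        counts = {c: support_count(c, transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

def association_rules(frequent, min_conf):
    """Return (antecedent, consequent, confidence) for rules meeting min_conf."""
    found = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = count / frequent[lhs]   # subsets of frequent sets are frequent
                if conf >= min_conf:
                    found.append((lhs, itemset - lhs, conf))
    return found

min_count = math.ceil(0.5 * len(transactions))   # 50% of 4 transactions = 2
frequent = apriori(transactions, min_count)
rules = association_rules(frequent, 0.75)

for itemset, count in frequent.items():
    print(sorted(itemset), count)
for lhs, rhs, conf in rules:
    print(sorted(lhs), "=>", sorted(rhs), f"confidence={conf:.2f}")
```

Cookies appears in only one transaction, so it is dropped in the first pass. No three-item candidate containing Cheese survives the prune step, because Cheese appears together with Butter, and with Bread, in only one transaction.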
