DM 2019: DU Data Mining PYQ
This question paper contains 7 printed pages.

Roll No. :
S. No. of Question Paper : 2780
Unique Paper Code : 32347611
Name of the Paper : Data Mining
Name of the Course : B.Sc. (H) Computer Science : DSE-4
Semester :
Duration : 3 Hours
Maximum Marks : 75

(Write your Roll No. on the top immediately on receipt of this question paper.)

Attempt All questions from Section A.
Attempt any four questions from Section B.

Section A

1. (a) Find the Euclidean distance between data points X(1, -1, 0, 1) and Y(0, 0, -1, 0). 2

(b) If recall and precision are 0.5 and 0.6 respectively, compute the value of the F1 measure. 2

(c) In a given dataset, it is found that an itemset {a, b} is infrequent. Will itemset {a, b, c} be infrequent or frequent? Explain why. 2

(d) What are the three strategies for handling missing values in a dataset? 3

(e) Differentiate between precision and bias on the basis of the quality of the measurement process. 3

(f) What is meant by variable transformation? What are its advantages? 3

(g) If the support of an association rule X → Y is 80% and its confidence is 75%, can we derive the support and confidence of the rule Y → X? If yes, list the values. If no, state the reason. 3

(h) List two advantages and two disadvantages of the leave-one-out approach used in cross-validation for evaluating the performance of a classifier. 4

(i) Differentiate between agglomerative and divisive methods of hierarchical clustering with the help of a diagram. 4

(j) What are asymmetric attributes? Give an example of each:
(i) asymmetric binary attribute,
(ii) asymmetric discrete attribute,
(iii) asymmetric continuous attribute. 4

(k) The confusion matrix for a 2-class problem is given. [confusion matrix not legible in the scan] Calculate the Accuracy, Sensitivity, Specificity, True Positive Rate, and False Positive Rate. 5

Section B

2. (a) What are the differences between noise and outliers? Are noise objects always outliers? Are outliers always noise objects? 1+1+1
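The computations asked for in Q1(a) and Q1(b) can be sanity-checked with a minimal Python sketch. The coordinates of X and Y are my reading of the partly illegible scan (an assumption); the precision and recall values are as stated in the paper.

```python
import math

def euclidean(x, y):
    # Square root of the summed squared coordinate differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def f1_measure(precision, recall):
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Q1(a): X and Y as reconstructed from the scan (assumed reading).
print(euclidean((1, -1, 0, 1), (0, 0, -1, 0)))   # 2.0 under this reading

# Q1(b): precision = 0.6, recall = 0.5 as stated in the paper.
print(round(f1_measure(0.6, 0.5), 4))            # 0.5455
```

Note that F1 does not depend on the uncertain coordinates, so 2 * 0.6 * 0.5 / 1.1 ≈ 0.5455 holds regardless of how Q1(a) is read.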
(b) Let A and B be two sets of integers. A distance measure d is defined as follows:
d(A, B) = size(A - B) + size(B - A)
where '-' denotes set difference and size denotes the number of elements in the set. Prove that the distance measure d is a metric. 5

(c) What is unsupervised learning? Explain with the help of an example application. 2

3. (a) Consider the following dataset for a 2-class problem: [dataset table not legible in the scan]
(i) Calculate the gain in the Gini Index when splitting on A and B.
(ii) Which attribute would the decision tree induction algorithm choose?
(iii) Draw the decision tree after splitting, showing the number of instances of each class.
(iv) How many instances are misclassified by the resulting decision tree? 7

(b) Why is the K-nearest neighbour classifier a lazy learner? 3

4. (a) What is an exhaustive rule set in rule-based classification? If the rule set is not exhaustive, what problem arises? How is it resolved? 4

(b) What is progressive sampling? What are its advantages? 3

(c) State Bayes' theorem. What assumption is used by the Naive Bayes classifier? 3

5. (a) Consider the following set of frequent 3-itemsets:
{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {1, 3, 4}, {1, 3, 5}, {2, 3, 4}, {2, 3, 5}, {3, 4, 5}.
Assume that there are only five items in the dataset.
(i) List all candidate 4-itemsets obtained by a candidate generation procedure using the F(k-1) x F(1) merging strategy.
(ii) List all candidate 4-itemsets obtained by the candidate generation procedure in Apriori. 6

(b) Let X denote a categorical attribute having the values {awful, poor, OK, good}. What is the representation of each value when X is converted to binary form using:
(i) 2 bits,
(ii) 4 bits? 4

6. (a) Consider the following transactional dataset: 8

Transaction ID    Items Bought
0001              {a, d, e}
0002              {a, b, c, e}
0003              {a, b, d, e}
0004              {a, c, d, e}
0005              {b, c, e}
0006              {b, d, e}
0007              {c, d}
0008              {a, b, e}
0009              {a, d, e}
0010              {a, b, e}

(i) Find the support of the itemsets {e}, {b, d}, {a, d} and {b, d, e}.
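The support counts asked for in Q6(a)(i) can be cross-checked with a short Python sketch. The transaction table is my reading of the scan; support here is the fraction of transactions containing every item of the itemset.

```python
# Transactional dataset of Q6(a), as read from the scan.
transactions = [
    {'a', 'd', 'e'}, {'a', 'b', 'c', 'e'}, {'a', 'b', 'd', 'e'},
    {'a', 'c', 'd', 'e'}, {'b', 'c', 'e'}, {'b', 'd', 'e'},
    {'c', 'd'}, {'a', 'b', 'e'}, {'a', 'd', 'e'}, {'a', 'b', 'e'},
]

def support(itemset):
    # Fraction of transactions that contain every item of the itemset.
    return sum(itemset <= t for t in transactions) / len(transactions)

# Check each queried itemset against the 30% minimum-support threshold.
for items in ({'e'}, {'b', 'd'}, {'a', 'd'}, {'b', 'd', 'e'}):
    s = support(items)
    print(sorted(items), s, 'frequent' if s >= 0.3 else 'infrequent')
```

Under this reading of the table, {e} (0.9) and {a, d} (0.4) are frequent, while {b, d} and {b, d, e} (both 0.2) fall below the 30% threshold.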
Are these itemsets frequent if the minimum support threshold is 30%?
(ii) Find all the rules generated from the 3-itemset {b, d, e}. List the strong rules among these if the minimum confidence threshold is 60%.

(b) What is the difference between nominal attributes and ordinal attributes? Give an example of each. 2

7. (a) Explain the following terms with reference to the DBSCAN clustering algorithm:
(i) Core point
(ii) Noise point
(iii) Border point 6

(b) Given the following data points: 2, 4, 10, 12, 3, 20, 30, 11, 25. Assume K = 3 and initial means 2, 4, 6. Show the clusters obtained using the K-means algorithm after two iterations and show the new means for the next iteration. 4
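The rule enumeration in Q6(a)(ii) can also be cross-checked numerically. A rule from {b, d, e} splits it into a non-empty antecedent and consequent, and its confidence is supp({b, d, e}) divided by the support of the antecedent; the 60% threshold is from the question, and the transaction table is my reading of the scan.

```python
from itertools import combinations

# Transactional dataset of Q6(a), as read from the scan.
transactions = [
    {'a', 'd', 'e'}, {'a', 'b', 'c', 'e'}, {'a', 'b', 'd', 'e'},
    {'a', 'c', 'd', 'e'}, {'b', 'c', 'e'}, {'b', 'd', 'e'},
    {'c', 'd'}, {'a', 'b', 'e'}, {'a', 'd', 'e'}, {'a', 'b', 'e'},
]

def support(itemset):
    # Fraction of transactions containing every item of the itemset.
    return sum(itemset <= t for t in transactions) / len(transactions)

# Enumerate all six rules from {b, d, e}: antecedents of size 1 and 2.
full = frozenset({'b', 'd', 'e'})
for r in (1, 2):
    for lhs in combinations(sorted(full), r):
        lhs = frozenset(lhs)
        conf = support(full) / support(lhs)
        tag = 'strong' if conf >= 0.6 else 'weak'
        print(sorted(lhs), '->', sorted(full - lhs), round(conf, 2), tag)
```

Under this reading, only {b, d} → {e} (confidence 1.0) clears the 60% confidence threshold; the other five rules have confidence 0.4 or lower.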
