Mcqs Unit 3

The document contains a series of questions and answers related to Data Warehousing and Data Mining, focusing on concepts such as data cleaning, classification, regression, and data mining tasks. It covers various techniques, definitions, and processes involved in knowledge discovery from data. The questions are designed to test understanding of key principles and methodologies in the field.

Year & Semester: III & V

Subject Code & Subject Name: U14CS518 & Data Warehousing and Data Mining
Unit – III
1. Background knowledge refers to
a. Additional knowledge used by a learning algorithm to facilitate the learning process
b. A neural network that makes use of a hidden layer
c. It is a form of automatic learning.
d. None of these
2. The process of knowledge discovery from data is called ____________.
a. data mining
b. data warehouse
c. query
d. knowledge engineering
3. The process of removing deficiencies and loopholes in the data is called
a. Aggregation of data
b. Extraction of data
c. Cleaning up of data
d. Compression of data
4. Which of the following is the collection of data objects that are similar to one another
within the same group?
a. Partitioning
b. Grid
c. Cluster
d. Table
5. Multiple Regression means
a. Data are modeled using a straight line
b. Data are modeled using a curved line
c. Extension of linear regression involving only one predictor variable
d. Extension of linear regression involving more than one predictor variable
6. The term ____________ refers loosely to the process of semi-automatically analyzing large databases to find useful patterns.
a. data analysis
b. data warehouse
c. data mining
d. knowledge discovery
7. Data selection is
a. The actual discovery phase of a knowledge discovery process
b. The stage of selecting the right data for a KDD process
c. A subject-oriented integrated time variant non-volatile collection of data in
support of management
d. None of these
8. Which of the following is/are the Data mining tasks?
a. Regression
b. Classification
c. Clustering
d. All of the above
9. Concept description is the basic form of _________
a. Predictive data mining
b. Descriptive data mining
c. Data warehouse
d. Relational database
10. Which is the technique used for classification in data mining?
a. Descriptive pattern
b. Associations
c. Decision tree classifiers
d. Regression
11. Which of the following is not an ETL tool?
a. Informatica
b. Oracle Warehouse Builder
c. DataStage
d. Visual Studio
12. Classification accuracy is
a. A subdivision of a set of examples into a number of classes
b. A measure of the accuracy of the classification of a concept that is given by a certain theory
c. The task of assigning a classification to a set of examples
d. None of these
13. A set of items that frequently appear together in a transactional data set is called a
a. Frequent pattern
b. Frequent subsequence
c. Frequent itemset
d. Frequent substructure
14. Hidden knowledge refers to
a. A set of databases from different vendors, possibly using different database
paradigms
b. An approach to a problem that is not guaranteed to work but performs well in
most cases
c. Information that is hidden in a database and that cannot be recovered by a
simple SQL query.
d. None of these
15. ____________ deals with the prediction of a value rather than a class.
a. Regression
b. Precision
c. Recall
d. Multiway splits
16. The following technology is not well-suited for data mining:
a. Expert system technology
b. Data visualization
c. Technology limited to specific data types such as numeric data types
d. Parallel architecture
17. Inconsistent data may come from _________
a. Different data sources
b. Functional dependency violation
c. Both (a) & (b)
d. None of the above
18. Which technique is suitable for handling the noisy data?
a. Bayesian formula
b. Attribute mean
c. Regression method
d. Correlation analysis
19. Which of the following functions are used in each wavelet transformation?
a. Smoothing, difference
b. Smoothing, Decision-tree induction
c. Correlation, Chi square
d. Binning, difference
20. Which of the following is not a pattern interestingness measure?
a. Support
b. Simplicity
c. Utility
d. Clustering

21. Comparing the general features of software products whose sales increased by 10% in the last year with those whose sales decreased by at least 30% during the same period is an example of
a. Characterization
b. Discrimination
c. Classification
d. Prediction
22. age(X, "youth") AND income(X, "low") -> class(X, "B") is an example of which classification model representation?
a. Decision tree
b. If-then
c. Neural network
d. All of the above

23. For the association rule buys(X, "computer") => buys(X, "software") [support = 1%, confidence = 50%], which of the following is true?
a. There is a 1% chance that a customer who buys a computer also buys software, and 50% of all transactions contain both items.
b. There is a 50% chance that a customer who buys a computer also buys software, and 1% of all transactions contain both items.
c. Both (a) & (b)
d. None of the above
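Note on question 23: support is the fraction of all transactions that contain every item in the rule, and confidence is the fraction of transactions containing the antecedent that also contain the consequent. The sketch below illustrates both measures on a hypothetical toy transaction list; the data and the function names `support` and `confidence` are assumptions made for illustration and do not come from the question paper.

```python
# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"computer", "software"},
    {"computer"},
    {"printer"},
    {"software", "printer"},
]


def support(itemset, transactions):
    """Fraction of all transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Fraction of transactions with the antecedent that also hold the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


# Rule: buys(X, "computer") => buys(X, "software")
rule_support = support({"computer", "software"}, transactions)
rule_confidence = confidence({"computer"}, {"software"}, transactions)
print(rule_support, rule_confidence)
```

On this toy data one of four transactions contains both items, and one of the two computer purchases also includes software.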
24. Salary = "-10" represents which type of data?
a. Inconsistent
b. Noisy
c. Incomplete
d. All of the above
25. Let x1, x2, ..., xN be a set of N values. The median of the set is

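a. [formula not recoverable from source]
b. [formula not recoverable from source]
c. [formula not recoverable from source]
d. None of the above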
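Note on question 25: the median is the middle value of the sorted set when N is odd, and the average of the two middle values when N is even. A minimal sketch of that definition follows; the function name `median` and the sample values are illustrative assumptions.

```python
def median(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the single middle value.
        return ordered[mid]
    # Even N: the average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 1, 2]))
print(median([4, 1, 3, 2]))
```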

26. Which of the following formulas is used for z-score normalization?
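a. [formula not recoverable from source]
b. [formula not recoverable from source]
c. [formula not recoverable from source]
d. None of the above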
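Note on question 26: z-score normalization maps a value v of attribute A to v' = (v - mean_A) / std_A, where mean_A and std_A are the mean and standard deviation of A. The sketch below assumes the population standard deviation; the function name `z_score_normalize` and the sample values are illustrative assumptions, not part of the question paper.

```python
import statistics


def z_score_normalize(values):
    """Map each value v to (v - mean) / std, using the population std."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("all values are identical; z-score is undefined")
    return [(v - mean) / std for v in values]


# Illustrative values with mean 5 and population standard deviation 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
z_scores = z_score_normalize(sample)
print(z_scores)
```

After normalization the values have mean 0 and standard deviation 1, so they are not confined to a fixed interval such as [0, 1].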

27. The initial attribute set {A1, A2, A3, A4, A5, A6} is reduced step by step through {A1, A3, A4, A5, A6}, {A1, A4, A5, A6}, and {A1, A4, A6}. This procedure is called?
a. Step-wise forward selection
b. Step-wise backward elimination
c. Combining forward selection and backward elimination
d. Decision-tree induction
28. Which equation represents a multiple regression model?
a. Y = w X + b
b. Y = b0 + b1 X1 + b2 X2
c. p(a, b, c, d) = α_ab β_ac χ_ad δ_bcd
d. None of the above
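Note on question 28: a multiple regression model with two predictors has the form Y = b0 + b1 X1 + b2 X2. One common way to estimate the coefficients is ordinary least squares via the normal equations, which is sketched below in plain Python. The function names and the synthetic data (generated from b0 = 1, b1 = 2, b2 = 3) are assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple_regression(x1, x2, y):
    """Least-squares estimates [b0, b1, b2] for Y = b0 + b1*X1 + b2*X2."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free data generated from Y = 1 + 2*X1 + 3*X2.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 1.0, 4.0, 3.0]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]
b0, b1, b2 = fit_multiple_regression(x1, x2, y)
print(b0, b1, b2)
```

With a single predictor the same model reduces to the straight line Y = w X + b of option (a).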
29. In χ² analysis, a given sample S is partitioned into two intervals S1 and S2. By which kind of method can these intervals be merged?
a. Unsupervised, bottom-up
b. Unsupervised, top-down
c. Supervised, top-down
d. Supervised, bottom-up
30. The output of data characterization can be presented by
a) Pie charts
b) Bar charts
c) Curves
d) All of the above

31. Which of the following processes includes data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation and knowledge presentation?
(a) KDD process
(b) ETL process
(c) KTL process
(d) None of the above.

32. …………………. is an essential process where intelligent methods are applied to extract data patterns.
a) Data warehousing
b) Data mining
c) Text mining
d) Data selection
33. Data mining can also be applied to other forms of data such as …………….
i) Data streams
ii) Sequence data
iii) Networked data
iv) Text data
v) Spatial data
a) i, ii, iii and v only
b) ii, iii, iv and v only
c) i, iii, iv and v only
d) All i, ii, iii, iv and v
34. Which of the following is not a data mining functionality?
a) Characterization and Discrimination
b) Classification and regression
c) Selection and interpretation
d) Clustering and Analysis
35. ……………………….. is a summarization of the general characteristics or features
of a target class of data.
a) Data Characterization
b) Data Classification
c) Data discrimination
d) Data selection
36. ……………………….. is a comparison of the general features of the target class data
objects against the general features of objects from one or multiple contrasting classes.
a) Data Characterization
b) Data Classification
c) Data discrimination
d) Data selection
37. Strategic value of data mining is ………………….
a) cost-sensitive
b) work-sensitive
c) time-sensitive
d) technical-sensitive
38. ……………………….. is the process of finding a model that describes and
distinguishes data classes or concepts.
a) Data Characterization
b) Data Classification
c) Data discrimination
d) Data selection
39. The various aspects of data mining methodologies is/are ……………….
i) Mining various and new kinds of knowledge
ii) Mining knowledge in multidimensional space
iii) Pattern evaluation and pattern or constraint-guided mining.
iv) Handling uncertainty, noise, or incompleteness of data
a) i, ii and iv only
b) ii, iii and iv only
c) i, ii and iii only
d) All i, ii, iii and iv
40. The full form of KDD is ………………
a) Knowledge Database
b) Knowledge Discovery in Databases
c) Knowledge Data House
d) Knowledge Data Definition
41. The output of KDD is ………….
a) Data
b) Information
c) Query
d) Useful information
42. Data mining tasks can be classified into categories called ________.
a) Descriptive mining
b) Predictive mining
c) Both a & b
d) None of them
43. ___________ is a collection of neuron-like processing units with weighted
connections between the units.
a) classification (IF-THEN) rules,
b) decision trees,
c) neural networks
d) None of them

44. A pattern is interesting if it is
a) easily understood by humans
b) valid on new or test data with some degree of certainty
c) possibly useful
d) All of the above

45. An objective measure for association rules is
a) Support
b) Confidence
c) Both a & b
d) None of them

46. The possible integration schemes include
a) No coupling
b) Loose coupling
c) Semi tight coupling
d) All of the above
47. Major Tasks in Data Preprocessing are
a) Data cleaning
b) Data integration
c) Data transformation
d) All of the above
48. Data compression methods include
a) String compression
b) Audio/video compression
c) Both a & b
d) None of them
49. Data cleaning routines involve
a) Filling in missing values
b) Smoothing out noise
c) Identifying or removing outliers
d) All of the above
50. Data smoothing techniques are
a) Binning
b) Regression
c) Clustering
d) All of the above
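Note on question 50: in smoothing by bin means, the sorted values are split into equal-frequency bins and every value is replaced by the mean of its bin. The sketch below shows one way to do this; the function name `smooth_by_bin_means`, the bin size, and the sample prices are illustrative assumptions.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort values, split them into bins of bin_size, replace each by its bin mean."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([mean] * len(bin_values))
    return smoothed


# Illustrative price values, smoothed with equal-frequency bins of size 3.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed = smooth_by_bin_means(prices, 3)
print(smoothed)
```

Regression and clustering smooth data differently: regression fits the values to a function, while clustering groups similar values and treats values outside the groups as outliers.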
