The Full Form of KDD Is

The data warehouse backend process involves data extraction, data cleansing, data transformation and data loading. The main OLAP operations are roll-up, drill-down, slice, dice and pivot. Roll-up aggregates data along a dimension hierarchy or by removing dimensions. Drill-down navigates to more granular data by descending a hierarchy or adding dimensions. Slice selects a subset of data on one dimension. Dice selects subsets across multiple dimensions. Pivot rotates or transforms the view of data in a cube.

Uploaded by

Arshad Ali


1. a) The full form of KDD is

i) Knowledge database
ii) Knowledge discovery in databases
iii) Knowledge data division
iv) Knowledge data definition
b) You are given data about seismic activity in Japan, and you want to predict the magnitude of the
next earthquake. This is an example of
i) Supervised learning
ii) Unsupervised learning
iii) Serration
iv) Dimensionality reduction
c) Which of the following is not involved in data mining?
i) Knowledge extraction
ii) Data archaeology
iii) Data exploration
iv) Data transformation
d) ………………….. is a comparison of the general features of the target class data objects against the
general features of objects from one or multiple contrasting classes.
i) Data characterization
ii) Data classification
iii) Data discrimination
iv) Data selection
e) A Bayesian classifier is
i) A class of learning algorithm that tries to find an optimum classification of a set of
examples using the probabilistic theory.
ii) Any mechanism employed by a learning system to constrain the search space of a
hypothesis.
iii) An approach to the design of learning algorithms that is inspired by the fact that when
people encounter new situations, they often explain them by reference to familiar
experiences, adapting the explanation to fit the new situation.
iv) None of the above.
f) The output of KDD is
i) Data
ii) Information
iii) Query
iv) Useful information
g) A cluster is
i) Group of similar objects that differ significantly from other objects
ii) Operations on a database to transform or simplify data in order to prepare it for a
machine learning algorithm
iii) Symbolic representation of facts or ideas from which information can potentially be
extracted
iv) None of the above
h) Background knowledge refers to
i) Additional acquaintance used by a learning algorithm to facilitate the learning process
ii) A neural network that makes use of a hidden layer
iii) It is a form of automatic learning
iv) None of the above
i) Case-based learning is
i) A class of learning algorithm that tries to find an optimum classification of a set of
examples using the probabilistic theory
ii) Any mechanism employed by a learning system to constrain the search space of a
hypothesis
iii) An approach to the design of learning algorithms that is inspired by the fact that when
people encounter new situations, they often explain them by reference to familiar
experiences, adapting the explanation to fit the new situation.
iv) None of the above.
j) Some telecommunication companies want to segment their customers into distinct groups in
order to send appropriate subscription offers. This is an example of
i) Supervised learning
ii) Data extraction
iii) Serration
iv) Unsupervised learning

2. a) Compare and contrast a data warehouse system and an operational database system.

| Operational Database | Data Warehouse |
|---|---|
| Operational systems are designed to support high-volume transaction processing. | Data warehousing systems are typically designed to support high-volume analytical processing (i.e., OLAP). |
| Operational systems are usually concerned with current data. | Data warehousing systems are usually concerned with historical data. |
| Data within operational systems are updated regularly according to need. | Non-volatile: new data may be added regularly, but once added, data are rarely changed. |
| It is designed for real-time business dealings and processes. | It is designed for analysis of business measures by subject area, categories, and attributes. |
| It is optimized for a simple set of transactions, generally adding or retrieving a single row at a time per table. | It is optimized for bulk loads and large, complex, unpredictable queries that access many rows per table. |
| It is optimized for validation of incoming information during transactions and uses validation data tables. | It is loaded with consistent, valid information and requires no real-time validation. |
| It supports thousands of concurrent clients. | It supports a few concurrent clients relative to OLTP. |
| Operational systems are widely process-oriented. | Data warehousing systems are widely subject-oriented. |
| Operational systems are usually optimized to perform fast inserts and updates of relatively small volumes of data. | Data warehousing systems are usually optimized to perform fast retrievals of relatively high volumes of data. |
| Data in. | Data out. |
| A small number of records is accessed. | A large number of records is accessed. |
| Relational databases are created for online transaction processing (OLTP). | Data warehouses are designed for online analytical processing (OLAP). |

b) Describe the steps involved in data mining when viewed as a process of knowledge discovery.

Answer:
1. Data Cleaning
2. Data Integration
3. Data Selection
4. Data transformation
5. Data Mining
6. Pattern Evaluation
7. Knowledge presentation
Explanation:
The steps involved in data mining, when viewed as a process of knowledge discovery,
include the following:
Step 1. Data cleaning: inconsistent data are eliminated.
Step 2. Data integration: data from multiple sources are combined.
Step 3. Data selection: data relevant to the analysis task are retrieved from the
database.
Step 4. Data transformation: the data are transformed into forms appropriate for
mining, for example by performing aggregation operations.
Step 5. Data mining: data patterns are extracted using specific techniques.
Step 6. Pattern evaluation: the patterns that represent knowledge are identified
based on interestingness measures.
Step 7. Knowledge presentation: visualization and knowledge representation
methods are used to present the mined knowledge to users.
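
The seven steps above can be sketched as a small pipeline. This is only an illustrative outline: the records, field names, function names, and thresholds are all hypothetical, and the "mining" step is a trivial stand-in rather than a real mining technique.

```python
from collections import defaultdict

# Hypothetical purchase records from two sources; None marks a missing value.
SOURCE_A = [
    {"customer": "c1", "amount": 20.0, "channel": "web"},
    {"customer": "c2", "amount": None, "channel": "web"},
]
SOURCE_B = [
    {"customer": "c1", "amount": 35.0, "channel": "shop"},
    {"customer": "c3", "amount": 5.0, "channel": "shop"},
]

def clean(rows):
    """Step 1: drop records whose amount is missing."""
    return [r for r in rows if r["amount"] is not None]

def integrate(*sources):
    """Step 2: combine records from multiple sources."""
    return [r for source in sources for r in source]

def select(rows):
    """Step 3: keep only the attributes relevant to the task."""
    return [(r["customer"], r["amount"]) for r in rows]

def transform(pairs):
    """Step 4: aggregate amounts into one total per customer."""
    totals = defaultdict(float)
    for customer, amount in pairs:
        totals[customer] += amount
    return dict(totals)

def mine(totals, threshold):
    """Step 5: stand-in 'pattern': customers whose total reaches threshold."""
    return {c: t for c, t in totals.items() if t >= threshold}

def evaluate(patterns, min_total):
    """Step 6: keep only patterns that pass an interestingness measure."""
    return {c: t for c, t in patterns.items() if t >= min_total}

def present(patterns):
    """Step 7: render the surviving patterns for the user."""
    return [f"{c}: total {t:.1f}" for c, t in sorted(patterns.items())]

rows = integrate(clean(SOURCE_A), clean(SOURCE_B))
totals = transform(select(rows))
report = present(evaluate(mine(totals, threshold=10.0), min_total=50.0))
print(report)
```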
3. a) What is the data warehouse back-end process? Explain briefly.

b) Explain the different OLAP operations.

Roll-up (Drill-up): The roll-up operation performs aggregation on a data cube, either by climbing
up a concept hierarchy for a dimension or by dimension reduction.

• Performing roll-up by climbing up a concept hierarchy:

o Consider a hierarchy defined as the total order “street < city < province or state < country.”

o Rather than grouping the data by city, the resulting cube groups the data by country.

• Performing roll-up by dimension reduction:

o One or more dimensions are removed from the given cube.

o Consider a sales data cube containing only the two dimensions location and time.

o Roll-up may be performed by removing the time dimension, resulting in an aggregation of the
total sales by location, rather than by both location and by time.

Drill-down: Drill-down is the reverse of roll-up. It navigates from less detailed data to more
detailed data. Drill-down can be realized by either stepping down a concept hierarchy for a
dimension or introducing additional dimensions.

• Performing a drill-down operation by stepping down a concept hierarchy:

o Consider time defined as “day < month < quarter < year.”

o Drill-down occurs by descending the time hierarchy from the level of quarter to the more
detailed level of month.
• Performing a drill-down operation by adding new dimensions to a cube

o Consider the central cube of the figure

o A drill-down on the central cube can occur by introducing an additional dimension, such as customer group.
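
Roll-up and drill-down can be illustrated with a tiny fact table in plain Python. This is a minimal sketch: the rows, the sales figures, and the helper name `aggregate` are made up for illustration and do not come from the figure referred to above.

```python
from collections import defaultdict

# Hypothetical fact rows: (city, country, quarter, sales). Values are made up.
FACTS = [
    ("Toronto", "Canada", "Q1", 100),
    ("Toronto", "Canada", "Q2", 150),
    ("Vancouver", "Canada", "Q1", 80),
    ("Chicago", "USA", "Q1", 120),
    ("Chicago", "USA", "Q2", 90),
]

def aggregate(rows, key):
    """Sum the sales measure over groups defined by key(row)."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)

# Detailed view: sales grouped by (city, quarter).
by_city_quarter = aggregate(FACTS, lambda r: (r[0], r[2]))

# Roll-up by climbing the hierarchy city < country.
by_country_quarter = aggregate(FACTS, lambda r: (r[1], r[2]))

# Roll-up by dimension reduction: the time dimension is removed.
by_city = aggregate(FACTS, lambda r: r[0])

print(by_country_quarter)
print(by_city)
```

Drill-down is the reverse navigation: moving from `by_country_quarter` back to `by_city_quarter` descends the location hierarchy, and moving from `by_city` to `by_city_quarter` reintroduces the time dimension.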

Slice: The slice operation performs a selection on one dimension of the given cube, resulting in
a subcube.

o The figure shows a slice operation where the sales data are selected from the central cube for
the dimension “time” using the criterion time = “Q1”.

Dice: The dice operation defines a subcube by performing a selection on two or more
dimensions.

o The figure shows a dice operation on the central cube based on the following selection criteria
that involve three dimensions: (location = “Toronto” or “Vancouver”) and (time = “Q1” or “Q2”)
and (item = “home entertainment” or “computer”).
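
As a rough sketch, slice and dice are both selections over the cube's cells, differing only in how many dimensions are constrained. The rows and sales values below are hypothetical; only the selection criteria mirror the ones quoted above.

```python
# Hypothetical cube cells: (location, time, item, sales). Values are made up.
CUBE = [
    ("Toronto", "Q1", "computer", 100),
    ("Toronto", "Q3", "phone", 40),
    ("Vancouver", "Q2", "home entertainment", 70),
    ("Vancouver", "Q1", "security", 30),
    ("Chicago", "Q1", "computer", 120),
]

def slice_cube(rows, time):
    """Slice: select on a single dimension (here, time)."""
    return [r for r in rows if r[1] == time]

def dice_cube(rows, locations, times, items):
    """Dice: select on two or more dimensions at once."""
    return [
        r for r in rows
        if r[0] in locations and r[1] in times and r[2] in items
    ]

q1_slice = slice_cube(CUBE, "Q1")
diced = dice_cube(
    CUBE,
    locations={"Toronto", "Vancouver"},
    times={"Q1", "Q2"},
    items={"home entertainment", "computer"},
)
print(q1_slice)
print(diced)
```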

Pivot (rotate): Pivot is a visualization operation that rotates the data axes in view in order to
provide an alternative presentation of the data.

o The figure shows a pivot operation where the item and location axes in a 2-D slice are rotated.

o Other examples include rotating the axes in a 3-D cube, or transforming a 3-D cube into a
series of 2-D planes.
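
A pivot of a 2-D slice amounts to swapping its row and column axes. The sketch below assumes a nested-dictionary representation and made-up sales values; the function name `pivot` is illustrative.

```python
# Hypothetical 2-D slice: rows are items, columns are locations.
by_item = {
    "computer": {"Toronto": 100, "Vancouver": 60},
    "home entertainment": {"Toronto": 45, "Vancouver": 70},
}

def pivot(table):
    """Rotate a 2-D table so that its column keys become its row keys."""
    rotated = {}
    for row_key, columns in table.items():
        for col_key, value in columns.items():
            rotated.setdefault(col_key, {})[row_key] = value
    return rotated

by_location = pivot(by_item)
print(by_location)
```

The cell values are unchanged; only the presentation differs, which is why pivoting twice returns the original layout.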

Other OLAP operations (extra points for reference)

• Drill-across operation executes queries involving more than one fact table.

• Drill-through operation uses relational SQL facilities to drill through the bottom level of a data
cube down to its back-end relational tables.

• Other operations include ranking the top N or bottom N items in lists, as well as computing
moving averages, growth rates, interests, internal rates of return, depreciation, currency
conversions, and statistical functions.

• OLAP offers analytical modeling capabilities, including a calculation engine for deriving ratios,
variance, and so on, and for computing measures across multiple dimensions.

• It can generate summarizations, aggregations, and hierarchies at each granularity level and at
every dimension intersection.

• OLAP also supports functional models for forecasting, trend analysis, and statistical analysis.
In this context, an OLAP engine is a powerful data analysis tool.
4. a) What is data cleaning? Describe various approaches for cleaning data that have missing values.

b) Use the two methods below to normalize the following group of data:

200; 300; 400; 600; 1000

Min-max normalization, setting min = 0 and max = 1

Z-score normalization
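
The two methods can be worked through on the five values with a short Python sketch. One assumption is made here: the z-score uses the population standard deviation (`statistics.pstdev`); using the sample standard deviation instead would give slightly smaller magnitudes.

```python
from statistics import mean, pstdev

values = [200, 300, 400, 600, 1000]

def min_max(vs, new_min=0.0, new_max=1.0):
    """Map each value linearly from [min(vs), max(vs)] to [new_min, new_max]."""
    lo, hi = min(vs), max(vs)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in vs]

def z_score(vs):
    """Centre on the mean and scale by the population standard deviation."""
    mu, sigma = mean(vs), pstdev(vs)
    return [(v - mu) / sigma for v in vs]

scaled = min_max(values)
standardized = z_score(values)
print(scaled)
print([round(z, 4) for z in standardized])
```

With these inputs the mean is 500 and the population standard deviation is about 282.84, so the min-max results are 0, 0.125, 0.25, 0.5 and 1, and the z-scores are roughly -1.0607, -0.7071, -0.3536, 0.3536 and 1.7678.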

c) What is the value range of the following normalization methods?

Min-max normalization

z-score normalization

Normalization by decimal scaling
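
For reference, under the usual textbook definitions, min-max normalization maps values into the chosen interval [new_min, new_max], z-score normalization has no fixed bounds, and decimal scaling yields values strictly between -1 and 1. A minimal sketch of decimal scaling, applied to the same five values as in part (b), follows; the function name is illustrative.

```python
def decimal_scaling(values):
    """Divide by 10**j, where j is the smallest integer with max(|v'|) < 1."""
    largest = max(abs(v) for v in values)
    j = 0
    while largest / 10 ** j >= 1:
        j += 1
    return j, [v / 10 ** j for v in values]

j, scaled_values = decimal_scaling([200, 300, 400, 600, 1000])
print(j, scaled_values)
```

Because the largest value is 1000, dividing by 10^3 would give exactly 1, which is not below 1, so the sketch settles on j = 4.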

5. a) Write and explain pseudocode for the Apriori algorithm. Explain the terms

i) support count;

ii) confidence.
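
As a hedged illustration of these terms, the sketch below implements a simple level-wise Apriori search in Python. The transactions are a hypothetical toy example (not the database D of part b), and the function names are illustrative. The support count of an itemset is the number of transactions containing it; the confidence of a rule A → B is support_count(A ∪ B) / support_count(A).

```python
from itertools import combinations

def support_count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def apriori(transactions, min_sup):
    """Return {frequent itemset: support count} using level-wise search."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while level:
        counts = {c: support_count(c, transactions) for c in level}
        survivors = {c: n for c, n in counts.items() if n >= min_sup}
        frequent.update(survivors)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in survivors for b in survivors
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in survivors
                        for s in combinations(c, k - 1))]
    return frequent

def confidence(antecedent, consequent, transactions):
    """confidence(A -> B) = support_count(A union B) / support_count(A)."""
    both = support_count(antecedent | consequent, transactions)
    return both / support_count(antecedent, transactions)

# Hypothetical transactions, used only to exercise the functions.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

frequent = apriori(TRANSACTIONS, min_sup=2)
rule_conf = confidence(frozenset({"butter"}), frozenset({"bread"}), TRANSACTIONS)
print(frequent)
print(rule_conf)
```

In this toy run, {milk, butter} appears in only one transaction, so it is infrequent, and the candidate {bread, milk, butter} is pruned without being counted.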

b) Consider a database D, consisting of transactions. Suppose the minimum support count is 2 (i.e.,
min_sup = 20%) and the minimum confidence required is 70%. Find the frequent itemsets using the Apriori
algorithm. Explain each step with a diagram:

6. Draw a decision tree for the following data set. Use entropy as the node selection mechanism:

7. a) What are the main requirements for cluster analysis?

b) Explain the different basic clustering methods.

8. a) What is multilevel association rule mining? Explain the different approaches to multilevel
association rule mining.

b) Explain the naïve Bayesian classification algorithm.

9. a) With a neat diagram, explain the architecture of a data warehouse. Explain the terms ROLAP, MOLAP
and HOLAP.

b) What are the differences between the three main types of data warehouse usage: information
processing, analytical processing, and data mining? Discuss the motivation behind OLAP mining (OLAM).
