Data Mining Unit-IV

The document discusses key concepts in data mining, focusing on the differences between classification and prediction, issues in data preparation, and various algorithms used for data analysis. It also covers techniques for evaluating classifier accuracy, neural network predictive methods, and tools available for data mining such as DB Miner, DTREG, Weka, and DataMelt. Additionally, it outlines combining techniques like classification, clustering, regression, association rules, outlier detection, and prediction.

Uploaded by

kundurathin06101964

DATA MINING (UNIT-IV)

1. State the difference between classification and prediction.

Ans. The major differences between classification and prediction are:

| Classification | Prediction |
|---|---|
| Classification is the process of identifying which category a new observation belongs to, based on a training data set containing observations whose category membership is known. | Prediction is the process of identifying the missing or unavailable numerical data for a new observation. |
| In classification, the accuracy depends on finding the class label correctly. | In prediction, the accuracy depends on how well a given predictor can guess the value of a predicted attribute for new data. |
| In classification, the model is known as the classifier. | In prediction, the model is known as the predictor. |
| A model or classifier is constructed to find the categorical labels. | A model or predictor is constructed that predicts a continuous-valued function or ordered value. |
| For example, the grouping of patients based on their medical records can be considered a classification. | For example, we can think of prediction as predicting the correct treatment for a particular disease for a person. |

2. What are the issues regarding classification and prediction?

Ans. The major issue is preparing the data for classification and prediction.
Preparing the data involves the following activities −
Data Cleaning − Data cleaning involves removing noise and treating missing
values. Noise is removed by applying smoothing techniques, and missing values
are handled by replacing each missing value with the most commonly occurring
value for that attribute.
Relevance Analysis − The database may also contain irrelevant attributes.
Correlation analysis is used to determine whether any two given attributes are related.
Data Transformation and Reduction − The data can be transformed by either of the
following methods.
Normalization − Normalization involves scaling all values of a given attribute so
that they fall within a small specified range. It is used when the learning step
involves neural networks or methods based on distance measurements.
Generalization − The data can also be transformed by generalizing it to a
higher-level concept. Concept hierarchies can be used for this purpose.
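
The two data preparation steps described above, filling missing values with the most common value and normalizing into a small range, can be sketched in Python as follows. This is only an illustrative sketch: the function names, the use of None to mark a missing value, and the choice of min-max scaling into [0, 1] are assumptions, not something the notes prescribe.

```python
from collections import Counter


def fill_missing_with_mode(values):
    """Replace None entries with the most common non-missing value."""
    present = [v for v in values if v is not None]
    mode = Counter(present).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Scale numeric values linearly into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map everything to the lower bound.
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]


# Hypothetical attribute columns used only for illustration.
colours = ["red", None, "blue", "red", None]
print(fill_missing_with_mode(colours))  # ['red', 'red', 'blue', 'red', 'red']

ages = [20, 30, 40, 60]
print(min_max_normalize(ages))  # [0.0, 0.25, 0.5, 1.0]
```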

3. State:
i) Statistical based algorithm: A statistical or data mining algorithm is a mathematical
expression of certain aspects of the patterns it finds in data. Different algorithms provide
different perspectives on the complete nature of a pattern.

ii) Distance based algorithm: Distance-based algorithms are nonparametric
methods that can be used for classification. These algorithms classify objects by
the dissimilarity between them, as measured by a distance function such as
Euclidean distance.
iii) Neural-Network based algorithm: Neural networks are a series of
algorithms that mimic the operations of an animal brain to recognize
relationships in large amounts of data. As such, they
resemble the connections of neurons and synapses found in the brain.
iv) Rule based algorithm: Rule-based classification in data mining is a
technique in which class decisions are made using a set of
“if...then...else” rules. Thus, it is a classification type
governed by a set of IF-THEN rules. We write an IF-THEN rule as:
“IF condition THEN conclusion.”
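
As an illustration of the distance-based approach in (ii), the sketch below classifies a new point by majority vote among its nearest training points. It assumes k-nearest neighbours with Euclidean distance, which is one common distance-based classifier; the notes do not name a specific algorithm, and the function names and sample data are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_tuple, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training data: two well-separated groups labelled "A" and "B".
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]

print(knn_classify(train, (1.2, 1.4)))  # A
print(knn_classify(train, (6.8, 6.2)))  # B
```
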
4. What are the combining techniques in data mining?

Ans. Classification: This technique is used to obtain important and
relevant information about data and metadata. It helps to classify data
into different classes.

Clustering: Clustering is the division of information into groups of
connected objects. Describing the data by a few clusters loses
certain fine details but achieves simplification. It models data by
its clusters.

Regression: Regression analysis is the data mining process used to
identify and analyze the relationship between variables. It is used to
estimate the likelihood of a specific variable, given the presence of
other variables.

Association Rules: This data mining technique helps to discover a link
between two or more items. It finds hidden patterns in the data set.

Outlier detection: This type of data mining technique relates to the
observation of data items in the data set that do not match an expected
pattern or expected behavior.

Prediction: Prediction uses a combination of other data mining techniques
such as trends, clustering, and classification. It analyzes past events or
instances in the right sequence to predict a future event.
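
To make the regression technique above more concrete, here is a minimal sketch that fits a straight line to past observations and uses it to predict a new value. It assumes simple linear regression with a single input variable fitted by least squares, which is only one form of regression; the names fit_line and predict and the sample numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict a continuous value for a new input x."""
    return slope * x + intercept


# Hypothetical past observations that happen to lie on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)              # 2.0 1.0
print(predict(5, slope, intercept))  # 11.0
```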

5. What is the evaluation of the accuracy of a classifier or predictor?

Ans. The accuracy of a classifier is the number of correct predictions divided by
the total number of instances, expressed as a percentage. If the accuracy of the
classifier is considered acceptable, the classifier can be used to classify future
data tuples for which the class label is not known.
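
The accuracy calculation described above can be written as a short function. This is a sketch with hypothetical labels; the function name and the error handling are assumptions.

```python
def accuracy(actual, predicted):
    """Percentage of instances whose predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one instance is required")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * correct / len(actual)


# Hypothetical class labels: 4 of the 5 predictions are correct.
actual = ["yes", "no", "yes", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "no"]
print(accuracy(actual, predicted))  # 80.0
```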

6. What are the techniques to evaluate the accuracy of a classifier in data mining?

Ans. The techniques to evaluate the accuracy of classifiers are as follows.

HoldOut: In the holdout method, the dataset is randomly divided into
three subsets:

• The training set is the subset of the dataset used to
build predictive models.
• The validation set is the subset of the dataset used to
assess the performance of the model built in the training phase.
• The test set, or unseen examples, is the subset of the dataset used to
assess the likely future performance of the model.
Random Subsampling: Random subsampling is a variation of the holdout
method in which the holdout method is repeated K times.
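
A minimal sketch of the holdout split and of random subsampling is given below. The 60/20/20 split fractions, the function names, and the use of a seed to vary the shuffle are assumptions made for illustration; the notes do not fix any of these.

```python
import random


def holdout_split(data, train_frac=0.6, val_frac=0.2, seed=None):
    """Randomly divide data into training, validation and test subsets."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is the test set
    return train, val, test


def random_subsampling(data, k=5, seed=0):
    """Repeat the holdout split k times, each with a different shuffle."""
    return [holdout_split(data, seed=seed + i) for i in range(k)]


data = list(range(10))  # hypothetical dataset of 10 instances
train, val, test = holdout_split(data, seed=1)
print(len(train), len(val), len(test))  # 6 2 2

splits = random_subsampling(data, k=3)
print(len(splits))  # 3
```

In practice a model would be trained and scored on each of the K splits and the scores averaged.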

Cross-Validation

• K-fold cross-validation is used when only a limited
amount of data is available, to achieve an unbiased estimate of the
performance of the model.
• Here, we divide the data into K subsets of equal size. Each subset is
used once for testing while the remaining K-1 subsets are used for training.
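
The K-fold division can be sketched as a generator of index lists. This sketch assumes the data has already been shuffled, and when the number of instances is not divisible by K it lets fold sizes differ by at most one, which is a relaxation of the "equal size" wording above. The function name is hypothetical.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for each of k folds over n items."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    # The first n % k folds get one extra item so every item is used exactly once.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size


for train_idx, test_idx in k_fold_indices(6, 3):
    print(train_idx, test_idx)
# [2, 3, 4, 5] [0, 1]
# [0, 1, 4, 5] [2, 3]
# [0, 1, 2, 3] [4, 5]
```
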
Bootstrapping

• Bootstrapping is a technique used to make
estimates from the data by taking an average of the estimates
obtained from smaller data samples.
• The bootstrapping method involves the iterative resampling of a
dataset with replacement.
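
Resampling with replacement and averaging the resulting estimates can be sketched as follows. The statistic being estimated (the mean), the number of resamples, and the function names are assumptions chosen for illustration.

```python
import random


def bootstrap_estimate(data, statistic, n_resamples=1000, seed=None):
    """Average a statistic over resamples drawn from data with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Each resample has the same size as the data; items may repeat.
        sample = [rng.choice(data) for _ in range(n)]
        estimates.append(statistic(sample))
    return sum(estimates) / len(estimates)


def mean(xs):
    return sum(xs) / len(xs)


data = [2, 4, 4, 5, 7, 9]  # hypothetical measurements
print(bootstrap_estimate(data, mean, seed=42))
```

The printed value depends on the seed, but it should lie close to the ordinary mean of the data.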

7. What are the neural network predictive methods?

Ans. Predictive neural networks are a sophisticated data mining application that
imitates the function of the brain to detect patterns in data sets. These mathematical
models can detect subtle and complex relationships between
variables. This type of predictive modelling is used in energy & utilities, healthcare &
pharmaceuticals, insurance & reinsurance, finance & banking, manufacturing &
consumer goods, logistics & transportation, and other fields. Applications include:

• Price prediction
• Reserves estimation
• Fraud detection
• Credit advising
• Load forecasting
• Process modeling and control
• Portfolio management
• Financial planning
• Machine diagnostics
• Medical diagnosis and more

8. What are the tools in data mining?

Ans. DBMiner: DBMiner is a data mining system developed for
interactive mining of multiple-level knowledge in large relational
databases and data warehouses. The system implements a wide
spectrum of data mining functions, including characterization,
comparison, association, classification, prediction, and clustering.

DTREG: DTREG (pronounced D-T-Reg) builds classification and regression
decision trees, neural networks, support vector machines (SVM), GMDH
polynomial networks, gene expression programs, K-Means clustering,
discriminant analysis and logistic regression models that describe data
relationships and can be used to predict values for future observations. DTREG
also has full support for time series analysis.

Weka

• It is open source and free software.
• It is best suited for data analysis and predictive modelling.
• It contains algorithms and visualization tools that support data
mining tasks and machine learning.
• Weka has a GUI that gives easy access to all its features.
• It is written in Java.

DataMelt: DataMelt, also known as DMelt, is a computation and
visualization environment that offers an interactive structure for
data analysis and visualization. It is primarily designed for
students, engineers, and scientists. It consists of scientific
and mathematical libraries.

• Scientific libraries: Scientific libraries are used for
drawing 2D/3D plots.
• Mathematical libraries: Mathematical libraries are used
for random number generation, algorithms, curve fitting,
etc.

