Data mining tasks and functionalities
AND MINING
NISHITHA K C
LECTURER
DEPT.OF BCA
YIASCM
UNIT 1
DATA MINING
TABLE OF CONTENTS
•Data mining tasks
•Data mining functionalities
OBJECTIVE
•Students will be able to understand
data mining tasks.
•This presentation also introduces
data mining functionalities.
DATA MINING TASKS
• Data mining tasks can be classified into
two categories:
i. Descriptive
ii. Predictive.
i. Descriptive :
• It provides summary knowledge about the
data, for instance counts and averages.
• It shows what is happening inside the
data without requiring any prior
hypothesis.
• It exhibits the common features in the
data.
• In simple words, it describes the general
properties of the data.
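As a minimal sketch of a descriptive task, the snippet below computes a count and an average over a small list of sale amounts. The data and the names `sales` and `summary` are made up for illustration and are not part of the original slides.

```python
from statistics import mean

# Hypothetical sale amounts; not taken from any real dataset.
sales = [120.0, 80.0, 100.0, 140.0]

# Descriptive summaries describe the data as it is; nothing is predicted.
summary = {
    "count": len(sales),
    "average": mean(sales),
    "minimum": min(sales),
    "maximum": max(sales),
}
print(summary)
```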
ii. Predictive :
• It helps analysts infer characteristics
that are not explicitly available in the
data.
• For instance, predicting business
performance in the next quarter from the
performance of the previous quarters.
• In general, predictive analysis
predicts or infers characteristics from
previously available data.
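The quarterly example can be sketched with a deliberately naive trend extrapolation: take the average change between consecutive quarters and add it to the latest quarter. The figures and variable names are assumptions for illustration, and real forecasting would normally use a proper statistical model.

```python
# Hypothetical performance figures for the previous four quarters.
previous_quarters = [100.0, 110.0, 120.0, 130.0]

# Change from each quarter to the next one.
changes = [later - earlier
           for earlier, later in zip(previous_quarters, previous_quarters[1:])]
average_change = sum(changes) / len(changes)

# Naive estimate: the last observed value plus the average change.
next_quarter_estimate = previous_quarters[-1] + average_change
print(next_quarter_estimate)
```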
DATA MINING FUNCTIONALITIES
• The functionality of data mining includes :
i. Class/Concept Description: Characterization and
Discrimination
ii. Classification
iii. Prediction
iv. Association Analysis
v. Cluster Analysis
vi. Outlier Analysis
vii. Evolution & Deviation Analysis
viii. Mining Frequent Patterns
i. Class or Concept Description :
• Data can be associated with classes or
concepts.
• A class or concept description summarizes
the data that belongs to a given class or
concept.
• For example, in the Electronics store,
classes of items for sale include computers
and printers, and concepts of customers
Data Characterization :
Summarizing the general features of a
target class of data is called data
characterization.
It produces characteristic rules
for the target class.
The generalized data is presented in
various forms such as tables, pie charts,
line charts, bar charts, and graphs.
Data Discrimination :
It means distinguishing a class with the
help of some predefined group or
class.
It compares the common features of the
class under study with those of other
classes.
Data discrimination is a comparison
of the general features of target
class data objects with the general
features of objects from one or more
contrasting classes.
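Characterization and discrimination can be sketched together: summarize each class, then compare the summaries of a target class and a contrasting class. The records, the class labels, and the helper `characterize` are hypothetical and only serve as an illustration.

```python
from statistics import mean

# Hypothetical customer records: (class label, yearly spend, visits per year).
records = [
    ("frequent_buyer", 900.0, 24),
    ("frequent_buyer", 1100.0, 30),
    ("occasional_buyer", 200.0, 3),
    ("occasional_buyer", 300.0, 5),
]

def characterize(label):
    """Summarize the general features of one class (data characterization)."""
    rows = [r for r in records if r[0] == label]
    return {
        "avg_spend": mean(r[1] for r in rows),
        "avg_visits": mean(r[2] for r in rows),
    }

target = characterize("frequent_buyer")
contrast = characterize("occasional_buyer")

# Data discrimination: compare the target class with the contrasting class.
difference = {key: target[key] - contrast[key] for key in target}
print(target, contrast, difference)
```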
ii. Classification :
• Classification is the process of finding a
model that describes and distinguishes
data classes for the purpose of being able
to use the model to predict the class of
objects whose class label is unknown.
• It uses models built from labelled data
to predict class labels for new data.
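One of the simplest models that fits this description is a 1-nearest-neighbour classifier, sketched below. The slides do not name a specific algorithm, so this choice, the training data, and the function `classify` are assumptions for illustration. The features are left unscaled here, so the price dominates the distance.

```python
# Hypothetical labelled training data: (price, weight in kg) -> class label.
training = [
    ((900.0, 2.0), "computer"),
    ((1200.0, 2.5), "computer"),
    ((150.0, 6.0), "printer"),
    ((200.0, 7.0), "printer"),
]

def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(training, key=lambda example: squared_distance(example[0], features))
    return nearest[1]

# Objects whose class label is unknown.
print(classify((1000.0, 2.2)))
print(classify((180.0, 6.5)))
```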
iii. Prediction :
• It is a forecasting technique that estimates values
in the future.
• A large set of past values is needed to predict future
trends.
• Prediction can also fill in missing numeric values in the data.
• It uses regression analysis to estimate the unavailable data.
• If the class label is missing, then the prediction is done
using classification.
• There are two ways one can predict data:
Predicting unavailable or missing numeric data using
regression analysis
Predicting a missing class label using a classification
model
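Filling a missing numeric value with regression can be sketched as follows: fit a straight line on the complete records, then use it to estimate the missing one. The records and the attribute names (advertising spend, sales) are hypothetical.

```python
from statistics import mean

# Hypothetical records: (advertising spend, sales); None marks a missing value.
records = [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0), (4.0, None)]

# Fit simple linear regression on the complete records only.
known = [(x, y) for x, y in records if y is not None]
x_mean = mean(x for x, _ in known)
y_mean = mean(y for _, y in known)
slope = (sum((x - x_mean) * (y - y_mean) for x, y in known)
         / sum((x - x_mean) ** 2 for x, _ in known))
intercept = y_mean - slope * x_mean

# Replace each missing value with the value predicted by the fitted line.
filled = [(x, y if y is not None else intercept + slope * x) for x, y in records]
print(filled)
```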
iv. Association Analysis :
• It relates two or more attributes of the data.
• It discovers relationships in the data
and the rules that bind them.
• It associates attributes that frequently
occur together in transactions.
• It finds what are called association rules,
which are widely used in market basket analysis.
• Eg : The suggestion that Amazon shows at the
bottom of the page, “Customers who bought this also
bought …”
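Association rules are usually judged by their support and confidence. The sketch below computes both for the rule "bread implies butter" over a handful of made-up baskets. The transactions and function names are assumptions for illustration.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Share of transactions with the antecedent that also hold the consequent."""
    return support(antecedent | consequent) / support(antecedent)

rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(rule_support, rule_confidence)
```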
v. Cluster Analysis :
• Unsupervised classification is called
cluster analysis.
• It is similar to classification in that
the data are grouped.
• Unlike classification, in cluster analysis,
the class label is unknown.
• Data are grouped based on clustering
algorithms.
• Clustering analyses data objects
without consulting a known class label.
Fig : Cluster
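As a sketch of a clustering algorithm, the snippet below runs a very small k-means on one-dimensional data. No class labels are used; the groups emerge from the values alone. The slides do not name an algorithm, so k-means, the sample ages, and the starting centroids are assumptions.

```python
def k_means_1d(values, centroids, iterations=10):
    """Very small k-means for one-dimensional data."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical customer ages with no class labels attached.
ages = [18, 20, 22, 60, 62, 64]
centroids, clusters = k_means_1d(ages, centroids=[18.0, 64.0])
print(centroids, clusters)
```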
vi. Outlier Analysis :
• Outlier analysis is used when data objects
appear that cannot be grouped into any of
the classes.
• Some data objects have attributes that
differ from those of every class or from
the general model of the data.
• Such data objects are called outliers.
• They are usually considered noise or
exceptions.
• Although they are often discarded as noise,
outliers can carry valuable information in
many applications.
• They are also called exceptions or surprises, and
identifying them is often important.
• Outliers can be identified using statistical tests
that estimate how probable a value is under
an assumed distribution.
• Other names for outliers are:
Deviants
Abnormalities
Discordant
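One simple statistical check is the z-score: a value is flagged when it lies more than a chosen number of standard deviations from the mean. The sketch below assumes a threshold of 2.0 and made-up readings. Both are illustrative choices, and a z-score test is only one of many possible tests.

```python
from statistics import mean, pstdev

# Hypothetical readings; one of them is far away from the rest.
readings = [10, 11, 9, 10, 12, 10, 11, 50]

center = mean(readings)
spread = pstdev(readings)  # population standard deviation
threshold = 2.0            # assumed cut-off, chosen for illustration

# Flag every value whose z-score exceeds the threshold.
outliers = [x for x in readings if abs(x - center) / spread > threshold]
print(outliers)
```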
vii.Evolution and Deviation Analysis :
• Evolution analysis gives a time-related
clustering of the data.
• Trends and changes in behaviour over
a period can be found.
• Such analysis can reveal time-series
trends, periodicity, and similarity
between trends.
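A trend over time can be made visible by smoothing a series with a moving average, as in the sketch below. The monthly figures, the window size of 3, and the helper `moving_average` are assumptions for illustration.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

# Hypothetical monthly sales with small ups and downs.
monthly_sales = [10, 12, 11, 13, 15, 14, 16, 18]

trend = moving_average(monthly_sales, window=3)

# The smoothed series never decreases, which suggests an upward trend.
is_rising = all(later >= earlier for earlier, later in zip(trend, trend[1:]))
print(trend, is_rising)
```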
viii. Mining frequent patterns :
• Frequent patterns are patterns
that occur most often in
the data.
• Different kinds of frequent patterns
can be observed in a dataset.
They are :
a) Frequent item set:
This refers to a set of items that
often appear together.
b) Frequent Subsequence:
This refers to a sequence of patterns that
occurs frequently. Eg : Purchasing
a phone followed by a back cover.
c) Frequent Substructure:
It refers to the different kinds of data
structures such as trees and graphs
that may be combined with the item
set or subsequence.
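The frequent-subsequence case can be sketched by counting, for every ordered pair of items, how many purchase histories contain the first item before the second. The histories and the minimum support of 3 are hypothetical, and the sketch only handles subsequences of length two.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories, each listed in time order.
histories = [
    ["phone", "back cover", "charger"],
    ["phone", "charger", "back cover"],
    ["laptop", "mouse"],
    ["phone", "back cover"],
]

min_support = 3  # assumed threshold: pattern must occur in at least 3 histories

pair_counts = Counter()
for history in histories:
    # combinations() keeps list order, so (a, b) means "a was bought before b".
    # The set ensures each pair is counted at most once per history.
    pair_counts.update(set(combinations(history, 2)))

frequent_subsequences = {pair: n for pair, n in pair_counts.items()
                         if n >= min_support}
print(frequent_subsequences)
```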
CONCLUSION
• Data mining tasks can be classified into two
categories: Descriptive and Predictive.
• The functionality of data mining includes:
Class/Concept Description (Characterization and
Discrimination), Classification, Prediction, Association
Analysis, Cluster Analysis, Outlier Analysis, Evolution
& Deviation Analysis, and Mining Frequent Patterns.
REFERENCE
• https://www.upgrad.com/blog/data-mining-functionalities
• https://www.geeksforgeeks.org/tasks-and-functionalities-of-data-mining
Thank you