
Application of Data Mining - A Survey Paper: Aarti Sharma, Rahul Sharma, Vivek Kr. Sharma, Vishal Shrivatava

The document is a survey paper on data mining, highlighting its importance in converting raw data into useful information across various fields, particularly in medicine. It outlines the data mining process, techniques such as classification, clustering, and prediction, and their applications in areas like marketing and fraud detection. The paper emphasizes the role of data mining in decision support and disease prediction, showcasing its potential to enhance healthcare outcomes.


Aarti Sharma et al, / (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 5 (2), 2014, 2023-2025

Application of Data Mining – A Survey Paper

Aarti Sharma, Rahul Sharma, Vivek Kr. Sharma, Vishal Shrivatava

Department of CS & IT,
A.C.E.I.T., Jaipur

Abstract: Data mining is a powerful new field with a variety of techniques. It converts raw data into useful information in various research fields. It helps in finding patterns that indicate future trends in the medical field.

Keywords: Data mining, information prediction, raw data.

I. INTRODUCTION
The development of information technology has generated a large number of databases and huge amounts of data in various research fields. Research in knowledge mining has given rise to ways of storing data and manipulating previously stored data for further decision making.

II. DATA MINING PROCESS

Data mining is used to extract implicit and previously unknown information from data. It has attracted the attention of users because of the wide availability of huge amounts of data and the need to convert such data into useful information. For this reason, many people use the term “knowledge discovery in databases”, or KDD, for data mining.

Knowledge extraction or discovery is done in seven sequential steps:

1) Data cleaning: At this step, we remove noisy and irrelevant data from the collected raw data.
2) Data integration: At this step, we combine multiple data sources into a single data store called the target data.
3) Data selection: Here, data relevant to the analysis task are retrieved from the database as pre-processed data.
4) Data transformation: Here, data are consolidated into standard formats appropriate for mining by summary and aggregation operations.
5) Data mining: At this step, various smart techniques and tools are applied in order to extract data patterns or rules.
6) Pattern evaluation: At this step, we identify the truly interesting patterns representing knowledge.
7) Knowledge representation: This is the last stage, in which visualization and knowledge representation techniques are used to help users understand and interpret the data mining results.

The goal of the knowledge discovery and data mining process is to find the patterns hidden in a huge set of data and to interpret them as useful knowledge and information.

Fig 1. Data Mining Process

In the diagram, data mining is the main part of the knowledge discovery process.

Data mining applications:
• Marketing: Customer profiling, retention, identification of potential customers, market segmentation.
• Fraud detection: Identifying credit card fraud and intrusion detection.
• Scientific data analysis: Identifying the data needed for research decision making.
• Text and web mining: Used to search for text or information on the web or in given raw data.
• Any other applications that involve large amounts of data.

III. DATA MINING TECHNIQUES [1]
Various major data mining techniques have been developed and used in recent data mining projects for knowledge discovery from databases, including association, classification, clustering, prediction and evaluation pattern.

1. Association: It is one of the most popular data mining techniques. In this technique we mine frequent patterns, which leads to the discovery of interesting associations and correlations within data.
Example: The association technique is used in marketing analysis to identify items which are frequently purchased within the same transaction.
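To make the market-basket idea concrete, the following is a minimal sketch, not a method taken from the paper. It counts item pairs that occur in the same transaction and derives the two usual measures of an association rule: support (the fraction of all transactions containing both items) and confidence (the fraction of transactions containing the first item that also contain the second). The transaction data and the function names `frequent_pairs`, `support` and `confidence` are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket data: each set is one transaction.
transactions = [
    {"computer", "software", "printer"},
    {"computer", "software"},
    {"computer", "printer"},
    {"computer"},
    {"printer", "paper"},
]


def frequent_pairs(transactions, min_count):
    """Return item pairs bought together in at least min_count transactions."""
    counts = Counter()
    for basket in transactions:
        # Sorting gives each pair one canonical (alphabetical) order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Fraction of transactions with the antecedent that also hold the consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(transactions, set(antecedent) | set(consequent)) / base


print(frequent_pairs(transactions, min_count=2))
s = support(transactions, {"computer", "software"})
c = confidence(transactions, {"computer"}, {"software"})
print(f"support = {s:.0%}, confidence = {c:.0%}")  # support = 40%, confidence = 50%
```

On this toy data the rule "computer implies software" holds in 2 of 5 transactions (support 40%) and in 2 of the 4 transactions that contain a computer (confidence 50%). Practical tools prune the search with algorithms such as Apriori instead of counting every pair.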


An example of such a rule, mined from the AllElectronics transactional database, is
buys(X, “computer”) ⇒ buys(X, “software”) [support = 1%, confidence = 50%]
where X is a variable representing a customer. A confidence, or certainty, of 50% means that if a customer buys a computer, there is a 50% chance that she will buy software as well. A support of 1% means that 1% of all the transactions under analysis showed that computer and software were purchased together. Rules like this one, which contain a single predicate (buys), are referred to as single-dimensional association rules. Dropping the predicate notation, the above rule can be written simply as “computer ⇒ software [1%, 50%]”.

2. Classification: It is the process of finding a model or function that describes and distinguishes data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown.
In classification, we build software that can learn how to classify data items into groups. The derived model can be presented as classification rules.
Classification techniques:
• Regression
• Distance
• Decision trees
• Rules
• Neural networks

3. Clustering: The process of grouping a set of physical or abstract objects into classes of similar objects is called clustering.
A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters.

Fig 2. Clustering

By clustering we can identify dense and sparse regions in object space and discover distribution patterns and interesting correlations among data attributes. Clustering is also called data segmentation.
In earth observation, it helps to identify areas of similar land use. It can also identify groups of houses in a city according to house type and geographic location.

Prediction: Whereas classification predicts categorical (discrete, unordered) labels, prediction models continuous-valued functions. That is, prediction is used to predict missing or unavailable numerical data values rather than class labels. However, the term prediction may refer to both numeric prediction and class label prediction.
Example: Regression analysis is a statistical methodology that is most often used for numeric prediction, although other methods exist as well. Prediction also encompasses the identification of distribution trends based on the available data.
Applications of prediction:
• Credit approval
• Target marketing
• Medical diagnosis
• Treatment effectiveness analysis

4. Evaluation pattern (evolution analysis):
Data evolution analysis describes and models regularities or trends for objects whose behavior changes over time. Although this may include characterization, discrimination, association and correlation analysis, classification, prediction, or clustering of time-related data, distinct features of such an analysis include time-series data analysis, sequence or periodicity pattern matching, and similarity-based data analysis.
Example: Evolution analysis. Suppose that you have the major stock market (time-series) data of the last several years available from the New York Stock Exchange and you would like to invest in shares of high-tech industrial companies. A data mining study of stock exchange data may identify stock evolution regularities for overall stocks and for the stocks of particular companies. Such regularities may help predict future trends in stock market prices, contributing to your decision making regarding stock investments.

Selected data mining techniques in medicine
Various data mining techniques are available, and their suitability depends on the domain application.
By using data mining we can examine the large number of routine samples collected for disease prediction. The best results are achieved by balancing the knowledge of experts in describing the problem and goals with search capabilities.
Hospitals also want to minimize the cost of clinical tests. This can be achieved by employing an appropriate computer-based information and decision support system. Here, data mining plays an important role in producing results faster and more accurately by using various algorithms.
There are two primary goals of data mining: prediction and description. Prediction uses fields or variables in the data sets to predict unknown or future values of other variables, such as the possibility of a disease. Description, on the other hand, involves finding patterns that describe the data and that can be placed in a knowledge base provided for disease prediction.
We can predict diseases such as hepatitis, lung cancer, liver disorder, breast cancer, heart disease and diabetes.
We can use the naive algorithm, the Rabin-Karp algorithm, the K-NN algorithm and decision trees, which are among the most popular techniques and are simple to implement. They can handle large amounts of high-dimensional data.
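As a rough illustration of how simple such a technique can be, the sketch below implements K-NN by hand: a new record is given the majority label of its k nearest training records. It is a sketch under stated assumptions, not the procedure used in any cited study. The patient records, the choice of features (age, systolic blood pressure, blood sugar), the labels and the function name `knn_predict` are all invented for illustration and carry no medical meaning.

```python
import math
from collections import Counter

# Hypothetical records: (age, systolic blood pressure, blood sugar) -> label.
training = [
    ((45, 130, 150), "disease"),
    ((50, 140, 160), "disease"),
    ((62, 150, 180), "disease"),
    ((30, 115, 90), "healthy"),
    ((35, 120, 95), "healthy"),
    ((28, 110, 85), "healthy"),
]


def knn_predict(training, query, k=3):
    """Classify query by majority vote among its k nearest training records."""
    # Sort the training records by Euclidean distance to the query point.
    by_distance = sorted(training, key=lambda rec: math.dist(rec[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(knn_predict(training, (33, 118, 92)))    # healthy
print(knn_predict(training, (55, 145, 170)))   # disease
```

The sketch uses raw feature values. In practice the features would normally be scaled to comparable ranges first, because otherwise the attribute with the largest numeric range dominates the distance.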


Example: We can use the naive algorithm with attributes like age, sex, blood pressure and blood sugar to predict the chances of a diabetes patient getting heart disease.
The naive algorithm is used to analyze alpha hemoglobin or beta hemoglobin in tests of hemoglobin in red blood cells. It can also be used for DNA tests.
A decision tree can be used to represent results in the form of a tree. Internal nodes are labeled with attributes, the branches coming out of an internal node are labeled with the values of the attribute in that node, and leaf nodes are labeled with decisions. This technique is well suited to data mining in medicine and disease prediction.
Example: Finding a solution with the help of decision trees starts by preparing a set of solved cases [5]. The whole set is then divided into 1) a training set, which is used for the induction of a decision tree, and 2) a testing set, which is used to check the accuracy of the obtained solution. Each attribute can represent one internal node in a generated decision tree, also called an attribute node or a test node (Fig 3). Such an attribute node has exactly as many branches as its number of different value classes. The leaves of a decision tree are decisions and represent the value classes of the decision attribute, that is, the decision classes (Fig 3).
The decision tree is very easy to interpret. For example, from the tree shown in Fig 3 we can deduce the following two rules:
1. if the patient has inter-systolic noise and MCI and heart malformations then she/he has a prolapse, and
2. if the patient has inter-systolic noise and MCI and no heart malformations then she/he does not have a prolapse.
Here, MCI and Pre-cordial Pain are attribute (test) nodes in a growing decision tree, and the leaf nodes are the decision nodes.

Fig 3. An example of a (part of a) decision tree. [5]

A decision tree is built from the set of training objects with a “divide and conquer” approach. If all objects are of the same class, the decision tree consists of a single leaf node. Otherwise, an attribute node is created that has at least two branches, and for each branch from that node the induction procedure is repeated on the remaining objects of that division until a leaf node is produced.
There are many other techniques used to represent data when analyzing the results, such as:
• Genetic algorithms.
• Fuzzy sets.
• Neural networks.
• Rough sets.
• Support vector machine (SVM).
We can use these techniques to classify sets of objects as either positive or negative results of a test performed to check the fitness or illness of a patient. These techniques can also be extended to analyze diseases with multi-class decision making algorithms.

IV. CONCLUSION
Data mining is a “decision support” process in which we search for patterns of information in data. Data mining techniques include classification, clustering, prediction, association and sequential patterns.
Commercial, educational and scientific applications are increasingly dependent on these methodologies.
Decision trees are a reliable and effective decision making technique which provides high classification accuracy with a simple representation of the collected knowledge. They help experts to validate and classify the results and outcomes of tests and to analyze various new symptoms of diseases based on data.
Thus, data mining can play an important role in the field of medicine, health care and disease prediction.

REFERENCES
(Journal papers):
[1] Kalyani et al., International Journal of Advanced Research in Computer Science and Software Engineering, ISSN: 2277 128X, Volume 2, Issue 10, October 2012.
[2] Shalini Sharma, Vishal Shrivastava, International Journal on Recent and Innovation Trends in Computing and Communication, ISSN 2321-8169, Volume 1, Issue 4, March 2013.
[3] Megha Gupta, Vishal Shrivastava, International Journal on Recent and Innovation Trends in Computing and Communication, ISSN 2321-8169, Volume 1, Issue 8, August 2013.
[4] S. Vijiyarani, S. Sudha, Disease Prediction in Data Mining Technique – A Survey, International Journal of Computer Applications & Information Technology, ISSN: 2278-7720, Vol. II, Issue I, January 2013.
[5] Vili Podgorelec, Peter Kokol, Bruno Stiglic, Ivan Rozman, Decision trees: an overview and their use in medicine, Journal of Medical Systems, Kluwer Academic/Plenum Press, Vol. 26, Num. 5, pp. 445-463, October 2002.
(Books):
[6] Han and Kamber, “Data Mining: Concepts and Techniques”.
