Data Mining: V Mounika Revathi Dept of Cse Sitam

Data mining involves applying techniques from statistics and artificial intelligence to uncover hidden patterns in large data sets. As data sets have increased in size and complexity, automated data analysis methods have become more important. Data mining bridges applied statistics and database management by exploiting how data is stored in databases to efficiently apply learning algorithms to large data sets. It involves tasks like classification, clustering, and association analysis to organize data, discover patterns, and make predictions.

Data Mining

V MOUNIKA REVATHI
DEPT OF CSE
SITAM
Background
 The manual extraction of patterns from data has occurred for centuries. Early
methods of identifying patterns in data include Bayes’ Theorem (1700s)
and Regression Analysis (1800s).
 As datasets have grown in size and complexity, direct "hands-on" data analysis
has increasingly been augmented with indirect, automated data processing, aided
by other discoveries in computer science, such as neural networks, cluster
analysis, genetic algorithms (1950s), decision trees (1960s), and support vector
machines (1990s).
 Data mining is the process of applying these methods with the intention of
uncovering hidden patterns in large data sets.
 It bridges the gap from applied statistics and artificial intelligence (which usually
provide the mathematical background) to database management by exploiting the
way data is stored and indexed in databases to execute the actual learning and
discovery algorithms more efficiently, allowing such methods to be applied to
ever larger data sets.
SYLLABUS
I. INTRODUCTION
II. PRE-PROCESSING
III. CLASSIFICATION : BASIC CONCEPTS
IV. CLASSIFICATION : ALTERNATIVE TECHNIQUES
V. ASSOCIATION ANALYSIS
VI. CLUSTER ANALYSIS
Introduction
 Data is produced at a phenomenal rate
 Our ability to store data has grown
 Users expect more sophisticated information
 How?
UNCOVER HIDDEN INFORMATION
DATA MINING

LIET 4
Query Examples
 Database
 Find all credit applicants with last name of Smith.
 Identify customers who have purchased more than $10,000 in the last
month.
 Find all customers who have purchased milk.

 Data Mining
 Find all credit applicants who are poor credit risks. (classification)
 Identify customers with similar buying habits. (clustering)
 Find all items which are frequently purchased with milk. (association
rules)
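The contrast between the two kinds of question can be sketched in a few lines of Python. This is only an illustration, not material from the slides: the `baskets` data and the function names `customers_who_bought` and `items_bought_with` are made up for the example. The first function answers a database-style query with a fixed predicate; the second looks for items that co-occur with milk without being told which ones to look for.

```python
from collections import Counter

# Hypothetical market-basket data (assumption): customer -> set of purchased items.
baskets = {
    "alice": {"milk", "bread", "eggs"},
    "bob": {"milk", "bread"},
    "carol": {"bread", "jam"},
    "dave": {"milk", "eggs", "bread"},
}

def customers_who_bought(baskets, item):
    """Database-style query: return the rows that match a stated predicate."""
    return sorted(name for name, items in baskets.items() if item in items)

def items_bought_with(baskets, item, min_count=2):
    """Mining-style question: count which other items co-occur with `item`
    and keep those seen in at least `min_count` baskets."""
    counts = Counter()
    for items in baskets.values():
        if item in items:
            counts.update(items - {item})
    return {other: n for other, n in counts.items() if n >= min_count}

print(customers_who_bought(baskets, "milk"))
print(items_bought_with(baskets, "milk"))
```

The `min_count` threshold is an arbitrary choice for this sketch; real association-rule miners use support and confidence thresholds instead of a raw count.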

Data Mining Models and Tasks

Basic Data Mining Tasks
 Classification maps data into predefined groups or
classes
 Supervised learning
 Pattern recognition
 Prediction
 Regression is used to map a data item to a real-valued
prediction variable.
 Clustering groups similar data together into clusters.
 Unsupervised learning
 Segmentation
 Partitioning
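As a rough illustration of how the three tasks above differ, the following Python sketch works on one-dimensional data using only the standard library. The function names and the specific techniques (nearest-centroid classification, a least-squares line, two-cluster k-means) are choices made for this example, not methods prescribed by the slides. `two_means` assumes the input contains at least two distinct values.

```python
from statistics import mean

def nearest_centroid_classify(labeled, x):
    """Classification (supervised): assign x to the predefined class
    whose mean training value is closest."""
    centroids = {}
    for label in {lab for _, lab in labeled}:
        centroids[label] = mean(v for v, lab in labeled if lab == label)
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))

def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for a real-valued target."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def two_means(values, iterations=10):
    """Clustering (unsupervised): split unlabeled values into two groups
    (1-D k-means with k=2; assumes at least two distinct values)."""
    c1, c2 = min(values), max(values)
    for _ in range(iterations):
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        c1, c2 = mean(g1), mean(g2)
    return sorted(g1), sorted(g2)

# Hypothetical data: (income in thousands, credit-risk label).
applicants = [(20, "poor"), (25, "poor"), (60, "good"), (70, "good")]
print(nearest_centroid_classify(applicants, 30))
print(fit_line([1, 2, 3], [2, 4, 6]))
print(two_means([1, 2, 3, 10, 11, 12]))
```

Note that only the classifier and the regression receive known answers (labels or target values) as input; the clustering function sees the raw values alone, which is what makes it unsupervised.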
Basic Data Mining Tasks (cont’d)
 Summarization maps data into subsets with
associated simple descriptions.
 Characterization
 Generalization
 Link Analysis uncovers relationships among data.
 Affinity Analysis
 Association Rules
 Sequential Analysis determines sequential patterns.
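Association rules are usually scored by support (the fraction of all transactions containing every item in the rule) and confidence (the fraction of transactions containing the antecedent that also contain the consequent). The slides do not give these definitions; the sketch below uses the standard ones, and the function name `rule_metrics` and the sample transactions are hypothetical. It assumes a non-empty list of transactions.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    transactions: non-empty list of item sets (assumption for this sketch).
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(transactions)
    # Transactions that contain every antecedent item.
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    # Transactions that contain every item of the whole rule.
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n_total
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

# Hypothetical transactions for illustration.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread"},
    {"milk"},
]
support, confidence = rule_metrics(transactions, {"milk"}, {"bread"})
print(f"milk -> bread: support={support:.2f}, confidence={confidence:.2f}")
```

Returning 0.0 confidence when the antecedent never occurs is a convention chosen here to avoid division by zero; other treatments, such as leaving the value undefined, are equally reasonable.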

Social Implications of DM
 Privacy
 Profiling
 Unauthorized use

The Scope of Data Mining
 Given databases of sufficient size and quality, data mining
technology can generate new business opportunities by
providing these capabilities:
 Automated prediction of trends and behaviors
 Automated discovery of previously unknown patterns
 Data mining techniques can yield the benefits of automation
on existing software and hardware platforms, and can be
implemented on new systems as existing platforms are
upgraded and new products developed.
 A recent Gartner Group Advanced Technology Research Note
listed data mining and artificial intelligence at the top of the
five key technology areas.
Applications & Research Scope
 As data mining matures, new and increasingly innovative
applications for it emerge.
 Although a wide variety of data mining scenarios can be
described, the applications of data mining can be divided into the
following categories:
 Healthcare
 Finance
 Retail industry
 Telecommunication
 Text Mining & Web Mining
 Higher Education
Course Objectives
 Students will be able to understand and implement
classical models and algorithms in data warehousing and data
mining.
 They will learn how to analyze the data, identify the
problems, and choose the relevant models and algorithms to
apply.
 They will further be able to assess the strengths and
weaknesses of various methods and algorithms and to analyze
their behavior.

Course Outcomes
 Understand why there is a need for a data warehouse in addition
to traditional operational database systems.
 Identify components in typical data warehouse architectures.
 Design a data warehouse and understand the process required to
construct one.
 Understand why there is a need for data mining and in what
ways it is different from traditional statistical techniques.
 Understand the details of different algorithms made available by
popular commercial data mining software.
 Solve real data mining problems by using the right tools to find
interesting patterns.
