
IDW Lecture 31 - Basic Concepts About Data Mining

Data mining involves analyzing large datasets from various domains to extract useful information and knowledge. It encompasses techniques such as classification, clustering, and association analysis, which help in discovering patterns and trends in data. Major issues in data mining include security concerns, user interface challenges, and performance limitations when dealing with large datasets.

Uploaded by

talha

Introduction to Data Warehouse
Lecture # 31
Basic concepts about Data Mining
Data Mining
A huge amount of data is available from different domains.
That data can be analyzed to gain knowledge.
Data mining helps end users extract useful information from large databases.
Data warehousing makes it possible to build data mountains.
Data mining is a technique for extracting previously unknown knowledge from those data mountains.
Data mining is also referred to as knowledge discovery.
Data collections and data availability
Business Transactions.
Scientific Data.
Medical and Personal data.
Surveillance video and pictures.
Games.
Text reports and memos (e-mail messages).
World Wide Web repositories.
Data that can be mined
Flat files.
Relational databases.
Data warehouses.
Transaction databases.
Multimedia databases.
Spatial databases.
Time series databases.
World Wide Web.
Data mining: what can be discovered?
Descriptive data mining
Predictive data mining
Data mining functionalities:
Characterization: summarization of the general features of objects in a target class.
Discrimination: comparison of the general features of objects between two classes.
Association analysis: discovery of association rules.
It studies items that occur together in a transactional database and keeps the itemsets whose frequency meets a threshold called support.
A second threshold, called confidence, is the conditional probability that an item appears in a transaction when another item appears.
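The two thresholds can be illustrated with a minimal Python sketch. The transactions, item names, and the helper functions `support` and `confidence` are hypothetical and chosen only for illustration; they do not come from the lecture. Support is taken here as the fraction of transactions containing an itemset, and confidence as the ratio of two supports, which is one common way to estimate the conditional probability.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy transaction database (not from the lecture).
transactions: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(itemset: Iterable[str], db: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    return sum(1 for t in db if items <= t) / len(db)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               db: List[FrozenSet[str]]) -> float:
    """Estimate P(consequent | antecedent) as support(A and C) / support(A).

    Assumes the antecedent occurs in at least one transaction.
    """
    a = frozenset(antecedent)
    c = frozenset(consequent)
    return support(a | c, db) / support(a, db)


# Rule "bread -> milk": both items appear together in 2 of 4 transactions,
# and milk appears in 2 of the 3 transactions that contain bread.
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

A rule would be kept only if both values meet the minimum support and minimum confidence chosen by the analyst.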
Classification: organization of data into given classes.
Supervised classification uses given class labels to order the objects in the data collection.
Prediction: forecasting related to some data.
Prediction of unavailable data values or pending trends.
Prediction of a class label from some data.
Clustering: organization of data into classes. Class labels are unknown, and the algorithm discovers acceptable classes.
Clustering approaches are based on maximizing the similarity of objects in the same class (intra-class similarity) and minimizing the similarity between objects of different classes (inter-class similarity).
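The clustering criterion can be checked on a small example. The sketch below is an assumption-laden illustration, not a clustering algorithm: the points and the grouping are hypothetical, and Euclidean distance is used as an inverse stand-in for similarity (small distance means high similarity). It only measures how well a candidate grouping satisfies the criterion.

```python
from itertools import combinations
from math import dist
from statistics import mean

# Hypothetical 2-D points with a candidate grouping. In real clustering the
# grouping is what the algorithm has to discover; here it is given so the
# criterion itself can be evaluated.
clusters = {
    "a": [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)],
    "b": [(8.0, 8.0), (9.0, 8.0), (8.0, 9.0)],
}


def mean_intra_distance(groups):
    """Average distance between pairs of points that share a cluster."""
    pairs = [
        dist(p, q)
        for pts in groups.values()
        for p, q in combinations(pts, 2)
    ]
    return mean(pairs)


def mean_inter_distance(groups):
    """Average distance between pairs of points from different clusters."""
    pairs = [
        dist(p, q)
        for (_, pts1), (_, pts2) in combinations(groups.items(), 2)
        for p in pts1
        for q in pts2
    ]
    return mean(pairs)


intra = mean_intra_distance(clusters)  # small value: members are alike
inter = mean_inter_distance(clusters)  # large value: clusters are far apart
```

A grouping fits the criterion better when `intra` is small relative to `inter`.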
Data mining: what can be done?
Outlier analysis: study of data elements that cannot be grouped in a given class or cluster.
They are also called exceptions or surprises.
They can reveal important knowledge in other domains.
Evolution and deviation analysis: study of time-related data that changes over time.
Evolution analysis models evolutionary trends in data.
Deviation analysis considers the difference between measured values and expected values.
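Deviation analysis can be sketched in a few lines of Python. The series and the threshold below are hypothetical, and flagging a point when its absolute deviation exceeds a fixed threshold is only one simple way to act on the difference between measured and expected values.

```python
# Hypothetical measured and expected values for five time steps.
measured = [10.2, 9.8, 10.1, 14.9, 10.0]
expected = [10.0, 10.0, 10.0, 10.0, 10.0]

# Hypothetical tolerance; in practice it depends on the domain.
threshold = 2.0

# Deviation at each time step: measured value minus expected value.
deviations = [m - e for m, e in zip(measured, expected)]

# Indices of time steps whose absolute deviation exceeds the tolerance.
flagged = [i for i, d in enumerate(deviations) if abs(d) > threshold]
```

Here only the fourth time step (index 3) is flagged, since its measured value is far from the expected one. Such a point is also a candidate outlier in the sense described above.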
Classification of data mining systems
Type of data source mined.
Spatial, multimedia, time series, text data.
Data model used.
Relational, object-oriented, data warehouse, transactional.
Kind of knowledge discovered.
Characterization, association, clustering.
Mining technique used.
Machine learning.
Neural networks.
Query-driven systems.
Major issues in data mining
Security and social issues: information collected for data mining that relates to persons may be of a private nature.
User interface issues: screen real estate, rendering, and interaction with the user.
Mining methodology issues: limitations of the data mining approach and the structure of the data passed to the technique.
Performance issues: many AI and statistical methods were not designed for the analysis of large data sets.
Data source issues: diversity of data sources, the data present in those sources, and sparseness of data.
