Data Mining

The document discusses predictive and prescriptive analytics, focusing on data mining as a method for identifying patterns in large datasets to inform business strategies. It outlines the CRISP-DM methodology for data mining projects, detailing phases from business understanding to model deployment. Additionally, it highlights successes and failures in data mining applications across various industries, emphasizing the skills needed for effective data mining.

Uploaded by

abhijaychauhan88

Unit 3

PREDICTIVE & PRESCRIPTIVE ANALYTICS


What is data mining?

Data mining is the process of searching and analyzing a large batch
of raw data in order to identify patterns and extract useful
information. Companies use data mining software to learn more about
their customers. It can help them to develop more effective
marketing strategies, increase sales, and decrease costs.
Data Mining Applications
Healthcare:
•Disease prediction and early diagnosis (e.g., cancer prediction)
•Patient clustering for personalized treatment
•Drug discovery
•Electronic Health Record (EHR) analysis
Telecommunications:
•Network optimization
•Customer churn prediction
•Fraud detection
•Service quality management
Marketing and Customer Relationship Management (CRM):
•Customer segmentation and targeting
•Predictive modeling for customer churn
•Market basket analysis (e.g., Amazon recommending products)
•Customer lifetime value prediction
Retail and E-commerce:
•Sales forecasting
•Inventory management
•Pricing strategy optimization
•Personalized recommendations
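Market basket analysis, listed above under marketing, can be illustrated with a small sketch. The transactions and item names below are invented for illustration and do not come from any real retailer. The code counts how often pairs of items appear in the same basket and computes two standard measures, support and confidence, for a candidate rule.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each set is one customer's basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= b) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Estimated share of baskets with the antecedent that also hold the consequent."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)

counts = pair_counts(transactions)
top_pair, top_count = counts.most_common(1)[0]
print("most frequent pair:", top_pair, top_count)
print("support(bread, milk):", support(transactions, {"bread", "milk"}))
print("confidence(butter -> bread):", confidence(transactions, {"butter"}, {"bread"}))
```

In this toy data every basket that contains butter also contains bread, so a recommender built on such rules might suggest bread to a customer who adds butter. Real systems work on far larger data and usually apply minimum support thresholds before considering a rule.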
Strategy for Data Mining: CRISP-DM
CRISP-DM (Cross-Industry Standard Process for Data Mining) is a popular and well-
established methodology used to guide the data mining process. It provides a structured,
iterative framework for planning and executing data mining projects. CRISP-DM is flexible
and can be applied to any industry or domain.
Phase                    Description
Business Understanding   Determine business objectives, assess the situation, and create a project plan
Data Understanding       Collect, describe, explore, and verify data quality
Data Preparation         Select, clean, construct, integrate, and format data
Modeling                 Select a modeling technique, generate a test design, build a model, and assess the model
Evaluation               Evaluate results, review the process, and determine next steps
Deployment               Plan deployment, plan monitoring and maintenance, produce a final report, and review the project
Stages in CRISP-DM
1. Business Understanding:
   1. The first step focuses on understanding the business objectives and requirements. It includes
      identifying the problem the organization is facing and formulating data mining goals to address
      that problem.
   2. Example activities: defining the project objectives, assessing the situation, defining success
      criteria.
2. Data Understanding:
   1. Once the business problem is understood, the next step involves collecting data from various
      sources and gaining a deep understanding of the data.
   2. Example activities: data collection, data description, exploring data, identifying data quality
      issues.
3. Data Preparation:
   1. This stage involves transforming raw data into a format that can be used for modeling. It is
      typically the most time-consuming step and includes data cleaning, integration, and
      transformation.
   2. Example activities: data selection, data cleaning, constructing new attributes, data
      transformation.
4. Modeling:
   1. In this phase, various modeling techniques are selected and applied to the data.
      Models are fine-tuned to improve accuracy and performance.
   2. Example activities: selecting modeling techniques, designing test cases, building
      models, tuning parameters.
5. Evaluation:
   1. After models are built, they are evaluated against the business objectives. This step
      determines if the model meets the desired success criteria and if it is ready for
      deployment.
   2. Example activities: evaluating model performance, reviewing process, determining
      the next steps.
6. Deployment:
   1. The final stage is deploying the model to a production environment, making it
      available for business use. This may involve generating reports, automating
      predictions, or integrating the model into a decision-making system.
   2. Example activities: deployment planning, monitoring, maintenance, creating final
      reports.
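The Data Preparation activities named above (data cleaning, constructing new attributes, data transformation) can be sketched in a few lines. This is a minimal illustration, not a prescribed CRISP-DM implementation: the customer records, the field names, and the choices of mean imputation and min-max scaling are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical raw customer records; None marks a missing value.
raw = [
    {"id": 1, "monthly_spend": 40.0, "months_active": 10},
    {"id": 2, "monthly_spend": None, "months_active": 5},
    {"id": 3, "monthly_spend": 80.0, "months_active": 20},
    {"id": 4, "monthly_spend": 60.0, "months_active": None},
]

def clean(records):
    """Data cleaning: drop rows without tenure, fill missing spend with the mean."""
    kept = [dict(r) for r in records if r["months_active"] is not None]
    known = [r["monthly_spend"] for r in kept if r["monthly_spend"] is not None]
    fill = mean(known)
    for r in kept:
        if r["monthly_spend"] is None:
            r["monthly_spend"] = fill
    return kept

def construct(records):
    """Attribute construction: derive total spend from two existing fields."""
    for r in records:
        r["total_spend"] = r["monthly_spend"] * r["months_active"]
    return records

def min_max_scale(records, field):
    """Transformation: rescale one numeric field to the range [0, 1]."""
    values = [r[field] for r in records]
    lo, hi = min(values), max(values)
    for r in records:
        r[field + "_scaled"] = (r[field] - lo) / (hi - lo) if hi > lo else 0.0
    return records

prepared = min_max_scale(construct(clean(raw)), "total_spend")
for row in prepared:
    print(row)
```

Each function corresponds to one activity, so the order of the calls mirrors the order in which the stage lists them. Whether to drop or impute a missing value is a judgment that depends on the business question, which is one reason this stage tends to take the most time.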

Life Cycle of a Data Mining Project


The data mining life cycle follows an iterative and cyclic pattern, closely aligning with the
CRISP-DM methodology. It can be broken down into several phases:
1. Problem Identification: Establishing the problem or business challenge.
2. Data Collection and Exploration: Gathering data and performing an exploratory analysis to understand patterns, correlations, and anomalies.
3. Data Preparation: Cleaning and pre-processing the data for the modeling phase.
4. Model Development: Selecting the appropriate data mining techniques and developing predictive, descriptive, or classification models.
5. Model Testing and Validation: Ensuring the model is accurate and reliable through testing on unseen data.
6. Model Deployment: Implementing the model in a real-world scenario, enabling decision-making based on data insights.
7. Model Monitoring and Maintenance: Ensuring the model continues to perform accurately over time and making adjustments if necessary.
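The testing and monitoring phases above can be made concrete with a small sketch. It assumes a toy one-feature dataset and a simple threshold model, both invented for illustration. The model is fitted on training data only, scored on held-out data it has never seen, and later compared against a recent batch to decide whether it needs attention. The 0.1 tolerance is an arbitrary example value.

```python
# Hypothetical labelled data: (feature value, label), where label 1 means "churned".
train = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
holdout = [(1.5, 0), (2.5, 0), (4.0, 1), (7.5, 1)]   # not used during fitting
recent = [(5.0, 0), (5.5, 0), (6.0, 0), (9.0, 1)]    # later data whose pattern has shifted

def predict(threshold, x):
    """Predict 1 when the feature exceeds the threshold."""
    return 1 if x > threshold else 0

def accuracy(threshold, data):
    """Share of examples whose prediction matches the label."""
    return sum(predict(threshold, x) == y for x, y in data) / len(data)

def fit_threshold(data):
    """Model development: try midpoints between neighbouring values, keep the best one."""
    xs = sorted(x for x, _ in data)
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return max(candidates, key=lambda t: accuracy(t, data))

def needs_attention(baseline_acc, recent_acc, tolerance=0.1):
    """Monitoring: flag the model when accuracy drops by more than `tolerance`."""
    return baseline_acc - recent_acc > tolerance

threshold = fit_threshold(train)
train_acc = accuracy(threshold, train)
holdout_acc = accuracy(threshold, holdout)
recent_acc = accuracy(threshold, recent)

print("threshold:", threshold)
print("train accuracy:", train_acc, "holdout accuracy:", holdout_acc)
print("recent accuracy:", recent_acc, "needs attention:", needs_attention(holdout_acc, recent_acc))
```

The holdout score is lower than the training score, which is why validation relies on unseen data and not on the data used for fitting. The drop on the recent batch is the kind of signal that would send the project back to an earlier phase of the cycle.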
Data-Mining Successes
Data mining has yielded many successful applications across various industries. Below are a
few notable examples:
1.Targeted Marketing: Companies like Amazon and Netflix use data mining to recommend
products and movies to their users, leading to higher customer satisfaction and increased sales.
2.Fraud Detection: Banks and financial institutions successfully use data mining techniques to
detect and prevent fraudulent transactions.
3.Healthcare: Data mining is used to predict patient outcomes, improve treatment plans, and
assist in diagnosing diseases like cancer.
4.Customer Relationship Management (CRM):
1. Success Story: Retail giants like Amazon use data mining to personalize
recommendations. By analyzing customer purchase history and browsing patterns,
Amazon can suggest products that users are more likely to buy. This recommendation
system has increased sales and customer retention.
5.Fraud Detection:
1. Success Story: Financial institutions, such as banks and credit card companies, use
data mining to detect fraudulent transactions. For example, Visa uses data mining
algorithms to analyze transactional data in real-time to flag potential fraud, saving
millions of dollars in losses.
6.Healthcare:
1. Success Story: Hospitals and healthcare providers utilize data mining to predict disease
outbreaks, personalize patient care, and improve diagnostics. For example, IBM Watson
Data-Mining Failures
Despite many successes, data mining has its pitfalls and notable failures, often due to
poor implementation or ethical concerns.
1.Overfitting: This occurs when a model is too complex and performs exceptionally
well on training data but poorly on new, unseen data. It fails to generalize, which can
result in inaccurate predictions in real-world applications.
2.Data Quality Issues: Poor quality data can lead to biased or incorrect models. For
example, missing or inaccurate data points could cause a model to make flawed
predictions.
3.Misalignment with Business Goals: Even if the model is technically sound, it can
fail if it doesn't align with the organization's business objectives. An example could be
focusing too much on precision without considering how predictions impact actual
business outcomes.
4.Google Flu Trends:
1. Failure Story: Google Flu Trends was a data-mining project designed to predict
flu outbreaks based on search queries. Initially promising, it eventually failed
because it overestimated flu prevalence by relying on search trends that weren't
always indicative of actual flu activity. This failure highlighted the importance of
combining data-driven insights with real-world data validation.
5.Facebook’s Emotion Manipulation Study:
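Returning to the overfitting item in the list above, the effect can be reproduced on a deliberately contrived toy dataset. Everything here is an assumption made for illustration: the data, the two mislabelled training points, and the hand-picked threshold of 5. A model that memorizes the training data (it copies the label of the nearest training point) is compared with a much simpler rule.

```python
# Toy data: the label is 1 when x > 5, except for two mislabelled (noisy) training points.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
test = [(1.2, 0), (2.6, 0), (3.4, 0), (6.6, 1), (7.4, 1), (8.8, 1)]

def memorizer(x):
    """Overly flexible model: copy the label of the nearest training point."""
    _, nearest_label = min(train, key=lambda p: abs(p[0] - x))
    return nearest_label

def simple_rule(x):
    """Simple model: one hand-picked threshold that ignores the noisy points."""
    return 1 if x > 5 else 0

def accuracy(model, data):
    """Share of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

for name, model in [("memorizer", memorizer), ("simple rule", simple_rule)]:
    print(name, "train:", accuracy(model, train), "test:", accuracy(model, test))
```

The memorizer is perfect on the training data because it has also learned the two noisy labels, and it then repeats those mistakes on nearby test points. The simple rule gets the noisy training points wrong but generalizes better. The test points were chosen to make the gap obvious; on real data the difference is usually smaller but follows the same pattern.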
Skills Needed for Data Mining
To succeed in data mining, professionals need a combination of technical, analytical, and
domain-specific skills:
1. Statistical and Mathematical Skills:
   1. Knowledge of algorithms: Understanding key data mining algorithms (e.g., decision trees, clustering, regression) and their statistical foundations.
   2. Probability and statistics: Essential for analyzing data, making predictions, and validating models.
2. Programming and Software Skills:
   1. Languages: Proficiency in programming languages such as Python, R, Java, or SQL is essential for data manipulation and implementing algorithms.
   2. Tools: Familiarity with data mining and machine learning tools such as scikit-learn, TensorFlow, Weka, SAS, or KNIME.
3. Database and Data Management Skills:
   1. SQL/NoSQL databases: Ability to work with databases to extract and manipulate large datasets.
   2. Big Data technologies: Experience with tools like Hadoop, Spark, or NoSQL databases (e.g., MongoDB, Cassandra) for working with large, unstructured datasets.
•Machine Learning and AI Knowledge:
   •Supervised and unsupervised learning: Understanding various machine learning paradigms, including clustering, classification, and regression.
   •Deep learning: Experience with neural networks and frameworks like PyTorch and Keras can be crucial for advanced data mining tasks.
•Domain Knowledge:
   •Business acumen: Understanding the business context is essential for translating data insights into actionable business strategies.
   •Specific industry knowledge: For instance, knowledge of healthcare, finance, or retail can help apply data mining insights more effectively.
•Critical Thinking and Problem-Solving:
   •Analytical mindset: Data miners must be able to think critically to identify patterns, trends, and anomalies in data, and to develop strategies based on insights.
•Communication Skills:
   •Data storytelling: The ability to convey insights effectively to non-technical stakeholders through visualization tools (e.g., Tableau, Power BI) and clear reporting.
