CAPSTONE PROJECT

CUSTOMER CHURN PREDICTOR

Batch ID: 17

Batch Members
● Ashutosh Das (8249198088)
● Suchismita Sahoo (7853937873)
● Yoshobanta Bisoi (7008209672)
● Sunay Lahiri (8847856525)
● Eski Palai (9348694971)
MOTIVATION & OBJECTIVE
MOTIVATION:
● Cost of Customer Acquisition vs. Retention: Retaining existing customers is more cost-effective
than acquiring new ones, saving businesses significant resources.
● Business Impact of Churn: High churn rates lead to revenue losses, making proactive churn
management essential for business stability.
● Actionable Insights: Predictive models help identify at-risk customers, enabling targeted retention
strategies to reduce churn.
● Scalability Across Industries: Churn prediction can be applied across sectors like telecom,
banking, and e-commerce, benefiting diverse businesses.

OBJECTIVE:
● Predict Customer Churn: Develop a machine learning model to accurately classify
whether a customer is likely to churn or not.
● Analyze Key Factors: Identify the primary features contributing to customer churn for
actionable insights.
● Improve Business Retention Strategies: Provide businesses with data-driven
recommendations to reduce churn and enhance customer satisfaction.
● Model Performance Comparison: Evaluate and compare different machine learning
models to identify the most effective one for churn prediction.

Foundation For Innovation and Technology Transfer, IIT, Delhi
INTRODUCTION

What is Customer Churn?


Customer churn refers to the loss of customers who stop using a company's
product or service over a period of time.
Why is Churn Prediction Important?
● Helps businesses identify at-risk customers and take proactive retention
measures.
● Reduces the costs associated with acquiring new customers by retaining
existing ones.

Relevance Across Industries:


Churn prediction is crucial in industries like telecom, banking, subscription-based
services, and e-commerce, where customer retention directly impacts profitability.
Role of Machine Learning:
Advanced machine learning techniques enable businesses to predict churn with
high accuracy and uncover hidden patterns in customer behavior.

LITERATURE REVIEW

● Customer Churn Prediction Using Machine Learning


Studies show that machine learning algorithms like logistic regression, decision trees, and
support vector machines are effective for predicting customer churn. These methods identify key
factors influencing churn, such as billing issues, customer support quality, and service usage
patterns.
● Feature Engineering for Churn Prediction
Research highlights the importance of deriving meaningful features, such as customer tenure,
payment history, and usage frequency, to improve model performance. Feature selection
techniques such as mutual information, and dimensionality reduction techniques such as PCA
(Principal Component Analysis), are often employed to enhance accuracy.
● Deep Learning in Churn Prediction
Recent works explore deep learning models (e.g., neural networks) for handling large and
complex datasets. While they can yield better results on such data, they require more
computational resources and tuning than traditional models.
● Business Applications of Churn Models
Predictive churn models have been implemented in sectors such as telecommunications,
banking, and e-commerce to design targeted retention strategies, yielding measurable
improvements in customer loyalty and revenue growth.
● Challenges and Limitations
○ Class imbalance in churn datasets, where non-churn cases outnumber churn cases, can make
overall accuracy misleading and lowers recall on the churn class.
○ Overfitting can occur due to limited data or excessively complex models, necessitating
proper validation techniques.
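
The class imbalance challenge above can be illustrated with a minimal sketch. It uses synthetic data from scikit-learn rather than the project dataset; the 73/27 class ratio, the choice of logistic regression, and the `class_weight="balanced"` remedy are assumptions made for illustration, not the method used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a churn dataset: roughly 27% positives (assumed ratio).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.73, 0.27], random_state=0
)

# stratify=y keeps the churn ratio (nearly) identical in the train and test splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Same model twice: once unweighted, once with classes reweighted inversely
# to their frequency so that churn errors cost more during training.
plain = LogisticRegression(max_iter=1000).fit(X_train, y_train)
weighted = LogisticRegression(max_iter=1000, class_weight="balanced").fit(
    X_train, y_train
)

recall_plain = recall_score(y_test, plain.predict(X_test))
recall_weighted = recall_score(y_test, weighted.predict(X_test))
print(f"churn recall, unweighted:              {recall_plain:.3f}")
print(f"churn recall, class_weight='balanced': {recall_weighted:.3f}")
```

Comparing recall on the churn class, rather than accuracy alone, shows whether a model is simply favoring the majority class.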

DATASET USED

Source: Kaggle
Purpose: Customer Churn Prediction
Size and Structure:

● Rows: 7,043
● Columns: 21

Key Features:

● Demographics: Gender, SeniorCitizen


● Account Info: CustomerID, Tenure, Contract, PaperlessBilling, PaymentMethod
● Service Usage: MonthlyCharges, TotalCharges, InternetService, OnlineSecurity, OnlineBackup,
DeviceProtection, TechSupport, StreamingTV, StreamingMovies
● Target Variable: Churn (Yes/No)

Data Preprocessing:

● Missing Values: Imputed TotalCharges missing values with 0


● Encoding: One-hot encoded categorical features
● Scaling: Min-Max scaling applied to Tenure and MonthlyCharges

ARCHITECTURE DIAGRAM

METHODOLOGY

Data Preprocessing:

● Handling Missing Values:


Imputed missing values in the TotalCharges column with 0, as it represents total
charges which might be missing for customers who have just signed up.
● Encoding Categorical Variables:
Applied one-hot encoding to categorical features like Contract, PaymentMethod,
InternetService, etc., to convert them into numerical format suitable for machine
learning algorithms.
● Feature Scaling:
Used Min-Max scaling on numerical features like Tenure and MonthlyCharges to
bring them within the same scale and improve model performance.
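
The three preprocessing steps above might look like the following sketch in pandas and scikit-learn. The four rows are a toy sample shaped like the dataset, and the blank `TotalCharges` entry is an assumed representation of a missing value for a newly signed-up customer.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy rows shaped like the dataset; the blank TotalCharges mimics a new customer.
df = pd.DataFrame({
    "Tenure": [0, 34, 2, 45],
    "Contract": ["Two year", "One year", "Month-to-month", "One year"],
    "PaymentMethod": ["Mailed check", "Mailed check", "Electronic check",
                      "Bank transfer (automatic)"],
    "MonthlyCharges": [52.55, 56.95, 53.85, 42.30],
    "TotalCharges": [" ", "1889.5", "108.15", "1840.75"],
    "Churn": ["No", "No", "Yes", "No"],
})

# 1. Missing values: coerce non-numeric entries to NaN, then impute with 0.
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce").fillna(0)

# 2. Encoding: one-hot encode the categorical columns.
features = pd.get_dummies(
    df.drop(columns="Churn"), columns=["Contract", "PaymentMethod"], dtype=int
)

# 3. Scaling: bring Tenure and MonthlyCharges into the [0, 1] range.
scale_cols = ["Tenure", "MonthlyCharges"]
features[scale_cols] = MinMaxScaler().fit_transform(features[scale_cols])

# Binary target: 1 for churn, 0 otherwise.
target = (df["Churn"] == "Yes").astype(int)
print(features.head())
```

In a full pipeline the scaler would be fitted on the training split only and then applied to the test split, so that test data does not influence the scaling range.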

Exploratory Data Analysis (EDA):

● Visualized distributions of key variables (e.g., churn vs. non-churn, monthly charges,
customer tenure) to understand patterns and relationships.
● Examined correlations between features and the target to identify which factors are most
strongly associated with customer churn.
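
As a rough illustration of these EDA steps, the sketch below computes a churn rate per category and the correlation of numeric features with churn. The eight rows are hypothetical values chosen for the example, not figures from the project dataset, and `ChurnFlag` is a helper column introduced here.

```python
import pandas as pd

# Toy sample (hypothetical values, not the real dataset).
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year", "Two year",
                 "Month-to-month", "One year", "Two year", "Month-to-month"],
    "Tenure": [1, 2, 34, 60, 5, 24, 72, 3],
    "MonthlyCharges": [29.85, 70.70, 56.95, 90.00, 80.50, 45.00, 25.00, 99.00],
    "Churn": ["No", "Yes", "No", "No", "Yes", "No", "No", "Yes"],
})
df["ChurnFlag"] = (df["Churn"] == "Yes").astype(int)

# Churn rate per contract type: the mean of a 0/1 flag is the share of churners.
churn_by_contract = df.groupby("Contract")["ChurnFlag"].mean()
print(churn_by_contract)

# Pearson correlation of each numeric feature with the churn flag.
corr_with_churn = (
    df[["Tenure", "MonthlyCharges", "ChurnFlag"]].corr()["ChurnFlag"].drop("ChurnFlag")
)
print(corr_with_churn)
```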

Model Selection:

● Applied multiple machine learning models, including:


○ Logistic Regression
○ Decision Tree Classifier
○ Random Forest Classifier
● Each model was trained using the training set and evaluated using the test set.
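
A minimal sketch of this train-and-compare loop is shown below. It runs on synthetic data from `make_classification`; the 70/30 split, the tree depth, and the other hyperparameters are assumptions for illustration, since the slides do not state the settings used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed churn features (assumed shape and ratio).
X, y = make_classification(
    n_samples=1000, n_features=12, n_informative=5,
    weights=[0.73, 0.27], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# The three model families listed above, with illustrative hyperparameters.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model on the training set and score it on the held-out test set.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {test_accuracy[name]:.3f}")
```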

Model Evaluation:

● Evaluated models using performance metrics like Accuracy, Precision, Recall, F1-Score,
and ROC-AUC to select the best-performing model.
● Used cross-validation to detect overfitting and to estimate how well the model generalizes to
unseen data.
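
The evaluation step could be sketched as follows, again on synthetic data. The 5-fold stratified split, the ROC-AUC scoring for cross-validation, and the random forest settings are assumptions, not the configuration reported in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

X, y = make_classification(
    n_samples=1000, n_features=12, n_informative=5,
    weights=[0.73, 0.27], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)

# 5-fold stratified cross-validation on the training set only.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_auc = cross_val_score(model, X_train, y_train, cv=cv, scoring="roc_auc")
print(f"CV ROC-AUC: {cv_auc.mean():.3f} +/- {cv_auc.std():.3f}")

# Final fit, then the metrics named above on the held-out test set.
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # probability of the churn class

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A large gap between the cross-validated score and the test score would be a sign of overfitting or of an unrepresentative split.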

Prediction:

● The best-performing model was used to predict customer churn on new, unseen data.
Results were presented using a confusion matrix to evaluate classification accuracy.

RESULT & DISCUSSIONS

About 75% of customers with a month-to-month contract churned, compared with 13% of customers
with a one-year contract and 3% with a two-year contract.

Most customers who churned used electronic check as their payment method.

Customers who paid by credit card (automatic), bank transfer (automatic), or mailed check
were less likely to churn.

Many customers choose the fiber optic service, and customers who use fiber optic have a high
churn rate. This might suggest dissatisfaction with this type of internet service.

Customers with DSL service are in the majority and have a lower churn rate than fiber optic
customers.

SYSTEM SPECIFICATION
HARDWARE SPECIFICATION:
● RAM: 16 GB
● Hard disk: 512 GB
● Processor: Ryzen 5 5500U

SOFTWARE SPECIFICATION:
● Software tools: Python 3.10; Jupyter Notebook or an IDE (e.g., VSCode, PyCharm) for execution
● Programming language: Python

LIBRARIES USED:
● Pandas - For data manipulation and analysis
● NumPy - For numerical computations
● Matplotlib - For basic data visualizations
● Seaborn - For advanced and aesthetically pleasing plots
● Missingno - For visualizing missing data
● Plotly - For interactive and dynamic visualizations
● Scikit-learn - For preprocessing, machine learning models, and evaluation metrics
CONCLUSION
Model Performance:

● The machine learning models were evaluated using key metrics such as accuracy and the confusion matrix.
● The most accurate model successfully identified a majority of churn and non-churn customers, as
highlighted in the confusion matrix.

Key Observations:

● Non-Churn Prediction: The model correctly predicted 1,400 out of 1,549 actual non-churn values.
● Churn Prediction: For churn customers, 324 out of 561 were accurately identified.
● Challenges: The model misclassified 237 churn customers as non-churn, a weakness that could be
reduced with advanced techniques or additional data.
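
The counts above are enough to reconstruct the full confusion matrix and derive the standard metrics from it. The sketch below does only that arithmetic; the variable names are illustrative. It gives an accuracy of about 81.7%, while recall on the churn class is only about 57.8%.

```python
# Counts reported above: 1,549 actual non-churn and 561 actual churn customers.
tn = 1400            # non-churn correctly predicted as non-churn
fp = 1549 - 1400     # non-churn wrongly flagged as churn (149)
tp = 324             # churn correctly identified
fn = 561 - 324       # churn missed by the model (237)

total = tn + fp + tp + fn            # 2,110 customers in the evaluated set

accuracy = (tp + tn) / total         # share of all predictions that are correct
recall = tp / (tp + fn)              # share of actual churners that were caught
precision = tp / (tp + fp)           # share of churn predictions that were right
specificity = tn / (tn + fp)         # share of non-churners correctly cleared
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy:    {accuracy:.3f}")
print(f"recall:      {recall:.3f}")
print(f"precision:   {precision:.3f}")
print(f"specificity: {specificity:.3f}")
print(f"f1:          {f1:.3f}")
```

The gap between accuracy and churn recall reflects the class imbalance: the non-churn class dominates the accuracy figure.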

Business Implications:

● Customer Retention: By identifying at-risk customers, companies can design targeted retention
strategies, such as improving customer service or offering personalized incentives.
● Proactive Measures: Analyzing reasons for customer dissatisfaction can help businesses address
churn factors early and enhance customer satisfaction.
● Feedback and Loyalty: Surveys and loyalty programs can build stronger customer relationships and
reduce churn rates over time.

Future Scope:

● Explore deep learning techniques or ensemble models to further improve prediction accuracy.
● Perform feature engineering to capture additional factors influencing customer behavior.
SNAPSHOT

Data Snapshot:

CustomerID | Gender | SeniorCitizen | Tenure | Contract       | PaperlessBilling | PaymentMethod             | MonthlyCharges | TotalCharges | Churn
7590-VHVEG | Female | 0             | 1      | Month-to-month | Yes              | Electronic check          | 29.85          | 29.85        | No
5575-GNVDE | Male   | 0             | 34     | One year       | No               | Mailed check              | 56.95          | 1889.5       | No
3668-QPYBK | Male   | 0             | 2      | Month-to-month | Yes              | Mailed check              | 53.85          | 108.15       | Yes
7795-CFOCW | Male   | 0             | 45     | One year       | No               | Bank transfer (automatic) | 42.3           | 1840.75      | No
9237-HQITU | Female | 0             | 2      | Month-to-month | Yes              | Electronic check          | 70.7           | 151.65       | Yes

