Customer Churn Prediction Using Machine Learning
Himanshu Tripathi
Reg.No – 22CBBBA014
BBA in Business Analytics
CMR University

Executive Summary
Customer churn represents a significant issue for businesses, especially
in highly competitive industries like telecommunications, SaaS, and
retail. Predicting customer churn allows companies to proactively
implement retention strategies that help maintain revenue, reduce
acquisition costs, and build long-term customer relationships.

In this capstone project, a machine learning-based approach is
implemented to predict whether a customer is likely to churn based on
historical behavioral and demographic data. The analysis uses the Telco
Customer Churn dataset, which includes a variety of features such as
service usage, contract type, tenure, payment methods, and more.

The methodology involves data cleaning, exploratory data analysis
(EDA), preprocessing, and modeling using Logistic Regression. The
model's performance is evaluated using accuracy, precision, recall,
F1-score, and a confusion matrix. Insights derived from this analysis are
then translated into actionable business strategies.

Introduction
Customer churn refers to the loss of customers or subscribers, which
directly impacts the revenue and growth of a business. In today’s data-
driven world, companies are increasingly relying on predictive analytics
to anticipate and prevent churn.

The objective of this project is to create a predictive model using
machine learning to identify customers who are at risk of leaving. This
will help businesses act in advance by targeting those customers with
specific offers, improving services, or addressing potential concerns.

Predicting churn is not only cost-effective but also enhances customer
satisfaction by showing proactive engagement from the company.

Problem Statement
The goal is to predict customer churn using historical data provided by
a telecommunications company. The company wants to:

- Identify customers who are likely to stop using their services.
- Understand the key drivers of churn.
- Develop actionable strategies for customer retention.

The dataset consists of over 7,000 customer records and 21 columns,
including demographic information, account details, service usage
metrics, and the churn label.

Data Overview
Dataset Source:
- Dataset Name: Telco Customer Churn
- Source: Kaggle
- Link: https://www.kaggle.com/datasets/blastchar/telco-customer-churn

Structure:
- Number of Rows: 7,043
- Number of Columns: 21
- Target Variable: Churn (Yes/No)

Sample Features:
- gender: Male, Female
- SeniorCitizen: 0 (No), 1 (Yes)
- Partner: Yes/No
- Dependents: Yes/No
- tenure: Number of months the customer has stayed
- PhoneService: Yes/No
- InternetService: DSL, Fiber optic, No
- Contract: Month-to-month, One year, Two year
- PaymentMethod: Electronic check, Mailed check, etc.
- MonthlyCharges and TotalCharges: Financial details

Data Cleaning & Preprocessing

Data Type Corrections:
- TotalCharges was identified as an object type and converted to
numeric.

Handling Missing Values:
- Rows with null values in TotalCharges, which appeared after the
conversion, were removed.

Encoding Categorical Variables:
- Target variable Churn was encoded: Yes = 1, No = 0
- Other categorical columns were one-hot encoded using
get_dummies().

Final Dataset Shape:
- After preprocessing, the dataset had no null values and all features
were numeric.
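
The cleaning steps above can be sketched on a small made-up frame. The column names follow the Kaggle file, but the three rows are invented for illustration. Dropping customerID and passing drop_first=True are assumptions made for this sketch, not steps the report states.

```python
import pandas as pd

# Three invented rows; the blank TotalCharges string forces an object dtype.
toy = pd.DataFrame({
    "customerID": ["A1", "B2", "C3"],
    "Contract": ["Month-to-month", "Two year", "One year"],
    "TotalCharges": ["29.85", " ", "1889.5"],
    "Churn": ["Yes", "No", "No"],
})

# Strings that cannot be parsed become NaN instead of raising an error.
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")
toy = toy.dropna(subset=["TotalCharges"]).copy()

# Encode the target, drop the identifier, then one-hot encode what is left.
toy["Churn"] = toy["Churn"].map({"Yes": 1, "No": 0})
toy = toy.drop(columns=["customerID"])
encoded = pd.get_dummies(toy, drop_first=True)

print(encoded.shape)
print(encoded.dtypes)
```

The blank entry is what makes the column load as text in the first place, which is why the nulls only show up after the conversion.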

Exploratory Data Analysis (EDA)

Churn Distribution:
- Around 26.5% of customers churned.
- Class imbalance is moderate and manageable.
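
A quick arithmetic check puts this imbalance in context: a model that always predicts "no churn" is already right for every non-churner, so any reported accuracy should be read against that baseline. The sketch below uses only the rounded 26.5% rate quoted above.

```python
# Rounded churn share reported in the EDA.
churn_rate = 0.265

# Always predicting the majority class ("No") is correct for every non-churner.
baseline_accuracy = 1 - churn_rate

# Number of non-churners for each churner, a rough measure of the imbalance.
imbalance_ratio = (1 - churn_rate) / churn_rate

print(f"Majority-class baseline accuracy: {baseline_accuracy:.1%}")
print(f"Non-churners per churner: {imbalance_ratio:.2f}")
```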

Churn by Contract Type:
- Customers with Month-to-month contracts had the highest churn rate.

Churn by Tenure:
- Customers with low tenure (0-12 months) showed a high churn
tendency.
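
Group-level churn rates like the contract and tenure comparisons above can be computed with a groupby. This is a minimal sketch on invented records, assuming Churn is already encoded as 0/1; the upper tenure bound of 72 months is also an assumption.

```python
import pandas as pd

# Invented records for illustration only.
toy = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "Month-to-month",
                 "One year", "One year", "Two year"],
    "tenure": [2, 5, 30, 14, 40, 60],
    "Churn": [1, 1, 0, 1, 0, 0],
})

# The mean of a 0/1 column is the share of churners in each group.
rate_by_contract = toy.groupby("Contract")["Churn"].mean()

# Band tenure into the 0-12 month window and everything longer.
toy["tenure_band"] = pd.cut(
    toy["tenure"], bins=[0, 12, 72], labels=["0-12", "13-72"], include_lowest=True
)
rate_by_band = toy.groupby("tenure_band", observed=True)["Churn"].mean()

print(rate_by_contract)
print(rate_by_band)
```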

Churn by Monthly Charges:
- Customers with higher monthly charges were more likely to churn.

Correlation Matrix:
- A heatmap was generated to understand feature correlations.
- Tenure was negatively correlated with churn.
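
The sign of a feature's correlation with churn can be read straight from a pandas correlation matrix. The values below are invented so that short-tenure customers churn; they are not taken from the Telco data.

```python
import pandas as pd

# Invented values: short-tenure customers churn, long-tenure customers stay.
toy = pd.DataFrame({
    "tenure": [1, 3, 6, 24, 48, 70],
    "MonthlyCharges": [95.0, 80.5, 70.0, 65.0, 89.9, 20.0],
    "Churn": [1, 1, 1, 0, 0, 0],
})

# Pearson correlation of each numeric feature with the 0/1 churn flag.
corr_with_churn = toy.corr()["Churn"].drop("Churn").sort_values()
print(corr_with_churn)
```

Passing the full toy.corr() matrix to seaborn.heatmap would give a heatmap of the kind the report describes.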

Model Building
Model Chosen:
- Logistic Regression: Chosen for its simplicity and interpretability.

Train-Test Split:
- 80% training, 20% testing

Model Training:
The logistic regression model was trained using scikit-learn with
max_iter set to 1000 for better convergence.
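
Raising max_iter is one way to reach convergence. A common alternative, shown here as a hedged sketch and not as the report's method, is to standardise the features so the solver needs fewer iterations. The data are synthetic and the variable names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn features (not the Telco data).
rng = np.random.default_rng(42)
tenure = rng.integers(0, 73, size=400)
monthly = rng.uniform(20, 120, size=400)
# The product column mimics a large-scale feature such as TotalCharges.
X = np.column_stack([tenure, monthly, tenure * monthly])

# Churn is made more likely for short tenure and high monthly charges.
logit = 0.03 * (monthly - 70) - 0.06 * (tenure - 30)
y = (rng.random(400) < 1 / (1 + np.exp(-logit))).astype(int)

# The scaler is learned only from the data passed to fit().
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X, y)

# n_iter_ reports how many solver iterations were actually used.
print(clf.named_steps["logisticregression"].n_iter_)
```

Keeping the scaler inside the pipeline means that fitting on a training split never sees the test rows.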

Model Prediction:
After training, predictions were made on the test dataset to evaluate
performance.

Model Evaluation
Confusion Matrix:
- Shows True Positives, True Negatives, False Positives, and False
Negatives.

Classification Report:
- Accuracy: ~80%
- Precision: Indicates correctness of positive predictions
- Recall: Indicates coverage of actual positives
- F1-Score: Balance between precision and recall
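
The metrics above all follow from the four confusion-matrix cells. The sketch below works them out by hand; the counts are invented for illustration and are not the project's results.

```python
# Hypothetical confusion-matrix counts (not the project's actual results).
tp, fp, fn, tn = 60, 20, 40, 280

accuracy = (tp + tn) / (tp + fp + fn + tn)          # share of all predictions that are right
precision = tp / (tp + fp)                          # of predicted churners, how many churned
recall = tp / (tp + fn)                             # of actual churners, how many were caught
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

With these made-up counts, accuracy is 85% while recall is only 60%, which illustrates how a high accuracy can coexist with many missed churners.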

The Logistic Regression model provided a good baseline. For future
improvement, models like Random Forest, XGBoost, or other ensemble
methods can be tested.

Insights & Business Recommendations
Key Insights:
- Customers on month-to-month contracts are more likely to churn.
- High monthly charges and lower tenure also indicate higher churn
risk.
- Customers without internet service or tech support showed reduced
engagement.
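
One way drivers like these might be checked against a fitted logistic regression is to convert its coefficients to odds ratios. This is a sketch on synthetic data with hypothetical feature names; it does not reproduce the report's model or its coefficients.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic data in which month-to-month contracts and short tenure raise churn risk.
rng = np.random.default_rng(0)
n = 2000
X = pd.DataFrame({
    "tenure_years": rng.uniform(0, 6, size=n),
    "month_to_month": rng.integers(0, 2, size=n),
})
logit = -1.0 - 0.5 * X["tenure_years"] + 1.5 * X["month_to_month"]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

# exp(coefficient) is the multiplicative change in churn odds per unit increase.
odds_ratios = pd.Series(np.exp(model.coef_[0]), index=X.columns)
print(odds_ratios.sort_values(ascending=False))
```

An odds ratio above 1 points to higher churn odds as the feature increases, and a value below 1 points to lower odds.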

Business Recommendations:
1. Incentivize Long-Term Contracts: Offer discounts for switching to
yearly plans.
2. Targeted Retention Campaigns: Focus on customers with low tenure
and high charges.
3. Improve Customer Support: Ensure tech support is prompt and
helpful.
4. Bundles & Loyalty Programs: Offer bundle discounts on internet +
phone service.

Conclusion
Customer churn is a critical metric for business success. Through this
project, a predictive machine learning model was successfully built to
identify customers likely to churn.

With proper data preparation, feature engineering, and model training,
the logistic regression model achieved an accuracy of approximately
80% on the test set.

Future work will include testing with more advanced algorithms,
integrating additional customer feedback data, and deploying the model
into a production environment for real-time monitoring.

References
1. Telco Customer Churn Dataset -
https://www.kaggle.com/datasets/blastchar/telco-customer-churn
2. Scikit-learn Documentation - https://scikit-learn.org/
3. Python Data Analysis Library (Pandas) - https://pandas.pydata.org/
4. Seaborn Documentation - https://seaborn.pydata.org/
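5. Matplotlib Documentation - https://matplotlib.org/

Modeling Code Snippets
Data Preprocessing
import pandas as pd

# Load the dataset (file name assumed from the Kaggle download; adjust as needed)
df = pd.read_csv('WA_Fn-UseC_-Telco-Customer-Churn.csv')

# Convert TotalCharges to numeric; non-numeric entries become NaN
df['TotalCharges'] = pd.to_numeric(df['TotalCharges'], errors='coerce')
df.dropna(inplace=True)

# Drop the customerID identifier, if present, so it is not one-hot encoded
df = df.drop(columns=['customerID'], errors='ignore')

# Encode Churn
df['Churn'] = df['Churn'].map({'Yes': 1, 'No': 0})

# One-hot encode categorical variables
df = pd.get_dummies(df)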

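Train-Test Split
from sklearn.model_selection import train_test_split

# Separate the features from the target
X = df.drop('Churn', axis=1)
y = df['Churn']

# Hold out 20% of rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)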

Logistic Regression Model
from sklearn.linear_model import LogisticRegression

# max_iter is raised from the default of 100 to help the solver converge
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

Evaluation Metrics
from sklearn.metrics import classification_report, confusion_matrix

# Rows are actual classes and columns are predicted classes
print(confusion_matrix(y_test, y_pred))
# Per-class precision, recall, and F1-score, plus overall accuracy
print(classification_report(y_test, y_pred))

Visualizations
These charts illustrate the distribution and relationship of key features with churn.

[Figure: Customer Churn Distribution]
[Figure: Churn Rate by Contract Type]
[Figure: Monthly Charges vs Churn]
