Phase-2 Ibrahim
1. Problem Statement
2. Project Objectives
• Build machine learning models that can classify whether a customer will
churn.
Data Collection
Feature Engineering
4. Data Description
• Missing Values:
• Duplicate Records:
• Outliers:
• Categorical Encoding:
• Feature Scaling:
• Univariate Analysis:
• Bivariate/Multivariate Analysis:
• Insights Summary:
2-year) have lower churn rates
o Services like tech support and online
7. Feature Engineering
New Features Created:
• TenureGroup: Categorized tenure into "0–12", "12–24", etc.
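A minimal sketch of how a TenureGroup feature like this could be derived with pandas. The column name `tenure`, the sample values, and the bin edges beyond the two groups named above are assumptions for illustration, not the project's actual choices.

```python
import pandas as pd

# Hypothetical sample data; the real dataset and its column names are assumptions.
df = pd.DataFrame({"tenure": [1, 12, 13, 24, 30, 60, 72]})

# Only "0–12" and "12–24" are named in the text; the remaining edges are assumed.
bins = [0, 12, 24, 48, 72]
labels = ["0–12", "12–24", "24–48", "48–72"]

# Intervals are right-closed by default, so a tenure of exactly 12 falls in "0–12".
# include_lowest=True keeps a tenure of 0 inside the first bin.
df["TenureGroup"] = pd.cut(
    df["tenure"], bins=bins, labels=labels, include_lowest=True
)

print(df)
```

Because the group labels share their boundary values, it helps to state which side of each interval is closed; here the upper bound belongs to the lower group.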
Transformed Features:
Dimensionality Reduction:
Feature Selection:
8. Model Building
• Train/Test Split:
o 80% training, 20% testing; stratified on the target to preserve class
proportions in both splits
• Models Implemented:
o Logistic Regression: Baseline model
o Random Forest Classifier: Non-linear model for improved
performance
• Evaluation Metrics:
o Accuracy, Precision, Recall, F1-Score, ROC AUC
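The split, models, and metrics listed above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the project's actual code: the data is a synthetic stand-in generated with `make_classification`, and the hyperparameters (`max_iter`, `n_estimators`, `random_state`) are placeholder choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the churn data (assumption).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.7, 0.3], random_state=42
)

# 80% training, 20% testing; stratify=y keeps the class ratio in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Baseline linear model and a non-linear ensemble; hyperparameters are assumed.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # ROC AUC needs scores, so use the predicted probability of the positive class.
    y_prob = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "Accuracy": accuracy_score(y_test, y_pred),
        "Precision": precision_score(y_test, y_pred, zero_division=0),
        "Recall": recall_score(y_test, y_pred, zero_division=0),
        "F1-Score": f1_score(y_test, y_pred, zero_division=0),
        "ROC AUC": roc_auc_score(y_test, y_prob),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

On imbalanced churn data, accuracy alone can look good while recall on churners stays low, which is one reason to report all five metrics side by side for each model.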
Model    Accuracy    Precision    Recall    F1-Score    AUC
Used: