Mini-Project - Churn Analysis
➢ PROBLEM STATEMENT:
Businesses must compete fiercely to win over new consumers. Since it directly
affects a company’s revenue, customer retention is a key topic for analysis, and early detection of
customer churn enables businesses to take proactive measures to keep customers. As a result, firms
practice a variety of approaches to identify at-risk clients early on through client retention
initiatives. Customer churn is sudden and problematic from a business perspective. A UK-based and
registered non-store online retail company mainly sells unique all-occasion gifts, and many of the
company’s customers are wholesalers. As such, a customer can decide to terminate their relationship at any
time. This makes it more difficult to intuitively understand when a customer might churn.
➢ GOAL
The desired outcome is to analyze consumers’ behaviors and predict what they might do in the
future.
➢ CUSTOMER CHURN
To the untrained eye, customer behavior is difficult to predict. After all, they are humans with
erratic whims and desires. However, to a machine that can compute thousands of things a second,
trends and patterns are increasingly obvious. Businesses aim to engage with customers in a way
that keeps them returning to the store repeatedly, generating revenue each time. However, it can be
challenging to determine which customers are likely to return and which have lost
interest in the goods or services being provided.
Customer Churn: A customer is considered to be churning if they are actively returning to the store,
whereas a churned customer is one who is no longer coming back for more.
Customer Churn Risk is the probability that a customer will disengage with the business.
Understanding how your customers behave is imperative to making the most of their
patronage. Today, we can leverage the volumes of data available to us to predict how likely
a customer is to continue engaging with your business. This kind of prediction can be valuable
to the business in several ways.
➢ STEP 1 : DATA CLEANING
The shape of the data before and after removing rows with a missing Customer_ID is (541909, 8) and
(406829, 8) respectively. There were also 311640 missing records in the Date column, and these were
removed. The only features required are Date, Customer_ID, transaction value and transaction
time. Basic information about these columns was inspected, and the Quantity column was cleaned.
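A minimal sketch of this cleaning step with pandas is shown below. The file name, encoding and column names (InvoiceDate, CustomerID, Quantity, UnitPrice, StockCode) follow the public UCI Online Retail dataset and are assumptions here; the report's own column names may differ.

```python
import pandas as pd

# Load the raw transactions (file name and encoding are assumptions).
df = pd.read_csv("OnlineRetail.csv", encoding="ISO-8859-1",
                 parse_dates=["InvoiceDate"])
print("Shape before cleaning:", df.shape)

# Drop rows with a missing customer identifier.
df = df.dropna(subset=["CustomerID"])
print("Shape after dropping missing CustomerID:", df.shape)

# Clean the Quantity column: keep only positive quantities and prices,
# which removes returns and obviously bad records.
df = df[(df["Quantity"] > 0) & (df["UnitPrice"] > 0)]

# Keep only the features needed downstream: date, customer and transaction value.
df["TransactionValue"] = df["Quantity"] * df["UnitPrice"]
transactions = df[["InvoiceDate", "CustomerID", "TransactionValue"]].copy()
```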
VISUALIZATION
The number of unique customers and products in the dataset is 2185 and 2881 respectively. A date
range for the analysis is then decided.
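A quick profiling sketch on the cleaned frame from above (column names remain assumptions):

```python
# Count unique customers and products, and inspect the date range.
print("Unique customers:", df["CustomerID"].nunique())
print("Unique products:", df["StockCode"].nunique())
print("Date range:", df["InvoiceDate"].min(), "to", df["InvoiceDate"].max())
```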
➢ STEP 2 : FEATURE ENGINEERING
RFM is a method of quantifying customers in a meaningful way and can serve as a good
baseline when it comes to performing any analytics on customer-specific transactional
data.
Recency, Frequency and Monetary value capture when the customer made their most
recent transaction, how often they have returned for business, and what the average sale
was for each customer. We can add to this any other available features (like
GrossMargin, Age or CostToRetain), other predicted features (like Lifetime Value), or the
output of Sentiment Analysis.
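A rough sketch of how these three features might be computed with pandas, reusing the hypothetical transactions frame and column names from the cleaning sketch above:

```python
import pandas as pd

def rfm_features(transactions: pd.DataFrame, reference_date: pd.Timestamp) -> pd.DataFrame:
    """Recency, Frequency and Monetary value per customer, computed from
    transactions that occurred strictly before `reference_date`."""
    observed = transactions[transactions["InvoiceDate"] < reference_date]
    return observed.groupby("CustomerID").agg(
        Recency=("InvoiceDate", lambda d: (reference_date - d.max()).days),
        Frequency=("InvoiceDate", "count"),
        Monetary=("TransactionValue", "mean"),
    )
```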
The way it works is that we can split the training data into an observed period and a
future period. If we want to predict how much a customer will spend in a year, we would
set the length of the future period as one year, and the rest would come under observed.
This allows us to fit a model to classify which customers engaged with the business in the
future period using features computed in the observed period. Here we introduce the
concept of the cut-off: this is simply where the observed period ends, and it defines the
date before which we calculate our features.
● Age: Time since first transaction. For this feature we simply find the number of
days since each customer's first transaction. Again, we need the cut-off to calculate the
time between the cut-off and the first transaction.
Ideally, this captures information about customer retention within a certain time
period; a sketch of how it fits together with the labels appears below.
For the labels we would just set 1 for those who bought something in the future period,
and 0 for everyone who didn’t.
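A sketch of how the cut-off, the Age feature and the labels could fit together, building on the hypothetical rfm_features helper above (the one-year future window is only an example, matching the spend-in-a-year scenario described earlier):

```python
def build_dataset(transactions, cutoff, future_days=365):
    """Features from the observed period (before `cutoff`), labels from the
    future period (`cutoff` up to `cutoff + future_days`). A sketch, not the
    report's exact implementation."""
    observed = transactions[transactions["InvoiceDate"] < cutoff]
    future_end = cutoff + pd.Timedelta(days=future_days)
    future = transactions[(transactions["InvoiceDate"] >= cutoff) &
                          (transactions["InvoiceDate"] < future_end)]

    features = rfm_features(transactions, cutoff)
    # Age: days between the cut-off and each customer's first transaction.
    first_purchase = observed.groupby("CustomerID")["InvoiceDate"].min()
    features["Age"] = (cutoff - first_purchase).dt.days

    # Label: 1 if the customer bought something in the future period, 0 otherwise.
    features["Label"] = features.index.isin(future["CustomerID"]).astype(int)
    return features
```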
Recursive RFM
Let us apply what we know of RFM so far and loop it through the dataset. Suppose the
data begins at the start of a year. We select a frequency (for example, one month) and
iterate through the dataset, computing our features from the observed period (o) and
generating our labels from the future period (f). The idea is to recursively compute these
features so that the model can learn how customers’ behavior changes over time.
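One way to sketch this recursive loop, using the hypothetical build_dataset helper above with a one-month step and a one-month future window (both illustrative choices, not the report's confirmed settings):

```python
# Slide a monthly cut-off across the date range and stack the resulting
# feature/label tables into a single training set. The first cut-off is
# placed a few months in, so every slice has some observed history.
start = transactions["InvoiceDate"].min() + pd.DateOffset(months=3)
end = transactions["InvoiceDate"].max() - pd.DateOffset(months=1)
cutoffs = pd.date_range(start, end, freq="MS")  # month-start cut-offs

dataset = pd.concat(
    [build_dataset(transactions, cutoff, future_days=30) for cutoff in cutoffs]
).reset_index()
```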
Now that we have generated our dataset, all that remains is to shuffle it and perform a
train/test split. We’ll use 80% for training and 20% for testing.
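A sketch of the split with scikit-learn (the column names and random_state are assumptions carried over from the sketches above):

```python
from sklearn.model_selection import train_test_split

# Shuffle and split the generated dataset: 80% training, 20% testing.
X = dataset.drop(columns=["CustomerID", "Label"])
y = dataset["Label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42
)
```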
Class Imbalance
In a classification task, sometimes the classes we want to predict are imbalanced in the
data set. For example, if there are 10 observations and two classes; 2 of them may be in
Class_0 and the other 8 are in Class_1. This could introduce bias into the model as it sees
significantly more of one class than the other. We define the minority class as the one
with fewer observations, and the majority class as the one with more observations. Our
dataset shows a similar imbalance, which we can address by oversampling the minority
class with SMOTE, as sketched below.
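Since SMOTE is the oversampling technique referred to later in this report, a minimal sketch with the imbalanced-learn library might look like this (applied to the hypothetical training split above):

```python
from imblearn.over_sampling import SMOTE

# Oversample the minority class in the training data only, so the test set
# keeps the original class distribution.
smote = SMOTE(random_state=42)
X_train_res, y_train_res = smote.fit_resample(X_train, y_train)
print("Before resampling:", y_train.value_counts().to_dict())
print("After resampling:", y_train_res.value_counts().to_dict())
```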
➢ STEP 3 : MODEL
For this example we will try a Random Forest Classifier, as it is plug-and-play in its
implementation and very easy to try straight away.
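A sketch of fitting the classifier on both the imbalanced and the SMOTE-resampled training data, so that their classification reports can be compared on the same held-out test set (the hyperparameters are scikit-learn defaults, not the report's tuned values):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# One forest trained on the imbalanced data, one on the SMOTE-resampled data.
rf_imbalanced = RandomForestClassifier(n_estimators=100, random_state=42)
rf_imbalanced.fit(X_train, y_train)

rf_resampled = RandomForestClassifier(n_estimators=100, random_state=42)
rf_resampled.fit(X_train_res, y_train_res)

print(classification_report(y_test, rf_imbalanced.predict(X_test)))
print(classification_report(y_test, rf_resampled.predict(X_test)))
```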
OUTPUT :
➢ RESULTS
It is interesting that both datasets produced very similar results; in fact, the oversampled
data performed worse than the imbalanced data. Here we can look at the classification
report to see how precise the predictions actually were.
Classification report for predictions on imbalanced data
Now that our model has been trained, we can use the predict_proba() function to get the
probabilities associated with each prediction. Here is a plot of the predicted probability
distribution. Remember, the probability predicted by the model is how likely a customer
is to engage with the business, and we are looking for the probability that they won’t, so
we can simply subtract each probability from 1.
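A sketch of turning the model's engagement probabilities into churn risk and plotting their distribution (matplotlib and the variable names from the earlier sketches are assumptions):

```python
import matplotlib.pyplot as plt

# predict_proba gives the probability of engaging (class 1);
# churn risk is its complement.
p_engage = rf_imbalanced.predict_proba(X_test)[:, 1]
churn_risk = 1 - p_engage

plt.hist(churn_risk, bins=50)
plt.xlabel("Churn risk")
plt.ylabel("Number of customers")
plt.title("Distribution of predicted churn risk")
plt.show()
```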
Histogram of probability distribution of churn risk among customers
As expected, most customers are on either end of the spectrum. However, the most
meaningful and actionable insights are found between them. Customers lying below 0.5
are at a low risk of disengaging, so this plot indicates that most customers are
healthy. On the other hand, those with a churn risk of over 0.5 are more likely to
disengage, and paying attention to their preferences is imperative to retaining them.
➢ Conclusion
Feature engineering techniques like Recursive RFM allow for rich features to describe
customers. As seen here, these features can be useful to analyze their behaviors and
predict what they might do in the future. We also covered how to handle class imbalance
if necessary using SMOTE. Churn Risk is just one of these predictable metrics. Others
include Customer Lifetime Value and Customer Segmentation. What is special about
Churn Risk is that it can be taken a step further to identify the probability that
customers will do something more specific, like buy a specific category of product, or
their likelihood of engaging on each day of the week. The potential of customer analytics is
far-reaching and ever-insightful, especially for businesses.