Customers Leaving Your Organization? Deep Learning Can Help You Predict Churn
#Hands-on implementation in Python to predict customer churn
- What is Churn rate?
- Which Dataset are we using?
- Python implementation
What is Churn rate?
“The annual percentage rate at which customers stop subscribing to a service or employees leave a job.” — Google search result
Churn is defined slightly differently by each organization or product. Generally, the customers who stop using a product or service for a given period of time are referred to as churners.
Churners exist in every field. For example, when people leave an organization, the cost of replacing them can be quite large. Analyzing why and when employees are most likely to leave can lead to actions that improve retention and help plan new hiring in advance. This kind of analysis falls under HR analytics and people analytics.
For companies whose customers pay on a monthly basis, such as SaaS or other subscription-based businesses, churn rate is critically important.
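As a rough illustration of the definition above, churn over a period can be computed as the number of customers lost during the period divided by the number of customers at the start. The function name and the numbers below are made up for illustration, and each organization may define the exact formula differently.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return churn over a period as a fraction of the starting customer base."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


# Hypothetical numbers: 1,000 subscribers at the start of the month, 50 cancel.
monthly_churn = churn_rate(1000, 50)
print(f"Monthly churn: {monthly_churn:.1%}")  # Monthly churn: 5.0%
```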
What kind of Dataset are we Using?
The sample dataset we are going to use comes from a bank with a presence across Europe. It contains details of 10,000 customers.
It’s a sample dataset that I found on the web, from superdatascience.
But trust me, you can build and apply this model to any kind of organization. The dataset has multiple independent variables, such as credit score, salary and location, and one dependent variable (Yes/No, 1/0) that records whether a person leaves the bank or not. For any dataset with multiple independent variables and one binary outcome, you can implement this model.
Here we will only implement a very simple ANN, but a deeper analysis could tell us which variables in the dataset are crucial, for example whether a customer aged 18 is more likely to leave the bank than a customer aged 60.
These churn findings can help organizations solve many of the problems associated with customers leaving.
In this model, we will create a very simple artificial neural network (ANN) using deep learning. We will follow these steps:
- Importing the libraries & Dataset
- Encoding Categorical data
- Splitting the Dataset into Training & Test set, Feature Scaling
- Creating & Compiling ANN (Artificial Neural Network)
- Applying ANN to the training dataset.
- Predicting the outcome using ANN.
Importing the libraries & dataset
We will import all the important libraries and our sample dataset. After importing, we create two matrices, X and Y.
After looking at the dataset and its columns, we need to choose which columns go into the matrices of independent and dependent variables. Columns such as the customer’s surname and customer ID have no influence on whether a customer leaves the bank, so we exclude them from X (independent).
The matrix of features X holds the independent variables, and Y holds the outcome, whether a person leaves the bank or not, which is the column EXITED.
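The selection described above might look like the sketch below. The column names and the file name in the comment are assumptions about how the dataset is laid out, and the three rows are invented so that the snippet runs on its own.

```python
import pandas as pd

# Tiny made-up stand-in for the bank dataset so the snippet runs on its own.
# With the real file you would load it instead, for example:
#     dataset = pd.read_csv("Churn_Modelling.csv")  # file name is an assumption
dataset = pd.DataFrame({
    "RowNumber": [1, 2, 3],
    "CustomerId": [1001, 1002, 1003],
    "Surname": ["Smith", "Jones", "Brown"],
    "CreditScore": [619, 608, 502],
    "Geography": ["France", "Spain", "Germany"],
    "Gender": ["Female", "Female", "Male"],
    "Age": [42, 41, 39],
    "Exited": [1, 0, 1],
})

# Identifier columns carry no information about why a customer leaves.
id_columns = ["RowNumber", "CustomerId", "Surname"]

X = dataset.drop(columns=id_columns + ["Exited"])  # independent variables
y = dataset["Exited"]                              # dependent variable (1 = left)

print(list(X.columns))  # ['CreditScore', 'Geography', 'Gender', 'Age']
```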
Encoding Categorical data
Before splitting our data into training and test sets, we must encode the categorical variables. The model can’t work with text labels; it needs numbers.
For example, a customer’s gender is male/female (as per our dataset), so first we encode it as 0 and 1. Similarly, we encode the customer’s location as 0/1 columns. For this we use LabelEncoder and OneHotEncoder from sklearn.
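One way to do this encoding with a current scikit-learn is sketched below. The column names and the four rows are made up, and wrapping OneHotEncoder in a ColumnTransformer (and dropping one dummy column per category) is my choice here, not necessarily what the original code did.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Made-up rows; column names are assumptions about the dataset layout.
X = pd.DataFrame({
    "CreditScore": [619, 608, 502, 699],
    "Geography": ["France", "Spain", "France", "Germany"],
    "Gender": ["Female", "Female", "Male", "Male"],
    "Age": [42, 41, 42, 39],
})

# Two categories: LabelEncoder maps them alphabetically (Female -> 0, Male -> 1).
X["Gender"] = LabelEncoder().fit_transform(X["Gender"])

# Three categories: one-hot encode, dropping the first dummy column (France)
# so the remaining dummies are not redundant.
encoder = ColumnTransformer(
    transformers=[("geo", OneHotEncoder(drop="first"), ["Geography"])],
    remainder="passthrough",  # keep the other columns unchanged
    sparse_threshold=0,       # always return a dense NumPy array
)
X_encoded = encoder.fit_transform(X)

# Resulting columns: Germany, Spain, CreditScore, Gender, Age
print(X_encoded.shape)  # (4, 5)
```

With the dummy columns placed first, a customer from France is represented by two zeros, which is how a three-valued location ends up as two 0/1 columns.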
Splitting the Dataset into Training & Test set, Feature Scaling
For splitting the dataset we will use train_test_split from sklearn (scikit-learn). We train our ANN on the training set and then make predictions on the test set. Feature scaling is very important because it puts all the variables on the same scale, so that no variable dominates simply because its values are larger.
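A minimal sketch of the split and scaling step follows, using random numbers in place of the encoded bank data. The 20% test size matches the 2,000 test customers out of 10,000 mentioned later; the use of StandardScaler and the random_state value are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Random stand-in data: 100 customers, 11 features, binary outcome.
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(100, 11))
y = rng.integers(0, 2, size=100)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the scaler on the training set only, then reuse it on the test set,
# so no information from the test set leaks into training.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print(X_train.shape, X_test.shape)  # (80, 11) (20, 11)
```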
Creating ANN (Artificial Neural Network)
To create a deep learning ANN, we first create an input layer, a hidden layer and an output layer.
To create these layers we use the libraries we imported earlier: keras, keras.models and keras.layers. The Sequential class initializes the network, and Dense adds fully connected layers to it.
I have created an object named classifier of the Sequential class. In the second line we call a method on that object, classifier.add, which adds a layer to our ANN using the Dense function.
input_dim is the number of nodes in the input layer, that is, the number of input features. We have 11 independent variables, so we set input_dim to 11.
The hidden layer has 6 nodes, the average of the input and output nodes: (11 independent variables + 1 dependent variable) / 2 = 6. The activation functions for the hidden and output layers are ReLU and sigmoid respectively. If the outcome had more than two categories, we would use the softmax function instead.
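To make the activation functions mentioned above concrete, here is a small NumPy sketch of what each one computes. Keras provides these built in; the helper functions below are only for illustration.

```python
import numpy as np


def relu(z):
    """Keep positive values, replace negatives with 0."""
    return np.maximum(0.0, z)


def sigmoid(z):
    """Squash each value into the range (0, 1), usable as a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    """Turn a vector of scores into probabilities that sum to 1."""
    exp_z = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return exp_z / exp_z.sum()


z = np.array([-2.0, 0.0, 3.0])
print(relu(z))     # [0. 0. 3.]
print(sigmoid(z))  # every value lies between 0 and 1
print(softmax(z))  # values sum to 1
```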
In the last line, we compile our neural network. The optimizer is the algorithm used to find suitable weights for the network; here it is a form of stochastic gradient descent, and Adam is one such algorithm. Many loss functions are available, but when the outcome is binary we use binary_crossentropy.
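Putting the pieces above together, the network could be sketched as follows. This assumes the Keras bundled with TensorFlow and a single hidden layer; it declares the input size with keras.Input rather than the input_dim argument, which newer Keras versions discourage, and the accuracy metric is my addition.

```python
from tensorflow import keras

n_inputs = 11   # independent variables after encoding
n_outputs = 1   # one binary outcome
hidden_units = (n_inputs + n_outputs) // 2  # rule of thumb from the text: 6

classifier = keras.Sequential()
classifier.add(keras.Input(shape=(n_inputs,)))                          # input layer
classifier.add(keras.layers.Dense(hidden_units, activation="relu"))    # hidden layer
classifier.add(keras.layers.Dense(n_outputs, activation="sigmoid"))    # output layer

# Adam optimizer, binary cross-entropy loss for a 0/1 outcome.
classifier.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

classifier.summary()
```

The hidden layer has 11 x 6 weights plus 6 biases, and the output layer has 6 weights plus 1 bias, so the summary should report 79 trainable parameters.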
Applying ANN to Training dataset
The fit method is used to train the ANN on the dataset. X_train and Y_train are the training data we created in the first part.
There are two ways to update the weights: after each observation, or only after a batch of observations. We have chosen a batch size of 10, somewhat arbitrarily.
An epoch is one full pass of the whole training set through the ANN; training usually runs for several epochs.
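The training call might look like the sketch below. It rebuilds a small network so that it runs on its own and trains on random stand-in data with an invented labelling rule, so the numbers it produces say nothing about the bank dataset. The batch size of 10 and the 100 epochs come from the article; everything else is an assumption.

```python
import numpy as np
from tensorflow import keras

# Random stand-in for the scaled training data: 200 customers, 11 features.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 11)).astype("float32")
y_train = (X_train[:, 0] > 0).astype("float32")  # invented rule, illustration only

classifier = keras.Sequential()
classifier.add(keras.Input(shape=(11,)))
classifier.add(keras.layers.Dense(6, activation="relu"))
classifier.add(keras.layers.Dense(1, activation="sigmoid"))
classifier.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

batch_size = 10  # weights are updated after every 10 observations
epochs = 100     # the whole training set passes through the network 100 times

history = classifier.fit(
    X_train, y_train, batch_size=batch_size, epochs=epochs, verbose=0
)

updates_per_epoch = int(np.ceil(len(X_train) / batch_size))
print(updates_per_epoch)             # 20 weight updates per epoch
print(len(history.history["loss"]))  # 100, one loss value per epoch
```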
Predicting the outcome using ANN
We ran 100 epochs, and after the 100th epoch the accuracy of our model on the training set is 85%. However, we can improve the accuracy by tuning other parameters, which I have discussed in another article.
We will use a method called predict to create y_pred, the prediction for our X_test dataset. Once we have y_pred, we can compare it with the actual result, Y_test.
Our prediction y_pred gives, for each customer, the probability of leaving the bank. Comparing our outcome with the actual result, we can see that customers 1 and 5 left the bank, and according to our prediction customer 5 had an 82% probability of leaving.
We will create a confusion matrix to evaluate our performance on the test dataset. A confusion matrix works only with class labels (0/1), so we convert our predicted probabilities into 0/1 (False/True). I have used a threshold of 50%: above 50% is 1 (customer leaves the bank) and below 50% is 0 (stays with the bank).
The confusion matrix helps us count the number of correct and incorrect predictions.
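The thresholding and confusion matrix steps can be sketched as follows. The six probabilities stand in for the output of predict and, like the actual labels, are invented for illustration; they are not the article's results.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Invented example: actual outcomes and predicted probabilities for 6 customers.
y_test = np.array([1, 0, 0, 0, 1, 0])
y_prob = np.array([0.30, 0.10, 0.65, 0.05, 0.82, 0.20])

# Above 50% -> 1 (leaves the bank), otherwise 0 (stays with the bank).
y_pred = (y_prob > 0.5).astype(int)

# Rows are actual classes, columns are predicted classes:
# [[true negatives, false positives],
#  [false negatives, true positives]]
cm = confusion_matrix(y_test, y_pred)
print(cm)
# [[3 1]
#  [1 1]]

# Correct predictions sit on the diagonal.
accuracy = np.trace(cm) / cm.sum()
print(f"{accuracy:.2%}")  # 66.67%
```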
Amazing! Our model made 1547 + 136 = 1683 correct predictions and 317 incorrect predictions out of 2000 customers, an accuracy of about 84% on the test set. Parameter tuning can improve the accuracy, but we haven’t done that here.
Based on our predictions, the bank can separate customers with a high probability of leaving from those with a low probability of leaving.
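As a small sketch of that filtering step, the predicted probabilities can be attached to customer identifiers and sorted. The customer IDs, probabilities and column names below are hypothetical.

```python
import pandas as pd

# Hypothetical customers with their predicted probability of leaving.
predictions = pd.DataFrame({
    "customer_id": [101, 102, 103, 104, 105],
    "churn_probability": [0.30, 0.10, 0.65, 0.05, 0.82],
})

# Keep customers above the 50% threshold, most likely to leave first.
at_risk = (
    predictions[predictions["churn_probability"] > 0.5]
    .sort_values("churn_probability", ascending=False)
)

print(at_risk["customer_id"].tolist())  # [105, 103]
```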
Customer churn has a significant impact on your business because it lowers revenues and profits. With this model, you can give the business a list of the customers who are most likely to leave the organization.
Churn findings give us insight into how likely active customers are to discontinue their services, which indicators signal it, and what strategies can be adopted to improve retention.