Suppose you work for a large company with millions of active clients. Sooner or later management asks: how many customers are we losing? Even a churn rate of 2% of the whole customer base per month can look terrible for a large company such as a bank. With 5 million active clients, that is 100 thousand clients. Imagine losing the population of a city every month.
In this article I would like to share my experience of forecasting and managing churn in retail banking with the help of machine learning algorithms. I hope that my experience may be useful for those trying to deal with churn in their work.
Disclaimer: my experience involved the machine learning approach, although there are also very effective common-sense tools for dealing with churn. They can be good complements to, and sometimes even substitutes for, the methods discussed in this article.
Churn management via predicting a share of clients highly inclined to leave
Several times in my career I have been given a task to identify client groups highly inclined to churn with the help of machine learning algorithms. The next step was supposed to be stopping those clients from leaving. So, the setup looks like this:
First we build a model to predict churn (collect the data -> build a predictive model -> test its quality). In my experience, it is possible to build a model with a ROC-AUC of around 75% (or a Gini of 50%) for such a task. The distribution of our clients by their predicted probability of churning may look similar to the one in the following chart:
Although we would like to see clients with a churn probability of 90% or more, this typically does not happen in practice. And even if some clients do have such a high probability of churning, then:
- Either they have already left or almost left and it is too late to bring them back
- Or there are so few of them that it is not reasonable to do anything to stop them
But OK, what if at least 1% of our clients have a 30% probability of churning (which is pretty high compared with the average churn rate of 2%)? Unfortunately, my experience suggests that it is hard to stop them from leaving…
Here is why: the forecast and the model itself say nothing about the reasons why clients leave. There are ML techniques for identifying the main drivers of predictions (e.g., local/global interpretation methods), but they typically show that the main factors predicting churn are something like decaying product usage, which is a symptom rather than a cause. Thus, it is very hard to derive any real reasons for losing our clients from machine learning models.
In my experience, such an approach works well only on very rare occasions. So, there are other things we need to do.
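A side note on the quality figures quoted earlier in this section: the Gini coefficient is just a rescaled ROC-AUC, Gini = 2 * AUC - 1, so an AUC of 75% corresponds to a Gini of 50%. The sketch below is a minimal illustration in plain Python, not the code used at the bank; the function names and the toy labels and scores are made up for the example.

```python
from itertools import product


def roc_auc(labels, scores):
    """ROC-AUC as the share of (churner, non-churner) pairs in which the
    churner gets the higher score; ties count as half a correct pair."""
    churners = [s for y, s in zip(labels, scores) if y == 1]
    stayers = [s for y, s in zip(labels, scores) if y == 0]
    if not churners or not stayers:
        raise ValueError("both classes must be present")
    correct = sum(
        1.0 if c > s else 0.5 if c == s else 0.0
        for c, s in product(churners, stayers)
    )
    return correct / (len(churners) * len(stayers))


def gini_from_auc(auc):
    """Rescale ROC-AUC from the [0.5, 1] range to the [0, 1] range."""
    return 2.0 * auc - 1.0


# Toy data (made up): 1 = client churned, score = predicted churn probability.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.30, 0.05, 0.02, 0.10, 0.01, 0.20, 0.03, 0.04]

auc = roc_auc(labels, scores)
print(f"ROC-AUC = {auc:.2f}, Gini = {gini_from_auc(auc):.2f}")
```

On this toy data 11 of the 15 churner/non-churner pairs are ranked correctly, so the AUC is about 0.73 and the Gini about 0.47. Note how low the scores themselves are: a model can rank clients reasonably well without ever assigning anyone a churn probability close to 90%.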
Tool-driven modeling
Here is another approach based on predicting churn with machine learning algorithms that performed far better in the bank where I worked several years ago.
The framework works like this: we start by looking for reasons why clients leave, then we try to find appropriate tools to deal with those reasons and then we finally use machine learning as a helper to find clients sensitive to the proposed tools.
Let us get into more detail:
- Formulate a hypothesis (for example, credit card interest rate is too high for some clients, and they look for better options)
- Do some research and look for data proof to support the hypothesis (for example, compare different tariffs for similar products and similar client profiles)
- Come up with a tool to stop clients from leaving (for example, setting a better interest rate)
- Apply the tool to a test group of clients (with proper A/B testing)
- Observe the clients’ behavior in the test group (in terms of churn)
- Having their behavior data, we can build an uplift model identifying clients sensitive to an interest rate decrease
The uplift model in our case predicts Prob(churn | better interest rate) - Prob(churn | no interest rate change). There are a variety of techniques for building such a model. For example, one of the simplest is to include clients both with and without an interest rate change in the training data set and use the interest rate change as one of the features; the uplift for a client is then the difference between the model's predictions with that feature switched on and off.
- After that, use the model to identify clients who are likely to change their churn behavior in response to an interest rate decrease
- Among those clients, pick those who stay profitable for the bank despite the interest rate decrease
- Launch a regular CRM campaign to stop such clients from leaving, based on interest rate decrease
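The last few steps above can be sketched in a few lines of Python. This is a deliberately tiny illustration under stated assumptions, not the bank's implementation: the "model" is just a table of observed churn rates per (segment, treatment) cell, standing in for a real classifier that takes the interest rate change as one of its features. The segment names, counts, margins and the uplift threshold are all invented for the example.

```python
from collections import defaultdict


def make_rows(segment, treated, n_clients, n_churned):
    """Expand aggregate counts into per-client rows (segment, treated, churned)."""
    return ([(segment, treated, 1)] * n_churned
            + [(segment, treated, 0)] * (n_clients - n_churned))


def fit_churn_table(rows):
    """'Train' the simplest possible model: the observed churn rate per
    (segment, treated) cell, with the treatment flag as an ordinary feature."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [churned, total]
    for segment, treated, churned in rows:
        counts[(segment, treated)][0] += churned
        counts[(segment, treated)][1] += 1
    return {cell: churned / total for cell, (churned, total) in counts.items()}


def predict_uplift(model, segment):
    """Prob(churn | better rate) - Prob(churn | no change); negative is good."""
    return model[(segment, 1)] - model[(segment, 0)]


# Made-up A/B test results: treated=1 means the client got the better rate.
ab_rows = (
    make_rows("carries_balance", 0, 100, 10)
    + make_rows("carries_balance", 1, 100, 4)
    + make_rows("pays_in_full", 0, 100, 3)
    + make_rows("pays_in_full", 1, 100, 3)
)
model = fit_churn_table(ab_rows)

# Hypothetical scoring base: (client_id, segment, expected margin after the cut).
clients = [
    ("c1", "carries_balance", 120.0),
    ("c2", "carries_balance", -15.0),
    ("c3", "pays_in_full", 80.0),
]

UPLIFT_THRESHOLD = -0.02  # assumed cut-off: at least 2 points less churn

# Keep clients who are sensitive to the tool AND stay profitable after it.
campaign = [
    client_id
    for client_id, segment, margin_after_cut in clients
    if predict_uplift(model, segment) <= UPLIFT_THRESHOLD and margin_after_cut > 0
]
print(campaign)
```

Here only c1 ends up in the campaign: c2 is sensitive to the rate cut but would become unprofitable, and c3 belongs to a segment whose churn did not react to the better rate at all. With a real classifier the scoring trick stays the same: predict once with the treatment feature set to 1, once with it set to 0, and take the difference.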
There are more sophisticated setups in which we try to find the right size of the interest rate decrease for each individual client. Although they require more accurate mathematical modeling, they are applied in generally the same way.
This approach demonstrates far better results than the previous one. For example, in one of my previous jobs we managed to decrease the churn rate for some groups of clients by setting better interest rates for them.
Conclusion
In conclusion, first I would like to say that it is possible to predict customer churn with decent quality using modern forecasting techniques (including machine learning algorithms). But I would challenge the task of predicting churn in the first place. We do not want to just predict it; we want to manage it. And this implies a totally different approach: finding out why clients leave and then finding the right tools to stop them.
Unfortunately, machine learning predictions are not a magic pill - they can hardly replace the search for client retention tools. In my experience, the most valuable use of machine learning is uplift modeling: finding the right target groups among our customers for specific tools such as price discounts.
I would like to finish this article with one final point, last but not least: sometimes it is good to challenge even the task of churn management itself. Maybe it is better not to stop all clients from leaving, but to stop only good clients (for example, those who generate good profits for our business). Unprofitable clients may go to our competitors and "help" them improve their average churn rate – just let them go :)