Churn prediction in the telecommunication industry using machine learning

Retaining customers is a major challenge for the telecom industry. Changing user behavior and the constant introduction of new services make it difficult to predict which customers will churn. At the same time, service providers find it more worthwhile to retain an existing customer than to pursue new subscribers. Churn prediction helps large companies reduce costs and identify why a subscriber is dissatisfied with their service. Data mining techniques support this kind of predictive analysis by producing models that classify churners and non-churners accurately.

Among the most recent techniques are ensemble learning methods, which combine several learners instead of relying on a single classifier in order to increase classification accuracy. In this thesis, we explore the use of ensemble learning techniques for customer churn prediction. We evaluate predictive performance, feature selection and classification techniques on one public and one private churn data set from the telecom industry. The proposed framework combines the bagging and stacking ensemble learning techniques with three base learners, namely Neural Network, K-Nearest Neighbors and Decision Tree. This produces a bagged-stacked Meta Decision Tree that correctly predicts 98% of the churned customers in the UCI data set and 90% of the churners in the private data set. The results show that the proposed framework is more efficient and more accurate than both state-of-the-art methods and simple ensemble techniques.
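
To make the structure of the framework concrete, the following is a minimal sketch of how bagging and stacking can be combined over the three base learners. It is an illustration under stated assumptions, not the implementation evaluated in this thesis: it assumes scikit-learn, uses synthetic data in place of the churn data sets, and substitutes an ordinary decision tree as the meta-learner because scikit-learn does not provide a Meta Decision Tree. All hyperparameters and the helper name `bagged` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, StackingClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for a churn table (assumption: roughly
# 15% of the rows belong to the positive "churner" class, label 1).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.85, 0.15], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def bagged(estimator):
    """Wrap a learner in a small bagging ensemble (hypothetical helper)."""
    return BaggingClassifier(estimator, n_estimators=5, random_state=0)


# Level 0: each base learner is bagged. The neural network and k-NN are
# scale-sensitive, so their inputs are standardized first.
base_learners = [
    ("nn", make_pipeline(
        StandardScaler(),
        bagged(MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                             random_state=0)),
    )),
    ("knn", make_pipeline(StandardScaler(),
                          bagged(KNeighborsClassifier(n_neighbors=5)))),
    ("tree", bagged(DecisionTreeClassifier(max_depth=5, random_state=0))),
]

# Level 1: a plain decision tree is trained on the cross-validated
# predictions of the bagged base learners (stand-in for a Meta Decision Tree).
model = StackingClassifier(
    estimators=base_learners,
    final_estimator=DecisionTreeClassifier(max_depth=3, random_state=0),
    cv=5,
)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
# Recall on the positive class: the share of actual churners that were found.
churn_recall = recall_score(y_test, predictions)
print(f"Share of churners identified: {churn_recall:.2f}")
```

The figure printed here is the recall on the positive class, which corresponds to the "percentage of churned customers predicted" reported above. On synthetic data it says nothing about the results obtained on the UCI or private data sets.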