
Cell2Cell Churn Modeling

  • Writer: Evie Wei
  • Mar 11, 2021
  • 1 min read

In this project, my team developed a model to predict customer churn for Cell2Cell, a telecommunications service provider. The data is dated: it was collected when the telecommunications market was reaching a saturation point and competition was rapidly increasing. In that setting, it was of utmost importance to develop a model that identifies the most important drivers of churn.


Here is a brief overview of our methodology:


Dataset -

  • 71,047 observations and 78 variables.

  • “Calibration” data: 40,000 customers (Calibrate = 1)

  • “Validation” data: 31,047 customers (Calibrate = 0)
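
As a rough illustration of the split described above, the sketch below separates a table into calibration and validation sets using a 0/1 flag. It assumes pandas and a flag column literally named "Calibrate"; the toy table and its other column names are made up, and this is not the project's actual code.

```python
import pandas as pd


def split_by_calibration(df, flag_col="Calibrate"):
    """Return (calibration, validation) frames based on a 0/1 flag column."""
    calibration = df[df[flag_col] == 1].drop(columns=[flag_col])
    validation = df[df[flag_col] == 0].drop(columns=[flag_col])
    return calibration, validation


# Toy stand-in for the real table; the values are invented for illustration.
toy = pd.DataFrame({
    "customer": [1, 2, 3, 4, 5],
    "churn": [0, 1, 0, 1, 0],
    "Calibrate": [1, 1, 1, 0, 0],
})
calibration, validation = split_by_calibration(toy)
print(len(calibration), len(validation))
```

Keeping the validation rows untouched until the final evaluation is what makes the later lift and AUC numbers meaningful.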


Data Cleaning and Feature Selection (With Python/SQL):

  • Imputed missing values of continuous variables with the mean of the corresponding column. (This worked better than KNN imputation.)

  • Removed variables with high pairwise correlation (> 0.7)

  • Removed the customer ID and “csa”, the latter because of its high number of levels (n=744). (This worked better than recoding area codes into “US state” dummies.)

  • Used the SelectKBest feature selection method, which computes a score for each variable based on how strongly it relates to the target variable. By default it uses ANOVA F-scores and ranks the variables by those scores.
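
The mean imputation and correlation filter from the list above could look roughly like the following. This is a minimal sketch assuming pandas and NumPy; the function names and the toy columns are hypothetical, and using the absolute value of the correlation and dropping the later column of each pair are assumptions, since the post does not say which variable of a correlated pair was kept.

```python
import numpy as np
import pandas as pd


def impute_with_mean(df):
    """Fill missing values in numeric columns with that column's mean."""
    out = df.copy()
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].mean())
    return out


def drop_correlated(df, threshold=0.7):
    """Drop the later column of every pair whose |correlation| exceeds threshold."""
    corr = df.select_dtypes(include="number").corr().abs()
    # Keep only the upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


# Toy data: "b" duplicates the information in "a"; "c" has one missing value.
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    "c": [1.0, -1.0, np.nan, -1.0, 1.0, -1.0],
})
cleaned = drop_correlated(impute_with_mean(toy))
print(list(cleaned.columns))
```

In a real pipeline the column means would be computed on the calibration data only and then reused for the validation data, so that no information leaks across the split.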


We kept the top 10 variables by score for the modeling step.
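
For readers who have not used it, here is a small sketch of scikit-learn's SelectKBest with its default ANOVA F-test scorer, f_classif. The helper name rank_top_features and the synthetic data are assumptions for illustration; the real project ran this on the cleaned Cell2Cell variables with k = 10.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif


def rank_top_features(X, y, k=10):
    """Return the k best columns of X with their ANOVA F-scores, best first."""
    k = min(k, X.shape[1])
    selector = SelectKBest(score_func=f_classif, k=k).fit(X, y)
    scores = pd.Series(selector.scores_, index=X.columns)
    return scores[selector.get_support()].sort_values(ascending=False)


# Synthetic data for illustration only: one informative column, three noise columns.
rng = np.random.default_rng(0)
y = pd.Series(rng.integers(0, 2, size=500), name="churn")
X = pd.DataFrame({
    "signal": y + rng.normal(scale=0.5, size=500),
    "noise_1": rng.normal(size=500),
    "noise_2": rng.normal(size=500),
    "noise_3": rng.normal(size=500),
})
top = rank_top_features(X, y, k=2)
print(top)
```

The F-score measures each variable on its own, so it can miss variables that only matter in combination with others; that is a known trade-off of univariate selection.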


Data Modeling (With R/Machine Learning)

  • Compared four approaches to find the most suitable one: Random, Logistic Regression, Random Forest, and Gradient Boosting

  • Evaluated the models by the height of the lift curve and by AUC
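
To make the two evaluation metrics concrete, the sketch below computes AUC and the lift at a chosen depth in plain Python. The function names, the 50% depth, and the toy labels and scores are assumptions for illustration; the project itself did this step in R, and the post does not state at which depth the lift curve was read.

```python
def auc(y_true, scores):
    """AUC: chance that a random churner is scored above a random non-churner."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Pairwise comparison is O(n^2); fine for a sketch, slow for large data.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lift_at(y_true, scores, fraction=0.1):
    """Churn rate in the top-scored fraction divided by the overall churn rate."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    top_rate = sum(y for _, y in ranked[:n_top]) / n_top
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


# Toy labels (1 = churned) and model scores, invented for illustration.
y_true = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
print(round(auc(y_true, scores), 3), round(lift_at(y_true, scores, 0.5), 3))
```

A lift above 1 at a given depth means the model concentrates churners among its highest-scored customers better than random targeting would, which is why a random ranking is a natural reference point.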



After finding the best model, we divided customers into three groups by predicted churn level and calculated each group's customer lifetime value (CLV) to the company. This showed much more clearly that customers with a lower churn rate bring higher value.
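
The post does not give the CLV formula that was used, so the following is only a sketch under stated assumptions: a constant monthly margin, a constant monthly churn rate per group, a fixed monthly discount rate, and a 36-month horizon. The tercile grouping and every number in the example are hypothetical.

```python
def assign_groups(probs):
    """Label each customer low / medium / high by tercile of predicted churn."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    labels = [None] * len(probs)
    for rank, i in enumerate(order):
        labels[i] = ("low", "medium", "high")[min(2, rank * 3 // len(probs))]
    return labels


def clv(monthly_margin, monthly_churn, monthly_discount=0.01, horizon_months=36):
    """Discounted sum of expected monthly margins over a fixed horizon."""
    retention = 1.0 - monthly_churn
    return sum(
        monthly_margin * retention ** t / (1.0 + monthly_discount) ** t
        for t in range(horizon_months)
    )


# Hypothetical churn rates per group, with the same margin for every group.
for group, churn in [("low", 0.02), ("medium", 0.05), ("high", 0.10)]:
    print(group, round(clv(50.0, churn), 2))
```

With the margin held fixed, a lower churn rate always produces a higher CLV under this formula, which matches the direction of the finding reported above.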



Here is an explanation of the variables that have a positive effect on customer churn.




With these new insights, we recommended a proactive customer churn management program.


 
 
 

1 Comment


Cosette Romero
02 Jul 2024

One question, can you share the parameters that you used in the Random Forest and in the Gradient Boosting? btw Incredible Job!

