IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2105.07671.html
   My bibliography  Save this paper

Classifying variety of customer's online engagement for churn prediction with mixed-penalty logistic regression

Author

Listed:
  • Petra Posedel v{S}imovi'c
  • Davor Horvatic
  • Edward W. Sun

Abstract

Using big data to analyze consumer behavior can provide effective decision-making tools for preventing customer attrition (churn) in customer relationship management (CRM). Focusing on a CRM dataset with several different categories of factors that impact customer heterogeneity (i.e., usage of self-care service channels, duration of service, and responsiveness to marketing actions), we provide new predictive analytics of customer churn rate based on a machine learning method that enhances the classification of logistic regression by adding a mixed penalty term. The proposed penalized logistic regression can prevent overfitting when dealing with big data and minimize the loss function when balancing the cost from the median (absolute value) and mean (squared value) regularization. We show the analytical properties of the proposed method and its computational advantage in this research. In addition, we investigate the performance of the proposed method with a CRM data set (that has a large number of features) under different settings by efficiently eliminating the disturbance of (1) least important features and (2) sensitivity from the minority (churn) class. Our empirical results confirm the expected performance of the proposed method in full compliance with the common classification criteria (i.e., accuracy, precision, and recall) for evaluating machine learning methods.

Suggested Citation

  • Petra Posedel v{S}imovi'c & Davor Horvatic & Edward W. Sun, 2021. "Classifying variety of customer's online engagement for churn prediction with mixed-penalty logistic regression," Papers 2105.07671, arXiv.org, revised Jul 2021.
  • Handle: RePEc:arx:papers:2105.07671
    as

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2105.07671
    File Function: Latest version
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. Scott, Stephanie & Hughes, Paul & Hodgkinson, Ian & Kraus, Sascha, 2019. "Technology adoption factors in the digitization of popular culture: Analyzing the online gambling market," Technological Forecasting and Social Change, Elsevier, vol. 148(C).
    2. P. Tseng, 2001. "Convergence of a Block Coordinate Descent Method for Nondifferentiable Minimization," Journal of Optimization Theory and Applications, Springer, vol. 109(3), pages 475-494, June.
    3. Zeineb Affes & Rania Hentati-Kaffel, 2016. "Predicting US banks bankruptcy: logit versus Canonical Discriminant analysis," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-01281948, HAL.
    4. Arno de Caigny & Kristof Coussement & Koen W. de Bock, 2018. "A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees," Post-Print hal-01741661, HAL.
    5. Zeineb Affes & Rania Hentati-Kaffel, 2019. "Predicting US Banks Bankruptcy: Logit Versus Canonical Discriminant Analysis," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) hal-03045837, HAL.
    6. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    7. Amin, Adnan & Al-Obeidat, Feras & Shah, Babar & Adnan, Awais & Loo, Jonathan & Anwar, Sajid, 2019. "Customer churn prediction in telecommunication industry using data certainty," Journal of Business Research, Elsevier, vol. 94(C), pages 290-301.
    8. De Caigny, Arno & Coussement, Kristof & De Bock, Koen W., 2018. "A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees," European Journal of Operational Research, Elsevier, vol. 269(2), pages 760-772.
    9. Coussement, Kristof & De Bock, Koen W., 2013. "Customer churn prediction in the online gambling industry: The beneficial effect of ensemble learning," Journal of Business Research, Elsevier, vol. 66(9), pages 1629-1636.
    10. Zeineb Affes & Rania Hentati-Kaffel, 2019. "Predicting US Banks Bankruptcy: Logit Versus Canonical Discriminant Analysis," Post-Print hal-03045837, HAL.
    11. Konietzny, Jirka & Caruana, Albert & Cassar, Mario L., 2018. "Fun and fair, and I don’t care: The role of enjoyment, fairness and subjective norms on online gambling intentions," Journal of Retailing and Consumer Services, Elsevier, vol. 44(C), pages 91-99.
    12. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    13. Yan Zhang & Peter Trubey, 2019. "Machine Learning and Sampling Scheme: An Empirical Study of Money Laundering Detection," Computational Economics, Springer;Society for Computational Economics, vol. 54(3), pages 1043-1063, October.
    14. Hing, Nerilee & Lamont, Matthew & Vitartas, Peter & Fink, Elian, 2015. "Sports bettors' responses to sports-embedded gambling promotions: Implications for compulsive consumption," Journal of Business Research, Elsevier, vol. 68(10), pages 2057-2066.
    15. Zeineb Affes & Rania Hentati-Kaffel, 2019. "Predicting US Banks Bankruptcy: Logit Versus Canonical Discriminant Analysis," Computational Economics, Springer;Society for Computational Economics, vol. 54(1), pages 199-244, June.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Petra P. Šimović & Claire Y. T. Chen & Edward W. Sun, 2023. "Classifying the Variety of Customers’ Online Engagement for Churn Prediction with a Mixed-Penalty Logistic Regression," Computational Economics, Springer;Society for Computational Economics, vol. 61(1), pages 451-485, January.
    2. Youssef Zizi & Mohamed Oudgou & Abdeslam El Moudden, 2020. "Determinants and Predictors of SMEs’ Financial Failure: A Logistic Regression Approach," Risks, MDPI, vol. 8(4), pages 1-21, October.
    3. Matthias Bogaert & Lex Delaere, 2023. "Ensemble Methods in Customer Churn Prediction: A Comparative Analysis of the State-of-the-Art," Mathematics, MDPI, vol. 11(5), pages 1-28, February.
    4. Chen, Yan & Zhang, Lei & Zhao, Yulu & Xu, Bing, 2022. "Implementation of penalized survival models in churn prediction of vehicle insurance," Journal of Business Research, Elsevier, vol. 153(C), pages 162-171.
    5. Elena Gregova & Katarina Valaskova & Peter Adamko & Milos Tumpach & Jaroslav Jaros, 2020. "Predicting Financial Distress of Slovak Enterprises: Comparison of Selected Traditional and Learning Algorithms Methods," Sustainability, MDPI, vol. 12(10), pages 1-17, May.
    6. Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
    7. O. Vasiurenko & V. LYASHENKO, 2020. "Wavelet coherence as a tool for retrospective analysis of bank activities," Economy and Forecasting, Valeriy Heyets, issue 2, pages 43-60.
    8. Elena Ivona DUMITRESCU & Sullivan HUE & Christophe HURLIN & Sessi TOKPAVI, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    9. Youssef Zizi & Amine Jamali-Alaoui & Badreddine El Goumi & Mohamed Oudgou & Abdeslam El Moudden, 2021. "An Optimal Model of Financial Distress Prediction: A Comparative Study between Neural Networks and Logistic Regression," Risks, MDPI, vol. 9(11), pages 1-24, November.
    10. Magdalena Brygała, 2022. "Consumer Bankruptcy Prediction Using Balanced and Imbalanced Data," Risks, MDPI, vol. 10(2), pages 1-13, January.
    11. Chou, Ping & Chuang, Howard Hao-Chun & Chou, Yen-Chun & Liang, Ting-Peng, 2022. "Predictive analytics for customer repurchase: Interdisciplinary integration of buy till you die modeling and machine learning," European Journal of Operational Research, Elsevier, vol. 296(2), pages 635-651.
    12. Jung, Yoon Mo & Whang, Joyce Jiyoung & Yun, Sangwoon, 2020. "Sparse probabilistic K-means," Applied Mathematics and Computation, Elsevier, vol. 382(C).
    13. Koen W. de Bock & Arno de Caigny, 2021. "Spline-rule ensemble classifiers with structured sparsity regularization for interpretable customer churn modeling," Post-Print hal-03391564, HAL.
    14. Louis Geiler & Séverine Affeldt & Mohamed Nadif, 2022. "A survey on machine learning methods for churn prediction," Post-Print hal-03824873, HAL.
    15. Nicholson, William B. & Matteson, David S. & Bien, Jacob, 2017. "VARX-L: Structured regularization for large vector autoregressions with exogenous variables," International Journal of Forecasting, Elsevier, vol. 33(3), pages 627-651.
    16. David Degras, 2021. "Sparse group fused lasso for model segmentation: a hybrid approach," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 15(3), pages 625-671, September.
    17. Zanin, Luca, 2020. "Combining multiple probability predictions in the presence of class imbalance to discriminate between potential bad and good borrowers in the peer-to-peer lending market," Journal of Behavioral and Experimental Finance, Elsevier, vol. 25(C).
    18. Gattermann-Itschert, Theresa & Thonemann, Ulrich W., 2021. "How training on multiple time slices improves performance in churn prediction," European Journal of Operational Research, Elsevier, vol. 295(2), pages 664-674.
    19. Lewlisa Saha & Hrudaya Kumar Tripathy & Tarek Gaber & Hatem El-Gohary & El-Sayed M. El-kenawy, 2023. "Deep Churn Prediction Method for Telecommunication Industry," Sustainability, MDPI, vol. 15(5), pages 1-21, March.
    20. Yanming Li & Bin Nan & Ji Zhu, 2015. "Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure," Biometrics, The International Biometric Society, vol. 71(2), pages 354-363, June.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2105.07671. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.