Printed from https://ideas.repec.org/a/gam/jdataj/v10y2025i7p96-d1685465.html

A Data Imputation Strategy to Enhance Online Game Churn Prediction, Considering Non-Login Periods

Authors
  • JaeHong Lee

    (School of Information, Computer, and Communication Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathum Thani 12120, Thailand)

  • Pavinee Rerkjirattikal

    (Department of Technology and Operations Management, Faculty of Business Administration, Kasetsart University, Bangkok 10900, Thailand)

  • SangGyu Nam

    (School of Information, Computer, and Communication Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathum Thani 12120, Thailand)

Abstract

User churn in online games refers to players becoming inactive for an extended period. Even a small increase in churn can lead to significant revenue loss, making churn prediction crucial for sustaining long-term player engagement. Although user churn prediction has been extensively studied, most existing approaches either ignore non-login periods or treat all inactivity uniformly, overlooking key behavioral differences. This study addresses this gap by categorizing non-login periods into three types: inactivity due to new or dormant users, genuine loss of interest, and temporary inaccessibility caused by external factors. Depending on the type, these periods are treated as either non-existent or missing data, and the missing values are imputed using techniques such as mean or mode substitution, linear interpolation, and multiple imputation by chained equations (MICE). MICE is selected for its ability to impute missing values more robustly by considering multivariate relationships. A random forest (RF) classifier, chosen for its interpretability and robustness to incomplete data, serves as the primary prediction model. Additionally, classifier chains are used to capture label dependencies, and principal component analysis (PCA) is applied to reduce dimensionality and mitigate overfitting. Experiments on real-world MMORPG data show that this approach improves predictive accuracy, achieving a micro-averaged AUC above 0.92 and a weighted F1 score exceeding 0.70. These findings suggest that the approach improves churn prediction and offers actionable insights for supporting personalized player retention strategies.
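To make the idea of treating non-login days as missing data more concrete, the following is a minimal sketch of the two simplest imputation techniques named in the abstract, mean substitution and linear interpolation, applied to a single player's daily series. It is an illustration only, not the authors' implementation: the `playtime` feature, the function names, and the choice to fill leading or trailing gaps with the nearest observed value are all assumptions, and MICE (which draws on relationships across several features) is not shown.

```python
from typing import List, Optional

# A daily feature series; None marks a non-login day treated as missing.
Series = List[Optional[float]]


def mean_impute(series: Series) -> List[float]:
    """Replace every missing value with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("cannot impute a series with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]


def linear_interpolate(series: Series) -> List[float]:
    """Fill interior gaps on a straight line between neighbouring observations.

    Assumption: gaps before the first or after the last observation are
    filled with the nearest observed value, since there is nothing to
    interpolate between.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("cannot impute a series with no observed values")
    filled = list(series)
    # Leading and trailing gaps: carry the nearest observation outward.
    for i in range(known[0]):
        filled[i] = series[known[0]]
    for i in range(known[-1] + 1, len(series)):
        filled[i] = series[known[-1]]
    # Interior gaps: step linearly from the left neighbour to the right one.
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


# Hypothetical daily playtime in minutes; None = no login that day.
playtime = [60.0, None, None, 30.0, None, 90.0]

print(mean_impute(playtime))
print(linear_interpolate(playtime))
```

Mean substitution fills every gap with the same value, whereas interpolation follows the trend between the surrounding login days; neither uses information from other features, which is the limitation the abstract cites as the reason for preferring MICE.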

Suggested Citation

  • JaeHong Lee & Pavinee Rerkjirattikal & SangGyu Nam, 2025. "A Data Imputation Strategy to Enhance Online Game Churn Prediction, Considering Non-Login Periods," Data, MDPI, vol. 10(7), pages 1-20, June.
  • Handle: RePEc:gam:jdataj:v:10:y:2025:i:7:p:96-:d:1685465

    Download full text from publisher

    File URL: https://www.mdpi.com/2306-5729/10/7/96/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2306-5729/10/7/96/
    Download Restriction: no
