Printed from https://ideas.repec.org/p/aob/wpaper/21.html

Consumer credit risk analysis via machine learning algorithms

Author

Listed:
  • Байкулаков Шалкар // Baikulakov Shalkar

    (Center for the Development of Payment and Financial Technologies)

  • Белгибаев Зангар // Belgibayev Zanggar

    (National Bank of Kazakhstan)

Abstract

This study attempts to assess the creditworthiness of individuals using machine learning algorithms applied to regulatory data that second-tier banks provide to the National Bank of the Republic of Kazakhstan (NBRK). Assessing borrowers' creditworthiness allows the NBRK to examine the quality of loans issued by second-tier banks and to forecast potential systemic risks. Two linear and six nonlinear classification methods were applied (linear models: logistic regression and stochastic gradient descent; nonlinear models: neural networks, k-nearest neighbors (kNN), decision tree, random forest, XGBoost, and Naïve Bayes), and the algorithms were compared on accuracy, precision, and several other metrics. The nonlinear models produce more accurate predictions than the linear models. In particular, nonlinear models such as the random forest and kNN classifiers on oversampled data showed the most promising results.
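The abstract names accuracy, precision, and oversampling without defining them. The sketch below is one way to read those terms; it is not the authors' code. The toy labels, the two hand-written "models", and the choice of simple random oversampling with replacement are all assumptions made for illustration, since the abstract does not say which oversampling method the paper used.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Share of predicted positives that are truly positive."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_pos:
        return 0.0  # convention chosen here: no positive predictions -> 0
    return sum(t == positive for t in predicted_pos) / len(predicted_pos)


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 1 = default, 0 = repaid (8 good loans, 2 defaults).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_good = [0] * 10                     # a "model" that never flags a default
cautious = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]  # flags three loans, two of them correctly

acc_always = accuracy(y_true, always_good)
prec_always = precision(y_true, always_good)
acc_cautious = accuracy(y_true, cautious)
prec_cautious = precision(y_true, cautious)

rows = [[i] for i in range(10)]            # placeholder single-feature rows
bal_rows, bal_labels = random_oversample(rows, y_true)
balanced_counts = Counter(bal_labels)

print(acc_always, prec_always)      # high accuracy, zero precision
print(acc_cautious, prec_cautious)
print(dict(balanced_counts))        # both classes now have the same count
```

On imbalanced loan data a model that never predicts default can still score well on accuracy, which is presumably why the comparison also relies on precision and other metrics. When oversampling is used, it is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.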

Suggested Citation

  • Байкулаков Шалкар // Baikulakov Shalkar & Белгибаев Зангар // Belgibayev Zanggar, 2021. "Consumer credit risk analysis via machine learning algorithms," Working Papers #2021-4, National Bank of Kazakhstan.
  • Handle: RePEc:aob:wpaper:21

    Download full text from publisher

    File URL: https://nationalbank.kz/file/download/68411
    File Function: Russian language version
    Download Restriction: no

    File URL: https://nationalbank.kz/file/download/68412
    File Function: English language version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xin Xu & Feng Xiong & Zhe An, 2023. "Using Machine Learning to Predict Corporate Fraud: Evidence Based on the GONE Framework," Journal of Business Ethics, Springer, vol. 186(1), pages 137-158, August.
    2. Oguz Koc & Omur Ugur & A. Sevtap Kestel, 2023. "The Impact of Feature Selection and Transformation on Machine Learning Methods in Determining the Credit Scoring," Papers 2303.05427, arXiv.org.
    3. Ivan Tikshaev & Roman Kulshin & Gennadii Volokitin & Pavel Senchenko & Anatoly Sidorov, 2022. "The Possibilities of Using Scoring to Determine the Relevance of Software Development Tenders," Mathematics, MDPI, vol. 10(24), pages 1-13, December.
    4. Pejman Peykani & Mostafa Sargolzaei & Mohammad Hashem Botshekan & Camelia Oprean-Stan & Amir Takaloo, 2023. "Optimization of Asset and Liability Management of Banks with Minimum Possible Changes," Mathematics, MDPI, vol. 11(12), pages 1-24, June.
    5. Raad Khraishi & Ramin Okhrati, 2022. "Offline Deep Reinforcement Learning for Dynamic Pricing of Consumer Credit," Papers 2203.03003, arXiv.org.
    6. Dmytro Krukovets, 2020. "Data Science Opportunities at Central Banks: Overview," Visnyk of the National Bank of Ukraine, National Bank of Ukraine, issue 249, pages 13-24.
    7. Juan Laborda & Seyong Ryoo, 2021. "Feature Selection in a Credit Scoring Model," Mathematics, MDPI, vol. 9(7), pages 1-22, March.
    8. Anil Kumar & Suneel Sharma & Mehregan Mahdavi, 2021. "Machine Learning (ML) Technologies for Digital Credit Scoring in Rural Finance: A Literature Review," Risks, MDPI, vol. 9(11), pages 1-15, October.
    9. Sunghyon Kyeong & Daehee Kim & Jinho Shin, 2021. "Can System Log Data Enhance the Performance of Credit Scoring?—Evidence from an Internet Bank in Korea," Sustainability, MDPI, vol. 14(1), pages 1-12, December.
    10. Victor Flores & Brian Keith, 2019. "Gradient Boosted Trees Predictive Models for Surface Roughness in High-Speed Milling in the Steel and Aluminum Metalworking Industry," Complexity, Hindawi, vol. 2019, pages 1-15, July.
    11. Guoquan Zhang & Guohao Li & Jing Peng, 2020. "Risk Assessment and Monitoring of Green Logistics for Fresh Produce Based on a Support Vector Machine," Sustainability, MDPI, vol. 12(18), pages 1-20, September.

    More about this item

    Keywords

    consumer credits; machine learning; bank regulation; stochastic gradient descent (linear model); logistic regression (linear model); kNN (neighbors); random forest classifier (ensemble); decision tree (tree); Gaussian NB (naïve Bayes); XGBoost; neural network (MLP classifier)

    JEL classification:

    • G21 - Financial Economics - Financial Institutions and Services - Banks; Other Depository Institutions; Micro Finance Institutions; Mortgages
    • G28 - Financial Economics - Financial Institutions and Services - Government Policy and Regulation
    • E37 - Macroeconomics and Monetary Economics - Prices, Business Fluctuations, and Cycles - Forecasting and Simulation: Models and Applications
    • E51 - Macroeconomics and Monetary Economics - Monetary Policy, Central Banking, and the Supply of Money and Credit - Money Supply; Credit; Money Multipliers
