Printed from https://ideas.repec.org/a/eee/intfor/v41y2025i3p894-919.html

Credit scoring model for fintech lending: An integration of large language models and FocalPoly loss

Authors

Listed:
  • Xia, Yufei
  • Han, Zhiyin
  • Li, Yawen
  • He, Lingyun

Abstract

Fintech lending experiences high credit risk and needs an efficient credit scoring model, but it also faces limited data sources and severe class imbalance. We develop a novel two-stage credit scoring model (called LLM-FP-CatBoost) by solving the two issues simultaneously. Large language models (LLMs) initially extract narrative data as a supplementary credit dataset. A new FocalPoly loss is then incorporated with CatBoost to handle the class imbalance problem. Extensive comparisons demonstrate that the proposed LLM-FP-CatBoost significantly outperforms the benchmarks in most circumstances. When making pairwise comparisons between LLMs on the fintech lending dataset, we found that the Chinese-specific LLM, i.e., ERNIE 4.0, achieves the best overall performance, followed by GPT-4 and BERT-based models. The performance decomposition reveals that the superiority is mainly attributed to the new data source extracted by the LLMs. The SHAP algorithm further ensures the interpretability of LLM-FP-CatBoost. The superiority of the proposed LLM-FP-CatBoost model remains robust to hyperparameters of the loss function, specific LLMs, and other extraction methods of narrative data. Finally, we discuss some managerial implications concerning credit scoring in fintech lending.
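
The abstract does not define the FocalPoly loss. As a rough illustration only, the sketch below assumes it resembles a binary focal loss with one added polynomial correction term, in the style of Poly-1 losses. The function name `focal_poly_loss` and the parameters `gamma` and `epsilon` are hypothetical and are not taken from the paper, and the sketch is not wired into CatBoost.

```python
import numpy as np


def focal_poly_loss(y_true, p_pred, gamma=2.0, epsilon=1.0, clip=1e-12):
    """Per-sample loss: focal term plus one polynomial term (assumed form).

    y_true : array of 0/1 labels (1 = default, the minority class).
    p_pred : array of predicted probabilities of class 1.
    gamma  : focusing parameter; larger values down-weight easy samples.
    epsilon: weight of the extra polynomial term (hypothetical name).
    """
    y = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log() never receives exactly 0.
    p = np.clip(np.asarray(p_pred, dtype=float), clip, 1.0 - clip)
    # Probability assigned to the true class of each sample.
    pt = np.where(y == 1.0, p, 1.0 - p)
    # Focal term: cross-entropy scaled by (1 - pt)^gamma.
    focal = -((1.0 - pt) ** gamma) * np.log(pt)
    # Polynomial correction term, assumed to follow the Poly-1 pattern.
    poly = epsilon * (1.0 - pt) ** (gamma + 1.0)
    return focal + poly


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    probs = [0.9, 0.1, 0.2, 0.8]
    print(focal_poly_loss(labels, probs))
```

With `gamma=0` and `epsilon=0` this assumed form reduces to ordinary binary cross-entropy, which is a convenient sanity check. Misclassified samples, which under class imbalance are often minority-class defaults, receive a much larger loss than confidently correct ones.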

Suggested Citation

  • Xia, Yufei & Han, Zhiyin & Li, Yawen & He, Lingyun, 2025. "Credit scoring model for fintech lending: An integration of large language models and FocalPoly loss," International Journal of Forecasting, Elsevier, vol. 41(3), pages 894-919.
  • Handle: RePEc:eee:intfor:v:41:y:2025:i:3:p:894-919
    DOI: 10.1016/j.ijforecast.2024.07.005

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0169207024000724
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.ijforecast.2024.07.005?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item