Evaluating Local Language Models: An Application to Bank Earnings Calls

Author

Thomas R. Cook
Sophia Kazinnik
Anne Lundgaard Hansen
Peter McAdam

Abstract

This study evaluates the performance of local large language models (LLMs) in interpreting financial texts, compared with closed-source, cloud-based models. We first introduce new benchmarking tasks for assessing LLM performance in analyzing financial and economic texts and explore the refinements needed to improve their performance. Our benchmarking results suggest local LLMs are a viable tool for general natural language processing analysis of these texts. We then leverage local LLMs to analyze the tone and substance of bank earnings calls in the post-pandemic era, including calls conducted during the banking stress of early 2023. We analyze remarks in bank earnings calls in terms of topics discussed, overall sentiment, temporal orientation, and vagueness. We find that after the banking stress in early 2023, banks tended to converge to a similar set of topics for discussion and to espouse a distinctly less positive sentiment.

Suggested Citation

Thomas R. Cook & Sophia Kazinnik & Anne Lundgaard Hansen & Peter McAdam, 2023. "Evaluating Local Language Models: An Application to Bank Earnings Calls," Research Working Paper RWP 23-12, Federal Reserve Bank of Kansas City.

Handle: RePEc:fip:fedkrw:97255

References listed on IDEAS

Alejandro Lopez-Lira & Yuehua Tang, 2023. "Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models," Papers 2304.07619, arXiv.org, revised Oct 2025.
Chris Chatfield, 1995. "Model Uncertainty, Data Mining and Statistical Inference," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 158(3), pages 419-444, May.
In-Koo Cho & David M. Kreps, 1987. "Signaling Games and Stable Equilibria," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 102(2), pages 179-221.
- In-Koo Cho & David M. Kreps, 1997. "Signaling Games and Stable Equilibria," Levine's Working Paper Archive 896, David K. Levine.
De Amicis, Chiara & Falconieri, Sonia & Tastan, Mesut, 2021. "Sentiment analysis and gender differences in earnings conference calls," Journal of Corporate Finance, Elsevier, vol. 71(C).
Pekka Malo & Ankur Sinha & Pekka Korhonen & Jyrki Wallenius & Pyry Takala, 2014. "Good debt or bad debt: Detecting semantic orientations in economic texts," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 65(4), pages 782-796, April.
- Pekka Malo & Ankur Sinha & Pyry Takala & Pekka Korhonen & Jyrki Wallenius, 2013. "Good Debt or Bad Debt: Detecting Semantic Orientations in Economic Texts," Papers 1307.5336, arXiv.org, revised Jul 2013.

Citations

Cited by:

Christian Fieberg & Lars Hornuf & Maximilian Meiler & David J. Streich, 2025. "Using Large Language Models for Financial Advice," CESifo Working Paper Series 11666, CESifo.
George Fatouros & Kostas Metaxas & John Soldatos & Manos Karathanassis, 2025. "MarketSenseAI 2.0: Enhancing Stock Analysis through LLM Agents," Papers 2502.00415, arXiv.org, revised Oct 2025.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Julian Junyan Wang & Victor Xiaoqi Wang, 2025. "Assessing Consistency and Reproducibility in the Outputs of Large Language Models: Evidence Across Diverse Finance and Accounting Tasks," Papers 2503.16974, arXiv.org, revised Sep 2025.
Yuqi Nie & Yaxuan Kong & Xiaowen Dong & John M. Mulvey & H. Vincent Poor & Qingsong Wen & Stefan Zohren, 2024. "A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges," Papers 2406.11903, arXiv.org.
Baptiste Lefort & Eric Benhamou & Jean-Jacques Ohana & David Saltiel & Beatrice Guez, 2024. "Optimizing Performance: How Compact Models Match or Exceed GPT's Classification Capabilities through Fine-Tuning," Papers 2409.11408, arXiv.org.
Claudia García-García & Catalina B. García-García & Román Salmerón, 2021. "Confronting collinearity in environmental regression models: evidence from world data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(3), pages 895-926, September.
Andrés Perea & Elias Tsakas, 2019. "Limited focus in dynamic games," International Journal of Game Theory, Springer;Game Theory Society, vol. 48(2), pages 571-607, June.
Anders Gustafsson, 2019. "Busy doing nothing: why politicians implement inefficient policies," Constitutional Political Economy, Springer, vol. 30(3), pages 282-299, September.
- Gustafsson, Anders, 2019. "Busy Doing Nothing – Why Politicians Implement Inefficient Policies," Ratio Working Papers 321, The Ratio Institute.
Mario Gilli & Yuan Li, 2014. "Accountability in One-Party Government: Rethinking the Success of Chinese Economic Reform," Journal of Institutional and Theoretical Economics (JITE), Mohr Siebeck, Tübingen, vol. 170(4), pages 616-645, December.
Thomas de Haan & Theo Offerman & Randolph Sloof, 2015. "Money Talks? An Experimental Investigation Of Cheap Talk And Burned Money," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 56(4), pages 1385-1426, November.
- Thomas de Haan & Theo Offerman & Randolph Sloof, 2011. "Money talks? An Experimental Investigation of Cheap Talk and Burned Money," Tinbergen Institute Discussion Papers 11-069/1, Tinbergen Institute.
Espínola-Arredondo, Ana & Muñoz-García, Félix, 2013. "When does environmental regulation facilitate entry-deterring practices," Journal of Environmental Economics and Management, Elsevier, vol. 65(1), pages 133-152.
Eduardo Perez & Delphine Prady, 2012. "Complicating to Persuade?," Working Papers hal-03583827, HAL.
- Eduardo Perez & Delphine Prady, 2012. "Complicating to Persuade?," SciencePo Working papers hal-03583827, HAL.
- Eduardo Perez-Richet & Delphine Prady, 2012. "Complicating to Persuade?," Working Papers hal-00675135, HAL.
- Eduardo Perez & Delphine Prady, 2012. "Complicating to Persuade?," Sciences Po publications info:hdl:2441/5mao0mthj59, Sciences Po.
Chou, Ping & Chuang, Howard Hao-Chun & Chou, Yen-Chun & Liang, Ting-Peng, 2022. "Predictive analytics for customer repurchase: Interdisciplinary integration of buy till you die modeling and machine learning," European Journal of Operational Research, Elsevier, vol. 296(2), pages 635-651.
Fabrizio Adriani & Giancarlo Marini & Pasquale Scaramozzino, 2009. "The Inflationary Consequences of a Currency Changeover on the Catering Sector: Evidence from the Michelin Red Guide," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 71(1), pages 111-133, February.
- Fabrizio Adriani & Giancarlo Marini & Pasquale Scaramozzino, 2008. "The Inflationary Consequences of a Currency Changeover on the Catering Sector: Evidence from the Michelin Red Guide," Bristol Economics Discussion Papers 08/604, School of Economics, University of Bristol, UK.
Pim Heijnen, 2013. "Informative advertising by an environmental group," Journal of Economics, Springer, vol. 108(3), pages 249-272, April.
- Heijnen, P., 2007. "Informative advertising by an environmental group," CeNDEF Working Papers 07-02, Universiteit van Amsterdam, Center for Nonlinear Dynamics in Economics and Finance.
Dutta, Bhaskar & Vohra, Rajiv, 2005. "Incomplete information, credibility and the core," Mathematical Social Sciences, Elsevier, vol. 50(2), pages 148-165, September.
- Bhaskar Dutta & Rajiv Vohra, 2001. "Incomplete Information, Credibility and the Core," Working Papers 2001-02, Brown University, Department of Economics.
- Rajiv Vohra & Bhaskar Dutta, 2003. "Incomplete Information, Credibility and the Core," Working Papers 2003-21, Brown University, Department of Economics.
Riccardo (Jack) Lucchetti & Luca Pedini, 2020. "ParMA: Parallelised Bayesian Model Averaging for Generalised Linear Models," Working Papers 2020:28, Department of Economics, University of Venice "Ca' Foscari".
Arnold, M., 2017. "The impact of central clearing on banks’ lending discipline," Journal of Financial Markets, Elsevier, vol. 36(C), pages 91-114.
Takaaki Hamada, 2020. "Implications of the Tradeoff between Inside and Outside Social Status in Group Choice," Papers 2008.10145, arXiv.org.
Vladimirov, Vladimir, 2015. "Financing bidders in takeover contests," Journal of Financial Economics, Elsevier, vol. 117(3), pages 534-557.
Miguel Ángel Ropero, 2021. "Entry deterrence when the potential entrant is your competitor in a different market," Southern Economic Journal, John Wiley & Sons, vol. 87(3), pages 1010-1030, January.
Anton Bondarev & Beat Hintermann & Frank C. Krysiak & Ralph Winkler, 2017. "The Intricacy of Adapting to Climate Change: Flood Protection as a Local Public Goods Game," CESifo Working Paper Series 6382, CESifo.

More about this item

JEL classification:

C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
G21 - Financial Economics - - Financial Institutions and Services - - - Banks; Other Depository Institutions; Micro Finance Institutions; Mortgages

NEP fields

This paper has been announced in the following NEP Reports:

NEP-AIN-2023-12-04 (Artificial Intelligence)
NEP-BAN-2023-12-04 (Banking)
NEP-BIG-2023-12-04 (Big Data)
NEP-CMP-2023-12-04 (Computational Economics)
