
Zero is Not Hero Yet: Benchmarking Zero-Shot Performance of LLMs for Financial Tasks

Authors

Listed:
  • Agam Shah
  • Sudheer Chava

Abstract

Recently, large language models (LLMs) such as ChatGPT have shown impressive zero-shot performance on many natural language processing tasks. In this paper, we investigate the effectiveness of zero-shot LLMs in the financial domain. We compare the performance of ChatGPT and some open-source generative LLMs in zero-shot mode with that of RoBERTa fine-tuned on annotated data. We address three interrelated research questions on data annotation, performance gaps, and the feasibility of employing generative models in the finance domain. Our findings demonstrate that ChatGPT performs well even without labeled data, but fine-tuned models generally outperform it. Our research also highlights how annotating with generative models can be time-intensive. Our codebase is publicly available on GitHub under the CC BY-NC 4.0 license.

Suggested Citation

  • Agam Shah & Sudheer Chava, 2023. "Zero is Not Hero Yet: Benchmarking Zero-Shot Performance of LLMs for Financial Tasks," Papers 2305.16633, arXiv.org.
  • Handle: RePEc:arx:papers:2305.16633

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2305.16633
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Agam Shah & Suvan Paturi & Sudheer Chava, 2023. "Trillion Dollar Words: A New Financial Dataset, Task & Market Analysis," Papers 2305.07972, arXiv.org.
    2. Martin Baumgaertner & Johannes Zahner, 2021. "Whatever it takes to understand a central banker - Embedding their words using neural networks," MAGKS Papers on Economics 202130, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    3. Paloviita, Maritta & Haavio, Markus & Jalasjoki, Pirkka & Kilponen, Juha & Vänni, Ilona, 2020. "Reading between the lines : Using text analysis to estimate the loss function of the ECB," Research Discussion Papers 12/2020, Bank of Finland.
    4. Parle, Conor, 2022. "The financial market impact of ECB monetary policy press conferences — A text based approach," European Journal of Political Economy, Elsevier, vol. 74(C).
    5. Baumgärtner, Martin & Zahner, Johannes, 2023. "Whatever it takes to understand a central banker: Embedding their words using neural networks," IMFS Working Paper Series 194, Goethe University Frankfurt, Institute for Monetary and Financial Stability (IMFS).
    6. repec:zbw:bofrdp:2020_012 is not listed on IDEAS
    7. Szyszko, Magdalena & Rutkowska, Aleksandra & Kliber, Agata, 2022. "Do words affect expectations? The effect of central banks communication on consumer inflation expectations," The Quarterly Review of Economics and Finance, Elsevier, vol. 86(C), pages 221-229.
    8. Kirtac, Kemal & Germano, Guido, 2024. "Sentiment trading with large language models," Finance Research Letters, Elsevier, vol. 62(PB).
    9. Johannes Zahner, 2020. "Above, but close to two percent. Evidence on the ECB’s inflation target using text mining," MAGKS Papers on Economics 202046, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    10. Picault, Matthieu & Pinter, Julien & Renault, Thomas, 2022. "Media sentiment on monetary policy: Determinants and relevance for inflation expectations," Journal of International Money and Finance, Elsevier, vol. 124(C).
    11. Thang Ngoc Doan & Dong Phu Do & Dat Van Luong, 2023. "Monetary stance and favorableness of the monetary policy in the media: the case of Vietnam," Journal of Asian Business and Economic Studies, Emerald Group Publishing Limited, vol. 31(2), pages 111-123, August.
    12. Kirtac, Kemal & Germano, Guido, 2024. "Sentiment trading with large language models," LSE Research Online Documents on Economics 122592, London School of Economics and Political Science, LSE Library.
    13. Yuriy Gorodnichenko & Tho Pham & Oleksandr Talavera, 2023. "The Voice of Monetary Policy," American Economic Review, American Economic Association, vol. 113(2), pages 548-584, February.
    14. Linas Jurkšas & Rokas Kaminskas, 2023. "ECB monetary policy communication: does it move euro area yields?," Bank of Lithuania Discussion Paper Series 29, Bank of Lithuania.
    15. Kwok Ping Tsang & Zichao Yang, 2023. "Agree to Disagree: Measuring Hidden Dissent in FOMC Meetings," Papers 2308.10131, arXiv.org, revised Nov 2024.
    16. Paweł Baranowski & Hamza Bennani & Wirginia Doryń, 2020. "Do ECB introductory statements help to predict monetary policy: evidence from tone analysis," NBP Working Papers 323, Narodowy Bank Polski.
    17. Joaquin Iglesias & Alvaro Ortiz & Tomasa Rodrigo, 2017. "How do the EM Central Bank talk? A Big Data approach to the Central Bank of Turkey," Working Papers 17/24, BBVA Bank, Economic Research Department.
    18. Donato Masciandaro & Oana Peia & Davide Romelli, 2024. "Central bank communication and social media: From silence to Twitter," Journal of Economic Surveys, Wiley Blackwell, vol. 38(2), pages 365-388, April.
    19. Valerio Astuti & Alessio Ciarlone & Alberto Coco, 2022. "The role of central bank communication in inflation-targeting Eastern European emerging economies," Temi di discussione (Economic working papers) 1381, Bank of Italy, Economic Research and International Relations Area.
    20. Fulop, Andras & Kocsis, Zalan, 2023. "News indices on country fundamentals," Journal of Banking & Finance, Elsevier, vol. 154(C).
    21. Magdalena Szyszko & Aleksandra Rutkowska, 2022. "Do words transform into actions? The consistency of central banks’ communications and decisions," Equilibrium. Quarterly Journal of Economics and Economic Policy, Institute of Economic Research, vol. 17(1), pages 31-49, March.
