Printed from https://ideas.repec.org/p/arx/papers/2603.10807.html

Risk-Adjusted Harm Scoring for Automated Red Teaming for LLMs in Financial Services

Authors

  • Fabrizio Dimino
  • Bhaskarjit Sarmah
  • Stefano Pasquali

Abstract

The rapid adoption of large language models (LLMs) in financial services introduces new operational, regulatory, and security risks. Yet most red-teaming benchmarks remain domain-agnostic and fail to capture failure modes specific to regulated BFSI settings, where harmful behavior can be elicited through legally or professionally plausible framing. We propose a risk-aware evaluation framework for LLM security failures in Banking, Financial Services, and Insurance (BFSI), combining a domain-specific taxonomy of financial harms, an automated multi-round red-teaming pipeline, and an ensemble-based judging protocol. We introduce the Risk-Adjusted Harm Score (RAHS), a risk-sensitive metric that goes beyond success rates by quantifying the operational severity of disclosures, accounting for mitigation signals, and leveraging inter-judge agreement. Across diverse models, we find that higher decoding stochasticity and sustained adaptive interaction not only increase jailbreak success, but also drive systematic escalation toward more severe and operationally actionable financial disclosures. These results expose limitations of single-turn, domain-agnostic security evaluation and motivate risk-sensitive assessment under prolonged adversarial pressure for real-world BFSI deployment.
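The abstract describes RAHS as combining three ingredients: the operational severity of a disclosure, mitigation signals, and inter-judge agreement. As an illustration only, the sketch below shows one plausible way such an aggregation could be composed; the function name, the specific combination rule, and all inputs are assumptions for exposition, since the paper's actual formula is not given here.

```python
from statistics import mean, pstdev

def rahs(judge_severities, mitigation_signal):
    """Illustrative risk-adjusted harm aggregation (hypothetical formula,
    not the paper's definition).

    judge_severities  -- per-judge severity ratings of a disclosure, in [0, 1]
    mitigation_signal -- strength of mitigating behavior observed in the
                         response (refusals, warnings), in [0, 1]
    """
    # Operational severity: average rating across the judge ensemble.
    base = mean(judge_severities)
    # Inter-judge agreement: discount the score when judges disagree,
    # here via 1 minus the population standard deviation of ratings.
    agreement = 1.0 - pstdev(judge_severities)
    # Mitigation signals reduce the effective harm of the disclosure.
    return base * agreement * (1.0 - mitigation_signal)
```

Under this toy rule, unanimous judges contribute full weight, disagreement shrinks the score, and a fully mitigated response scores zero, which matches the qualitative behavior the abstract attributes to RAHS.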

Suggested Citation

  • Fabrizio Dimino & Bhaskarjit Sarmah & Stefano Pasquali, 2026. "Risk-Adjusted Harm Scoring for Automated Red Teaming for LLMs in Financial Services," Papers 2603.10807, arXiv.org.
  • Handle: RePEc:arx:papers:2603.10807

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2603.10807
    File Function: Latest version
    Download Restriction: no


    Most related items

These items most often cite the same works as this one and are cited by the same works as this one.
    1. Benjamin Coriat & Eric Benhamou, 2025. "HARLF: Hierarchical Reinforcement Learning and Lightweight LLM-Driven Sentiment Integration for Financial Portfolio Optimization," Papers 2507.18560, arXiv.org.
    2. Hongshen Sun & Juanjuan Zhang, 2025. "From Model Choice to Model Belief: Establishing a New Measure for LLM-Based Research," Papers 2512.23184, arXiv.org.
    3. Tirulo, Aschalew & Yadav, Monika & Lolamo, Mathewos & Chauhan, Siddhartha & Siano, Pierluigi & Shafie-khah, Miadreza, 2026. "Beyond automation: Unveiling the potential of agentic intelligence," Renewable and Sustainable Energy Reviews, Elsevier, vol. 226(PA).
    4. Jimin Huang & Mengxi Xiao & Dong Li & Zihao Jiang & Yuzhe Yang & Yifei Zhang & Lingfei Qian & Yan Wang & Xueqing Peng & Yang Ren & Ruoyu Xiang & Zhengyu Chen & Xiao Zhang & Yueru He & Weiguang Han & S, 2024. "Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications," Papers 2408.11878, arXiv.org, revised Jun 2025.
    5. Yangrui Yang & Pengfei Wang & Xuemei Liu & Wenyu Luo & Libo Yang, 2025. "A Multi-Agent and GraphRAG-Based Framework for Operation and Management Decision-Making in Hydraulic Projects," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 39(14), pages 7665-7687, November.
    6. Antoine Kazadi Kayisu & Miroslava Mikusova & Pitshou Ntambu Bokoro & Kyandoghere Kyamakya, 2024. "Exploring Smart Mobility Potential in Kinshasa (DR-Congo) as a Contribution to Mastering Traffic Congestion and Improving Road Safety: A Comprehensive Feasibility Assessment," Sustainability, MDPI, vol. 16(21), pages 1-53, October.
    7. Anna Nesvijevskaia, 2025. "Sustaining the practitioners’ tacit knowledge in the age of Artificial Intelligence: new challenges emerging through multiple case study [La pérennisation du savoir tacite des acteurs métier à l’ère de l’Intelligence Artificielle : émergence d’enj," Post-Print hal-05413113, HAL.
    8. Haoyi Zhang & Tianyi Zhu, 2025. "Neither Consent nor Property: A Policy Lab for Data Law," Papers 2510.26727, arXiv.org, revised Jan 2026.
    9. Gavin Shaddick & David Topping & Tristram C. Hales & Usama Kadri & Joanne Patterson & John Pickett & Ioan Petri & Stuart Taylor & Peiyuan Li & Ashish Sharma & Venkat Venkatkrishnan & Abhinav Wadhwa & , 2025. "Data Science and AI for Sustainable Futures: Opportunities and Challenges," Sustainability, MDPI, vol. 17(5), pages 1-20, February.
    10. Karmaker, Ashish Kumar & Sturmberg, Bjorn & Behrens, Sam & Pota, Hemanshu Roy, 2025. "Customer-centric meso-level planning for electric vehicle charger distribution," Applied Energy, Elsevier, vol. 389(C).
    11. Weixian Waylon Li & Hyeonjun Kim & Mihai Cucuringu & Tiejun Ma, 2025. "Can LLM-based Financial Investing Strategies Outperform the Market in Long Run?," Papers 2505.07078, arXiv.org, revised Feb 2026.
    12. Xiaozheng Du & Shijing Hu & Feng Zhou & Cheng Wang & Binh Minh Nguyen, 2025. "FI-NL2PY2SQL: Financial Industry NL2SQL Innovation Model Based on Python and Large Language Model," Future Internet, MDPI, vol. 17(1), pages 1-24, January.
    13. Joel R. Bock, 2024. "Generating long-horizon stock "buy" signals with a neural language model," Papers 2410.18988, arXiv.org.
    14. Han Ding & Yinheng Li & Junhao Wang & Hang Chen & Doudou Guo & Yunbai Zhang, 2024. "Large Language Model Agent in Financial Trading: A Survey," Papers 2408.06361, arXiv.org, revised Mar 2026.
    15. Tianyi Zhang & Mu Chen, 2025. "Personalized Chain-of-Thought Summarization of Financial News for Investor Decision Support," Papers 2511.05508, arXiv.org, revised Nov 2025.
    16. Jinzhong Xu & Junyi Gao & Xiaoming Liu & Guan Yang & Jie Liu & Yang Long & Ziyue Huang & Kai Yang, 2025. "CPEL: A Causality-Aware, Parameter-Efficient Learning Framework for Adaptation of Large Language Models with Case Studies in Geriatric Care and Beyond," Mathematics, MDPI, vol. 13(15), pages 1-24, July.
    17. Marcin Andrzejewski & Nina Dubicka & Jędrzej Podolak & Marek Kowal & Jakub Siłka, 2025. "Automated Test Generation Using Large Language Models," Data, MDPI, vol. 10(10), pages 1-20, September.


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2603.10807. See general information about how to correct material in RePEc.

If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.