
LiveTradeBench: Seeking Real-World Alpha with Large Language Models

Authors
  • Haofei Yu
  • Fenghai Li
  • Jiaxuan You

Abstract

Large language models (LLMs) achieve strong performance across benchmarks, from knowledge quizzes and math reasoning to web-agent tasks, but these tests occur in static settings that lack real dynamics and uncertainty. Consequently, they evaluate isolated reasoning or problem solving rather than decision-making under uncertainty. To address this, we introduce LiveTradeBench, a live trading environment for evaluating LLM agents in realistic, evolving markets. LiveTradeBench follows three design principles: (i) live data streaming of market prices and news, which eliminates dependence on offline backtesting, prevents information leakage, and captures real-time uncertainty; (ii) a portfolio-management abstraction that extends control from single-asset actions to multi-asset allocation, integrating risk management and cross-asset reasoning; and (iii) multi-market evaluation across structurally distinct environments (U.S. stocks and Polymarket prediction markets) that differ in volatility, liquidity, and information flow. At each step, an agent observes prices, news, and its portfolio, then outputs percentage allocations that balance risk and return. Using LiveTradeBench, we run 50-day live evaluations of 21 LLMs across model families. Results show that (1) high LMArena scores do not imply superior trading outcomes; (2) models display distinct portfolio styles that reflect risk appetite and reasoning dynamics; and (3) some LLMs effectively leverage live signals to adapt their decisions. These findings expose a gap between static evaluation and real-world competence, motivating benchmarks that test sequential decision making and consistency under live uncertainty.
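The per-step decision loop described in the abstract (observe prices, news, and the current portfolio; output percentage allocations) can be sketched as a minimal interface. This is an illustrative reconstruction only: the names `Observation`, `allocate`, and the equal-weight toy policy are assumptions, not the benchmark's actual API, and a real LLM agent would reason over the price and news signals rather than equal-weighting.

```python
# Hypothetical sketch of a LiveTradeBench-style decision step.
# All names here are illustrative, not the benchmark's real interface.
from dataclasses import dataclass


@dataclass
class Observation:
    prices: dict[str, float]     # latest price per asset
    news: list[str]              # recent headlines
    portfolio: dict[str, float]  # current weight per asset, incl. "CASH"


def allocate(obs: Observation) -> dict[str, float]:
    """Toy baseline policy: equal-weight the observed assets, keep 20% cash.

    An LLM agent would replace this with reasoning over prices and news;
    the contract is only that the output weights sum to 1.
    """
    assets = list(obs.prices)
    risky = 0.8 / len(assets) if assets else 0.0
    weights = {a: risky for a in assets}
    weights["CASH"] = 1.0 - sum(weights.values())
    return weights


obs = Observation(
    prices={"AAPL": 231.5, "NVDA": 182.1},
    news=["Fed holds rates steady"],
    portfolio={"AAPL": 0.5, "NVDA": 0.3, "CASH": 0.2},
)
w = allocate(obs)
# Allocations are fractions of the portfolio and must sum to 1.
assert abs(sum(w.values()) - 1.0) < 1e-9
```

Framing the action space as full-portfolio percentage allocations, rather than per-asset buy/sell orders, is what lets the benchmark score risk management and cross-asset reasoning in a single output.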

Suggested Citation

  • Haofei Yu & Fenghai Li & Jiaxuan You, 2025. "LiveTradeBench: Seeking Real-World Alpha with Large Language Models," Papers 2511.03628, arXiv.org.
  • Handle: RePEc:arx:papers:2511.03628

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2511.03628
    File Function: Latest version
    Download Restriction: no
    ---><---

    References listed on IDEAS

    1. Guojun Xiong & Zhiyang Deng & Keyi Wang & Yupeng Cao & Haohang Li & Yangyang Yu & Xueqing Peng & Mingquan Lin & Kaleb E Smith & Xiao-Yang Liu & Jimin Huang & Sophia Ananiadou & Qianqian Xie, 2025. "FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading," Papers 2502.11433, arXiv.org, revised Feb 2025.
    2. Tianping Zhang & Yuanqi Li & Yifei Jin & Jian Li, 2020. "AutoAlpha: an Efficient Hierarchical Evolutionary Algorithm for Mining Alpha Factors in Quantitative Investment," Papers 2002.08245, arXiv.org, revised Apr 2020.
    3. Yang Li & Yangyang Yu & Haohang Li & Zhi Chen & Khaldoun Khashanah, 2023. "TradingGPT: Multi-Agent System with Layered Memory and Distinct Characters for Enhanced Financial Trading Performance," Papers 2309.03736, arXiv.org.
    4. Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
    5. Yijia Xiao & Edward Sun & Tong Chen & Fang Wu & Di Luo & Wei Wang, 2025. "Trading-R1: Financial Trading with LLM Reasoning via Reinforcement Learning," Papers 2509.11420, arXiv.org.
    6. Yunan Ye & Hengzhi Pei & Boxin Wang & Pin-Yu Chen & Yada Zhu & Jun Xiao & Bo Li, 2020. "Reinforcement-Learning based Portfolio Management with Augmented Asset Movement Prediction States," Papers 2002.05780, arXiv.org.
    7. Ruoyu Sun & Angelos Stefanidis & Zhengyong Jiang & Jionglong Su, 2024. "Combining Transformer based Deep Reinforcement Learning with Black-Litterman Model for Portfolio Optimization," Papers 2402.16609, arXiv.org.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
    2. Hui Niu & Siyuan Li & Jian Li, 2022. "MetaTrader: An Reinforcement Learning Approach Integrating Diverse Policies for Portfolio Optimization," Papers 2210.01774, arXiv.org.
    3. Shuo Sun & Molei Qin & Xinrun Wang & Bo An, 2023. "PRUDEX-Compass: Towards Systematic Evaluation of Reinforcement Learning in Financial Markets," Papers 2302.00586, arXiv.org, revised Mar 2023.
    4. Mohammed-Khalil Ghali & Cecil Pang & Oscar Molina & Carlos Gershenson-Garcia & Daehan Won, 2025. "Forecasting Commodity Price Shocks Using Temporal and Semantic Fusion of Prices Signals and Agentic Generative AI Extracted Economic News," Papers 2508.06497, arXiv.org.
    5. Eric Benhamou & David Saltiel & Sandrine Ungari & Abhishek Mukhopadhyay & Jamal Atif, 2020. "AAMDRL: Augmented Asset Management with Deep Reinforcement Learning," Papers 2010.08497, arXiv.org.
    6. Wang, Jianzhou & Lv, Mengzheng & Wang, Shuai & Gao, Jialu & Zhao, Yang & Wang, Qiangqiang, 2024. "Can multi-period auto-portfolio systems improve returns? Evidence from Chinese and U.S. stock markets," International Review of Financial Analysis, Elsevier, vol. 95(PB).
    7. Frensi Zejnullahu & Maurice Moser & Joerg Osterrieder, 2022. "Applications of Reinforcement Learning in Finance -- Trading with a Double Deep Q-Network," Papers 2206.14267, arXiv.org.
    8. Yijia Xiao & Edward Sun & Tong Chen & Fang Wu & Di Luo & Wei Wang, 2025. "Trading-R1: Financial Trading with LLM Reasoning via Reinforcement Learning," Papers 2509.11420, arXiv.org.
    9. Liwei Deng & Tianfu Wang & Yan Zhao & Kai Zheng, 2024. "MILLION: A General Multi-Objective Framework with Controllable Risk for Portfolio Management," Papers 2412.03038, arXiv.org.
10. Kruthof, Garvin & Müller, Sebastian, 2025. "Can deep reinforcement learning beat 1/N?," Finance Research Letters, Elsevier, vol. 75(C).
    11. Xiao-Yang Liu & Jingyang Rui & Jiechao Gao & Liuqing Yang & Hongyang Yang & Zhaoran Wang & Christina Dan Wang & Jian Guo, 2021. "FinRL-Meta: A Universe of Near-Real Market Environments for Data-Driven Deep Reinforcement Learning in Quantitative Finance," Papers 2112.06753, arXiv.org, revised Mar 2022.
    12. Shuyang Wang & Diego Klabjan, 2023. "An Ensemble Method of Deep Reinforcement Learning for Automated Cryptocurrency Trading," Papers 2309.00626, arXiv.org.
    13. Feng Xu & Yan Yin & Xinyu Zhang & Tianyuan Liu & Shengyi Jiang & Zongzhang Zhang, 2024. "$\text{Alpha}^2$: Discovering Logical Formulaic Alphas using Deep Reinforcement Learning," Papers 2406.16505, arXiv.org, revised Jun 2024.
    14. Quechen Yang, 2024. "Blending Ensemble for Classification with Genetic-algorithm generated Alpha factors and Sentiments (GAS)," Papers 2411.03035, arXiv.org.
    15. Kassiani Papasotiriou & Srijan Sood & Shayleen Reynolds & Tucker Balch, 2024. "AI in Investment Analysis: LLMs for Equity Stock Ratings," Papers 2411.00856, arXiv.org.
    16. Tao Ren & Ruihan Zhou & Jinyang Jiang & Jiafeng Liang & Qinghao Wang & Yijie Peng, 2024. "RiskMiner: Discovering Formulaic Alphas via Risk Seeking Monte Carlo Tree Search," Papers 2402.07080, arXiv.org, revised Feb 2024.
    17. Weizhe Ren & Yichen Qin & Yang Li, 2024. "Alpha Mining and Enhancing via Warm Start Genetic Programming for Quantitative Investment," Papers 2412.00896, arXiv.org.
    18. Wentao Zhang & Lingxuan Zhao & Haochong Xia & Shuo Sun & Jiaze Sun & Molei Qin & Xinyi Li & Yuqing Zhao & Yilei Zhao & Xinyu Cai & Longtao Zheng & Xinrun Wang & Bo An, 2024. "A Multimodal Foundation Agent for Financial Trading: Tool-Augmented, Diversified, and Generalist," Papers 2402.18485, arXiv.org, revised Jun 2024.
    19. Qiu, Zhiyuan & Mou, Yilin & Li, Yutong, 2025. "The impact of rural upbringing on household risky financial asset allocation: An analysis based on CHFS," International Review of Economics & Finance, Elsevier, vol. 97(C).
    20. Guojun Xiong & Zhiyang Deng & Keyi Wang & Yupeng Cao & Haohang Li & Yangyang Yu & Xueqing Peng & Mingquan Lin & Kaleb E Smith & Xiao-Yang Liu & Jimin Huang & Sophia Ananiadou & Qianqian Xie, 2025. "FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading," Papers 2502.11433, arXiv.org, revised Feb 2025.


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2511.03628. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.