
From Hypotheses to Factors: Constrained LLM Agents in Cryptocurrency Markets

Authors
  • Yikuan Huang
  • Zheqi Fan
  • Kaiqi Hu
  • Yifan Ye

Abstract

LLM agents are promising tools for empirical discovery, but their flexibility can also turn discovery into uncontrolled search. Using cryptocurrency factor discovery as a test case, we study how to deploy agents under a reproducible protocol. Our framework casts the task as sequential hypothesis search: an agent reads an append-only experiment trace, proposes falsifiable factor hypotheses, and maps them to executable recipes, while a deterministic engine enforces fixed data splits, selection gates, transaction costs, and portfolio tests. Candidate actions are restricted to a point-in-time factor DSL, making both successful and failed hypotheses auditable. A ridge-combined portfolio trained only on 2020-2022 data achieves a 44.55% annualized return and a Sharpe ratio of 1.55 in the 2024-2026 pure out-of-sample period, net of a 5 basis point one-way trading cost.
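The evaluation arithmetic named in the abstract (a ridge combination fitted on a fixed training window, a one-way trading cost, and an annualized Sharpe ratio) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the synthetic factor returns, the ridge target (a vector of ones, which gives regularized mean-variance style weights), the penalty value, the constant turnover, and the 365-period annualization are all assumptions made for illustration.

```python
import numpy as np


def ridge_weights(X, y, lam):
    """Closed-form ridge solution w = (X'X + lam * I)^{-1} X'y."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)


def net_returns(gross, turnover, cost_bps=5.0):
    """Charge a one-way cost (in basis points) on each period's turnover."""
    return np.asarray(gross) - np.asarray(turnover) * cost_bps / 1e4


def annualized_sharpe(returns, periods_per_year=365):
    """Sample mean over sample std, scaled by sqrt(periods per year).

    365 periods per year is an assumption (daily data, markets always open).
    """
    r = np.asarray(returns)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)


def demo(seed=0):
    """Fit on an early window only, then evaluate on a later, disjoint window."""
    rng = np.random.default_rng(seed)
    # Hypothetical daily returns of three factor portfolios (synthetic data).
    factors = rng.normal(0.001, 0.02, size=(1000, 3))
    train, test = factors[:600], factors[800:]  # rows 600-799 are left unused
    # Assumed target: regress a vector of ones on the training factor returns.
    w = ridge_weights(train, np.ones(len(train)), lam=1.0)
    w = w / np.abs(w).sum()  # normalize to unit gross exposure
    gross = test @ w
    turnover = np.full(len(test), 0.5)  # hypothetical constant daily turnover
    return annualized_sharpe(net_returns(gross, turnover))


if __name__ == "__main__":
    print(f"Out-of-sample net Sharpe (synthetic data): {demo():.2f}")
```

The weights are computed from the training rows alone and never refitted, which mirrors the fixed-split discipline the abstract describes. The numbers this sketch prints come from random data and say nothing about the paper's results.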

Suggested Citation

  • Yikuan Huang & Zheqi Fan & Kaiqi Hu & Yifan Ye, 2026. "From Hypotheses to Factors: Constrained LLM Agents in Cryptocurrency Markets," Papers 2604.26747, arXiv.org.
  • Handle: RePEc:arx:papers:2604.26747

    Download full text from publisher

    File URL: https://arxiv.org/pdf/2604.26747
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Andrew W. Lo & Harry Mamaysky & Jiang Wang, 2000. "Foundations of Technical Analysis: Computational Algorithms, Statistical Inference, and Empirical Implementation," Journal of Finance, American Finance Association, vol. 55(4), pages 1705-1765, August.
    2. Fan Fang & Carmine Ventre & Michail Basios & Leslie Kanthan & David Martinez-Rego & Fan Wu & Lingbo Li, 2022. "Cryptocurrency trading: a comprehensive survey," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-59, December.
    3. Shihao Gu & Bryan Kelly & Dacheng Xiu, 2020. "Empirical Asset Pricing via Machine Learning," Review of Finance, European Finance Association, vol. 33(5), pages 2223-2273.
    4. Jian Chen & Guohao Tang & Guofu Zhou & Wu Zhu, 2025. "ChatGPT and Deepseek: Can They Predict the Stock Market and Macroeconomy?," Papers 2502.10008, arXiv.org.
    5. Brogaard, Jonathan & Zareei, Abalfazl, 2023. "Machine Learning and the Stock Market," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 58(4), pages 1431-1472, June.
    6. Fama, Eugene F. & French, Kenneth R., 2015. "A five-factor asset pricing model," Journal of Financial Economics, Elsevier, vol. 116(1), pages 1-22.
    7. Doron Avramov & Si Cheng & Lior Metzker, 2023. "Machine Learning vs. Economic Restrictions: Evidence from Stock Return Predictability," Management Science, INFORMS, vol. 69(5), pages 2587-2619, May.
    8. Adam Baybutt, 2024. "Empirical Crypto Asset Pricing," Papers 2405.15716, arXiv.org.
    9. Yukun Liu & Aleh Tsyvinski & Xi Wu, 2022. "Common Risk Factors in Cryptocurrency," Journal of Finance, American Finance Association, vol. 77(2), pages 1133-1177, April.
    10. Grobys, Klaus & Ahmed, Shaker & Sapkota, Niranjan, 2020. "Technical trading rules in the cryptocurrency market," Finance Research Letters, Elsevier, vol. 32(C).
    11. Paul C. Tetlock, 2007. "Giving Content to Investor Sentiment: The Role of Media in the Stock Market," Journal of Finance, American Finance Association, vol. 62(3), pages 1139-1168, June.
    12. Fan Fang & Carmine Ventre & Michail Basios & Leslie Kanthan & Lingbo Li & David Martinez-Rego & Fan Wu, 2020. "Cryptocurrency Trading: A Comprehensive Survey," Papers 2003.11352, arXiv.org, revised Jan 2022.
    13. Shijie Wu & Ozan Irsoy & Steven Lu & Vadim Dabravolski & Mark Dredze & Sebastian Gehrmann & Prabhanjan Kambadur & David Rosenberg & Gideon Mann, 2023. "BloombergGPT: A Large Language Model for Finance," Papers 2303.17564, arXiv.org, revised Dec 2023.
    14. Shihao Gu & Bryan Kelly & Dacheng Xiu, 2020. "Empirical Asset Pricing via Machine Learning," The Review of Financial Studies, Society for Financial Studies, vol. 33(5), pages 2223-2273.
    15. Allen Yikuan Huang & Zheqi Fan, 2026. "Beyond Prompting: An Autonomous Framework for Systematic Factor Investing via Agentic AI," Papers 2603.14288, arXiv.org, revised Apr 2026.
    16. Songrun He & Linying Lv & Asaf Manela & Jimmy Wu, 2025. "Chronologically Consistent Large Language Models," Papers 2502.21206, arXiv.org, revised Jul 2025.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Allen Yikuan Huang & Zheqi Fan, 2026. "Beyond Prompting: An Autonomous Framework for Systematic Factor Investing via Agentic AI," Papers 2603.14288, arXiv.org, revised Apr 2026.
    2. Fieberg, Christian & Liedtke, Gerrit & Zaremba, Adam, 2024. "Cryptocurrency anomalies and economic constraints," International Review of Financial Analysis, Elsevier, vol. 94(C).
    3. Cakici, Nusret & Zaremba, Adam, 2025. "Accounting vs technical information: what matters more for stock return predictability?," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 104(C).
    4. Yichen Luo & Yebo Feng & Jiahua Xu & Paolo Tasca & Yang Liu, 2025. "LLM-Powered Multi-Agent System for Automated Crypto Portfolio Management," Papers 2501.00826, arXiv.org, revised Jun 2026.
    5. Yuhan Cheng & Heyang Zhou & Yanchu Liu, 2025. "Large Language Models and Futures Price Factors in China," Papers 2509.23609, arXiv.org.
    6. Saketh Aleti & Tim Bollerslev & Mathias Siggaard, 2025. "Intraday Market Return Predictability Culled from the Factor Zoo," Management Science, INFORMS, vol. 71(9), pages 7731-7751, September.
    7. Cong Wang, 2024. "Stock return prediction with multiple measures using neural network models," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 10(1), pages 1-34, December.
    8. Gang Kou & Yang Lu, 2025. "FinTech: a literature review of emerging financial technologies and applications," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 11(1), pages 1-34, December.
    9. Obaid, Khaled & Pukthuanthong, Kuntara, 2022. "A picture is worth a thousand words: Measuring investor sentiment by combining machine learning and photos from news," Journal of Financial Economics, Elsevier, vol. 144(1), pages 273-297.
    10. Paul Handro & Bogdan Dima, 2024. "Analyzing Financial Markets Efficiency: Insights from a Bibliometric and Content Review," Journal of Financial Studies, Institute of Financial Studies, vol. 16(9), pages 119-175, May.
    11. Wu, Hongxu & Wang, Qiao & Li, Jianping & Deng, Zhibin, 2025. "Enhancing stock return prediction in the Chinese market: A GAN-based approach," Research in International Business and Finance, Elsevier, vol. 75(C).
    12. Maher Hamid, 2026. "Implementing domain-specific LLMs for strategic investment decisions: a retrospective case study comparing AI and human expertise," Digital Finance, Springer, vol. 8(1), pages 1-134, March.
    13. Linying Lv, 2025. "Do Sell-side Analyst Reports Have Investment Value?," Papers 2502.20489, arXiv.org, revised Aug 2025.
    14. Wolfgang Breuer & Andreas Knetsch, 2023. "Recent trends in the digitalization of finance and accounting," Journal of Business Economics, Springer, vol. 93(9), pages 1451-1461, November.
    15. Doron Avramov & Guy Kaplanski & Avanidhar Subrahmanyam, 2022. "Postfundamentals Price Drift in Capital Markets: A Regression Regularization Perspective," Management Science, INFORMS, vol. 68(10), pages 7658-7681, October.
    16. Witter, Johannes, 2025. "Predicting stock returns with machine learning: Global versus sector models," Junior Management Science (JUMS), Junior Management Science e. V., vol. 10(3), pages 561-581.
    17. Lin William Cong & Guanhao Feng & Jingyu He & Xin He, 2022. "Growing the Efficient Frontier on Panel Trees," NBER Working Papers 30805, National Bureau of Economic Research, Inc.
    18. Jinghai He & Cheng Hua & Chunyang Zhou & Zeyu Zheng, 2025. "Reinforcement-Learning Portfolio Allocation with Dynamic Embedding of Market Information," Papers 2501.17992, arXiv.org.
    19. Jinbo Cai & Wenze Li & Wenjie Wang, 2025. "Electricity Market Predictability: Virtues of Machine Learning and Links to the Macroeconomy," Papers 2507.07477, arXiv.org.
    20. Sak, Halis & Huang, Tao & Chng, Michael T., 2024. "Exploring the factor zoo with a machine-learning portfolio," International Review of Financial Analysis, Elsevier, vol. 96(PA).
