Pairs trading via unsupervised learning

Author

Listed:
  • Han, Chulwoo
  • He, Zhaodong
  • Toh, Alenson Jun Wei

Abstract

This paper develops a pairs trading strategy via unsupervised learning. Unlike conventional pairs trading strategies, which identify pairs based on return time series, we identify pairs by incorporating firm characteristics as well as price information. Firm characteristics are shown to provide important information for pair identification and to significantly improve the performance of the pairs trading strategy. Applied to the US stock market from January 1980 to December 2020, the long-short portfolio constructed via agglomerative clustering earns a statistically significant annualized mean return of 24.8% and a Sharpe ratio of 2.69. The strategy remains profitable after accounting for transaction costs and removing stocks below the 20% NYSE size quantile. A host of robustness tests confirm that the results are not driven by data snooping.
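
To make the idea in the abstract concrete, the following is a minimal sketch of clustering stocks on a mix of price and firm-characteristic features and then forming a long-short position inside each cluster. It is not the authors' implementation. The feature set, the average-linkage rule, the distance threshold of 1.5, the toy data, and the reversal rule (buy the cluster's past loser, short its past winner) are all assumptions made for illustration, and the helper names `standardize`, `agglomerative`, and `long_short` are hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each feature column so no single feature dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def agglomerative(points, threshold):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold` (an assumed tuning parameter).
    Returns a list of clusters, each a list of row indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        return mean(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > threshold:
            break
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return clusters


def long_short(clusters, past_returns):
    """Assumed reversal rule: in each cluster with at least two stocks,
    go long the lowest past return and short the highest."""
    longs, shorts = [], []
    for c in clusters:
        if len(c) < 2:
            continue
        ranked = sorted(c, key=lambda k: past_returns[k])
        longs.append(ranked[0])
        shorts.append(ranked[-1])
    return longs, shorts


# Toy data (made up): one row per stock, columns are an illustrative
# past-return feature, a size proxy, and a book-to-market proxy.
features = [
    [0.10, 1.0, 0.5],
    [0.12, 1.1, 0.6],
    [0.08, 0.9, 0.4],
    [-0.05, 5.0, 2.0],
    [-0.04, 5.2, 2.1],
    [-0.06, 4.8, 1.9],
]
last_month = [0.03, -0.02, 0.01, 0.05, -0.04, 0.00]  # made-up prior-month returns

clusters = agglomerative(standardize(features), threshold=1.5)
longs, shorts = long_short(clusters, last_month)
print("clusters:", clusters)
print("long:", longs, "short:", shorts)
```

Stopping on a distance threshold rather than a fixed number of clusters lets stocks without a close match remain unpaired, which is one plausible design choice among several; a production version would more likely rely on a library implementation of hierarchical clustering.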

Suggested Citation

  • Han, Chulwoo & He, Zhaodong & Toh, Alenson Jun Wei, 2023. "Pairs trading via unsupervised learning," European Journal of Operational Research, Elsevier, vol. 307(2), pages 929-947.
  • Handle: RePEc:eee:ejores:v:307:y:2023:i:2:p:929-947
    DOI: 10.1016/j.ejor.2022.09.041

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S037722172200769X
    Download Restriction: Full text for ScienceDirect subscribers only


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Flori, Andrea & Regoli, Daniele, 2021. "Revealing Pairs-trading opportunities with long short-term memory networks," European Journal of Operational Research, Elsevier, vol. 295(2), pages 772-791.
    2. Matthew Clegg & Christopher Krauss, 2018. "Pairs trading with partial cointegration," Quantitative Finance, Taylor & Francis Journals, vol. 18(1), pages 121-138, January.
    3. Rubesam, Alexandre, 2022. "Machine learning portfolios with equal risk contributions: Evidence from the Brazilian market," Emerging Markets Review, Elsevier, vol. 51(PB).
    4. Fischer, Thomas & Krauss, Christopher, 2018. "Deep learning with long short-term memory networks for financial market predictions," European Journal of Operational Research, Elsevier, vol. 270(2), pages 654-669.
    5. Marianna Brunetti & Roberta De Luca, 2023. "Pairs trading in the index options market," Eurasian Economic Review, Springer;Eurasia Business and Economics Society, vol. 13(1), pages 145-173, March.
    6. Krauss, Christopher, 2015. "Statistical arbitrage pairs trading strategies: Review and outlook," FAU Discussion Papers in Economics 09/2015, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    7. Clegg, Matthew & Krauss, Christopher, 2016. "Pairs trading with partial cointegration," FAU Discussion Papers in Economics 05/2016, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    8. Erdinc Akyildirim & Ahmet Goncu & Alper Hekimoglu & Duc Khuong Nguyen & Ahmet Sensoy, 2023. "Statistical arbitrage: factor investing approach," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 45(4), pages 1295-1331, December.
    9. Johannes Stübinger & Sylvia Endres, 2018. "Pairs trading with a mean-reverting jump–diffusion model on high-frequency data," Quantitative Finance, Taylor & Francis Journals, vol. 18(10), pages 1735-1751, October.
    10. Knoll, Julian & Stübinger, Johannes & Grottke, Michael, 2017. "Exploiting social media with higher-order Factorization Machines: Statistical arbitrage on high-frequency data of the S&P 500," FAU Discussion Papers in Economics 13/2017, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    11. Stübinger, Johannes & Endres, Sylvia, 2017. "Pairs trading with a mean-reverting jump-diffusion model on high-frequency data," FAU Discussion Papers in Economics 10/2017, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    12. Kasper Johansson & Thomas Schmelzer & Stephen Boyd, 2024. "Finding Moving-Band Statistical Arbitrages via Convex-Concave Optimization," Papers 2402.08108, arXiv.org.
    13. Law, K.F. & Li, W.K. & Yu, Philip L.H., 2018. "A single-stage approach for cointegration-based pairs trading," Finance Research Letters, Elsevier, vol. 26(C), pages 177-184.
    14. Marianna Brunetti & Roberta De Luca, 2021. "Pairs Trading In The Index Options Market," CEIS Research Paper 512, Tor Vergata University, CEIS, revised 02 Sep 2021.
    15. Fabian Waldow & Matthias Schnaubelt & Christopher Krauss & Thomas Günter Fischer, 2021. "Machine Learning in Futures Markets," JRFM, MDPI, vol. 14(3), pages 1-14, March.
    16. Krauss, Christopher & Do, Xuan Anh & Huck, Nicolas, 2017. "Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500," European Journal of Operational Research, Elsevier, vol. 259(2), pages 689-702.
    17. Rama Cont & Mihai Cucuringu & Chao Zhang, 2021. "Cross-Impact of Order Flow Imbalance in Equity Markets," Papers 2112.13213, arXiv.org, revised Jun 2023.
    18. Kolesnikova, A. & Yang, Y. & Lessmann, S. & Ma, T. & Sung, M.-C. & Johnson, J.E.V., 2019. "Can Deep Learning Predict Risky Retail Investors? A Case Study in Financial Risk Behavior Forecasting," IRTG 1792 Discussion Papers 2019-023, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    19. Stübinger, Johannes, 2018. "Statistical arbitrage with optimal causal paths on high-frequency data of the S&P 500," FAU Discussion Papers in Economics 01/2018, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    20. Schnaubelt, Matthias & Fischer, Thomas G. & Krauss, Christopher, 2020. "Separating the signal from the noise – Financial machine learning for Twitter," Journal of Economic Dynamics and Control, Elsevier, vol. 114(C).
