
Market Efficiency in the Age of Big Data

Author

Listed:
  • Martin, Ian
  • Nagel, Stefan

Abstract

Modern investors face a high-dimensional prediction problem: thousands of observable variables are potentially relevant for forecasting. We reassess the conventional wisdom on market efficiency in light of this fact. In our model economy, which resembles a typical machine learning setting, N assets have cash flows that are a linear function of J firm characteristics, but with uncertain coefficients. Risk-neutral Bayesian investors impose shrinkage (ridge regression) or sparsity (Lasso) when they estimate the J coefficients of the model and use them to price assets. When J is comparable in size to N, returns appear cross-sectionally predictable using firm characteristics to an econometrician who analyzes data from the economy ex post. A factor zoo emerges even without p-hacking and data-mining. Standard in-sample tests of market efficiency reject the no-predictability null with high probability, despite the fact that investors optimally use the information available to them in real time. In contrast, out-of-sample tests retain their economic meaning.
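The mechanism in the abstract can be illustrated with a small simulation. The sketch below is a deliberately simplified setup, not the authors' exact model. It assumes that cash-flow growth is `X @ g` plus unit-variance noise, that the prior on the J coefficients is normal with variance `theta / J`, and that the price of each asset equals the investors' posterior-mean (ridge) forecast of next-period cash-flow growth, so that the "return" is the forecast error. The function name `simulate`, the parameter values, and the normal approximation to the chi-squared critical value are all illustrative choices. The Lasso (sparsity) case is not covered.

```python
import numpy as np

def simulate(N=200, J=100, theta=1.0, t=1, n_sims=300, seed=0):
    """Toy economy: N assets, J characteristics, ridge-learning investors.

    Returns (in-sample rejection rate, mean in-sample strategy return,
    mean out-of-sample strategy return). All modelling choices here are
    simplifying assumptions made for illustration only.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((N, J))      # firm characteristics, held fixed
    XtX = X.T @ X
    ols = np.linalg.solve(XtX, X.T)      # (X'X)^{-1} X'

    def ridge(t_obs):
        # Posterior-mean operator after t_obs periods of data:
        # (X'X + (J / (theta * t_obs)) I)^{-1} X'
        lam = J / (theta * t_obs)
        return np.linalg.solve(XtX + lam * np.eye(J), X.T)

    ridge_t, ridge_t1 = ridge(t), ridge(t + 1)
    # Approximate 5% critical value of chi2(J) via a normal approximation.
    crit = J + 1.645 * np.sqrt(2 * J)

    rejections, in_sample, out_of_sample = 0, [], []
    for _ in range(n_sims):
        g = rng.normal(0.0, np.sqrt(theta / J), J)       # true coefficients
        dy = X @ g + rng.standard_normal((t + 2, N))     # periods 1..t+2
        g_hat_t = ridge_t @ dy[:t].mean(axis=0)          # learned by period t
        g_hat_t1 = ridge_t1 @ dy[:t + 1].mean(axis=0)    # learned by period t+1
        r_next = dy[t] - X @ g_hat_t                     # return in period t+1
        r_after = dy[t + 1] - X @ g_hat_t1               # return in period t+2

        # Econometrician: regress period t+1 returns on characteristics and
        # test the no-predictability null with a Wald statistic.
        h = ols @ r_next
        wald = h @ XtX @ h
        rejections += int(wald > crit)

        # Strategy with weights X @ h: evaluated on the same returns used to
        # estimate h (in-sample) and on the following period (out-of-sample).
        in_sample.append((X @ h) @ r_next / N)
        out_of_sample.append((X @ h) @ r_after / N)

    return (rejections / n_sims,
            float(np.mean(in_sample)),
            float(np.mean(out_of_sample)))

if __name__ == "__main__":
    rate, ins, oos = simulate()
    print(f"in-sample rejection rate: {rate:.2f}")
    print(f"mean in-sample return:    {ins:.3f}")
    print(f"mean out-of-sample return: {oos:.3f}")
```

Under these assumptions the in-sample test should reject far more often than its nominal 5% level, even though the simulated investors use the Bayes-optimal forecast, while the out-of-sample strategy return should average close to zero. The numbers depend on the illustrative values of `N`, `J`, `theta`, and `t`.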

Suggested Citation

  • Martin, Ian & Nagel, Stefan, 2019. "Market Efficiency in the Age of Big Data," CEPR Discussion Papers 14235, C.E.P.R. Discussion Papers.
  • Handle: RePEc:cpr:ceprdp:14235

    Download full text from publisher

    File URL: https://cepr.org/publications/DP14235

    References listed on IDEAS

    1. Chinco, Alex & Neuhierl, Andreas & Weber, Michael, 2021. "Estimating the anomaly base rate," Journal of Financial Economics, Elsevier, vol. 140(1), pages 101-126.
    2. Juhani T Linnainmaa & Michael R Roberts, 2018. "The History of the Cross-Section of Stock Returns," The Review of Financial Studies, Society for Financial Studies, vol. 31(7), pages 2606-2649.
    3. Xavier Gabaix, 2014. "A Sparsity-Based Model of Bounded Rationality," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 129(4), pages 1661-1710.
    4. Guanhao Feng & Stefano Giglio & Dacheng Xiu, 2020. "Taming the Factor Zoo: A Test of New Factors," Journal of Finance, American Finance Association, vol. 75(3), pages 1327-1370, June.
    5. Kozak, Serhiy & Nagel, Stefan & Santosh, Shrihari, 2020. "Shrinking the cross-section," Journal of Financial Economics, Elsevier, vol. 135(2), pages 271-292.
    6. John H. Cochrane, 2011. "Presidential Address: Discount Rates," Journal of Finance, American Finance Association, vol. 66(4), pages 1047-1108, August.
    7. Peter Reinhard Hansen & Allan Timmermann, 2015. "Equivalence Between Out‐of‐Sample Forecast Comparisons and Wald Statistics," Econometrica, Econometric Society, vol. 83, pages 2485-2505, November.
    8. Anatolyev, Stanislav, 2012. "Inference in regression models with many regressors," Journal of Econometrics, Elsevier, vol. 170(2), pages 368-382.
    9. Lo, Andrew W & MacKinlay, A Craig, 1990. "Data-Snooping Biases in Tests of Financial Asset Pricing Models," The Review of Financial Studies, Society for Financial Studies, vol. 3(3), pages 431-467.
    10. Pierre Collin-Dufresne & Michael Johannes & Lars A. Lochstoer, 2016. "Parameter Learning in General Equilibrium: The Asset Pricing Implications," American Economic Review, American Economic Association, vol. 106(3), pages 664-698, March.
    11. Allan G. Timmermann, 1993. "How Learning in Financial Markets Generates Excess Volatility and Predictability in Stock Prices," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 108(4), pages 1135-1145.
    12. Emilio Calvano & Giacomo Calzolari & Vincenzo Denicolò & Sergio Pastorello, 2020. "Artificial Intelligence, Algorithmic Pricing, and Collusion," American Economic Review, American Economic Association, vol. 110(10), pages 3267-3297, October.
    13. Atsushi Inoue & Lutz Kilian, 2005. "In-Sample or Out-of-Sample Tests of Predictability: Which One Should We Use?," Econometric Reviews, Taylor & Francis Journals, vol. 23(4), pages 371-402.
    14. John H. Cochrane, 2008. "The Dog That Did Not Bark: A Defense of Return Predictability," The Review of Financial Studies, Society for Financial Studies, vol. 21(4), pages 1533-1575, July.
    15. John Y. Campbell & Samuel B. Thompson, 2008. "Predicting Excess Stock Returns Out of Sample: Can Anything Beat the Historical Average?," The Review of Financial Studies, Society for Financial Studies, vol. 21(4), pages 1509-1531, July.
    16. R. David Mclean & Jeffrey Pontiff, 2016. "Does Academic Research Destroy Stock Return Predictability?," Journal of Finance, American Finance Association, vol. 71(1), pages 5-32, February.
    17. Nabil I. Al-Najjar, 2009. "Decision Makers as Statisticians: Diversity, Ambiguity, and Learning," Econometrica, Econometric Society, vol. 77(5), pages 1371-1401, September.
    18. De Bondt, Werner F M & Thaler, Richard, 1985. "Does the Stock Market Overreact?," Journal of Finance, American Finance Association, vol. 40(3), pages 793-805, July.
    19. Timo Klein, 2018. "Autonomous Algorithmic Collusion: Q-Learning Under Sequential Pricing," Tinbergen Institute Discussion Papers 18-056/VII, Tinbergen Institute, revised 01 Nov 2020.
    20. Heston, Steven L. & Sadka, Ronnie, 2008. "Seasonality in the cross-section of stock returns," Journal of Financial Economics, Elsevier, vol. 87(2), pages 418-445, February.
    21. Guo Wenge & Romano Joseph, 2007. "A Generalized Sidak-Holm Procedure and Control of Generalized Error Rates under Independence," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 6(1), pages 1-35, January.
    22. Louis K. C. Chan & Jason Karceski & Josef Lakonishok, 2003. "The Level and Persistence of Growth Rates," Journal of Finance, American Finance Association, vol. 58(2), pages 643-684, April.
    23. Bai, Z. D. & Silverstein, Jack W. & Yin, Y. Q., 1988. "A note on the largest eigenvalue of a large dimensional sample covariance matrix," Journal of Multivariate Analysis, Elsevier, vol. 26(2), pages 166-168, August.
    24. Fama, Eugene F, 1970. "Efficient Capital Markets: A Review of Theory and Empirical Work," Journal of Finance, American Finance Association, vol. 25(2), pages 383-417, May.
    25. Sims, Christopher A., 2003. "Implications of rational inattention," Journal of Monetary Economics, Elsevier, vol. 50(3), pages 665-690, April.
    26. Novy-Marx, Robert, 2012. "Is momentum really momentum?," Journal of Financial Economics, Elsevier, vol. 103(3), pages 429-453.
    27. Jegadeesh, Narasimhan & Titman, Sheridan, 1993. "Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency," Journal of Finance, American Finance Association, vol. 48(1), pages 65-91, March.

    Citations

    Cited by:

    1. Jérôme Dugast & Thierry Foucault, 2020. "Equilibrium Data Mining and Data Abundance," Post-Print hal-02933315, HAL.
    2. Yabu, Takuya, 2023. "On Discrete Probability Distributions to Grasp the Number of Samples in a Population," OSF Preprints yv24f, Center for Open Science.
    3. Svetlana Bryzgalova & Jiantao Huang & Christian Julliard, 2023. "Bayesian Solutions for the Factor Zoo: We Just Ran Two Quadrillion Models," Journal of Finance, American Finance Association, vol. 78(1), pages 487-557, February.
    4. Melina & Sukono & Herlina Napitupulu & Norizan Mohamed, 2023. "A Conceptual Model of Investment-Risk Prediction in the Stock Market Using Extreme Value Theory with Machine Learning: A Semisystematic Literature Review," Risks, MDPI, vol. 11(3), pages 1-24, March.
    5. Kaplanski, Guy, 2023. "The race to exploit anomalies and the cost of slow trading," Journal of Financial Markets, Elsevier, vol. 62(C).
    6. Zhang, Junsheng & Peng, Zezhi & Zeng, Yamin & Yang, Haisheng, 2023. "Do big data mutual funds outperform?," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 88(C).
    7. Carter Davis, 2023. "The Elasticity of Quantitative Investment," Papers 2303.14533, arXiv.org, revised Sep 2024.
    8. Goodarzi, Milad & Meinerding, Christoph, 2023. "Asset allocation with recursive parameter updating and macroeconomic regime identifiers," Discussion Papers 06/2023, Deutsche Bundesbank.
    9. Wang, Jing & Yu, Huaying & Ren, Daowen & Zhang, Jocelyn, 2023. "Promoting mineral resources consumption efficiency: Evidence from technology of big data," Resources Policy, Elsevier, vol. 86(PB).
    10. Xi Dong & Yan Li & David E. Rapach & Guofu Zhou, 2022. "Anomalies and the Expected Market Return," Journal of Finance, American Finance Association, vol. 77(1), pages 639-681, February.
    11. Hoang, Daniel & Wiegratz, Kevin, 2022. "Machine learning methods in finance: Recent applications and prospects," Working Paper Series in Economics 158, Karlsruhe Institute of Technology (KIT), Department of Economics and Management.
    12. Bo Yan & Mengru Liang & Yinxin Zhao, 2024. "Market sentiment and price dynamics in weak markets: A comprehensive empirical analysis of the soybean meal option market," Journal of Futures Markets, John Wiley & Sons, Ltd., vol. 44(5), pages 744-766, May.
    13. Christopher G. Lamoureux & Huacheng Zhang, 2021. "An Empirical Assessment of Characteristics and Optimal Portfolios," Papers 2104.12975, arXiv.org, revised Feb 2024.
    14. Grammig, Joachim & Hanenberg, Constantin & Schlag, Christian & Sönksen, Jantje, 2020. "Diverging roads: Theory-based vs. machine learning-implied stock risk premia," University of Tübingen Working Papers in Business and Economics 130, University of Tuebingen, Faculty of Economics and Social Sciences, School of Business and Economics.
    15. James Yae & Yang Luo, 2023. "Robust monitoring machine: a machine learning solution for out-of-sample R²-hacking in return predictability monitoring," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 9(1), pages 1-28, December.
    16. Garg, Karan, 2021. "Machines and Markets : Assessing the Impact of Algorithmic Trading on Financial Market Efficiency," Warwick-Monash Economics Student Papers 11, Warwick Monash Economics Student Papers.
    17. Wu, Fei & Hu, Yan & Shen, Me, 2024. "The color of FinTech: FinTech and corporate green transformation in China," International Review of Financial Analysis, Elsevier, vol. 94(C).
    18. Sonya Georgieva, 2023. "Application of Artificial Intelligence and Machine Learning in the Conduct of Monetary Policy by Central Banks," Economic Studies journal, Bulgarian Academy of Sciences - Economic Research Institute, issue 8, pages 177-199.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Adam Zaremba & Jacob Koby Shemer, 2018. "Price-Based Investment Strategies," Springer Books, Springer, number 978-3-319-91530-2, January.
    2. Chinco, Alex & Neuhierl, Andreas & Weber, Michael, 2021. "Estimating the anomaly base rate," Journal of Financial Economics, Elsevier, vol. 140(1), pages 101-126.
    3. Cakici, Nusret & Zaremba, Adam & Bianchi, Robert J. & Pham, Nga, 2021. "False discoveries in the anomaly research: New insights from the Stock Exchange of Melbourne (1927–1987)," Pacific-Basin Finance Journal, Elsevier, vol. 70(C).
    4. Söhnke M. Bartram & Harald Lohre & Peter F. Pope & Ananthalakshmi Ranganathan, 2021. "Navigating the factor zoo around the world: an institutional investor perspective," Journal of Business Economics, Springer, vol. 91(5), pages 655-703, July.
    5. Cederburg, Scott & O’Doherty, Michael S. & Wang, Feifei & Yan, Xuemin (Sterling), 2020. "On the performance of volatility-managed portfolios," Journal of Financial Economics, Elsevier, vol. 138(1), pages 95-117.
    6. Andrew Y. Chen & Tom Zimmermann, 2022. "Publication Bias in Asset Pricing Research," Papers 2209.13623, arXiv.org, revised Sep 2023.
    7. Andrew Y. Chen & Tom Zimmermann, 2022. "Open Source Cross-Sectional Asset Pricing," Critical Finance Review, now publishers, vol. 11(2), pages 207-264, May.
    8. Doron Avramov & Guy Kaplanski & Avanidhar Subrahmanyam, 2022. "Postfundamentals Price Drift in Capital Markets: A Regression Regularization Perspective," Management Science, INFORMS, vol. 68(10), pages 7658-7681, October.
    9. Matti Keloharju & Juhani T. Linnainmaa & Peter Nyberg, 2019. "Long-Term Discount Rates Do Not Vary Across Firms," NBER Working Papers 25579, National Bureau of Economic Research, Inc.
    10. Kewei Hou & Chen Xue & Lu Zhang, 2017. "Replicating Anomalies," NBER Working Papers 23394, National Bureau of Economic Research, Inc.
    11. Joachim Freyberger & Andreas Neuhierl & Michael Weber, 2020. "Dissecting Characteristics Nonparametrically," The Review of Financial Studies, Society for Financial Studies, vol. 33(5), pages 2326-2377.
    12. Wolfgang Drobetz & Tizian Otto, 2021. "Empirical asset pricing via machine learning: evidence from the European stock market," Journal of Asset Management, Palgrave Macmillan, vol. 22(7), pages 507-538, December.
    13. Rapach, David & Zhou, Guofu, 2013. "Forecasting Stock Returns," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 328-383, Elsevier.
    14. Wang, Feifei & Yan, Xuemin Sterling, 2021. "Downside risk and the performance of volatility-managed portfolios," Journal of Banking & Finance, Elsevier, vol. 131(C).
    15. Tobias Wiest, 2023. "Momentum: what do we know 30 years after Jegadeesh and Titman’s seminal paper?," Financial Markets and Portfolio Management, Springer;Swiss Society for Financial Market Research, vol. 37(1), pages 95-114, March.
    16. Geertsema, Paul & Lu, Helen, 2020. "The correlation structure of anomaly strategies," Journal of Banking & Finance, Elsevier, vol. 119(C).
    17. Hollstein, Fabian & Nguyen, Duc Binh Benno & Prokopczuk, Marcel & Wese Simen, Chardin, 2019. "International tail risk and World Fear," Journal of International Money and Finance, Elsevier, vol. 93(C), pages 244-259.
    18. Boudoukh, Jacob & Israel, Ronen & Richardson, Matthew, 2022. "Biases in long-horizon predictive regressions," Journal of Financial Economics, Elsevier, vol. 145(3), pages 937-969.
    19. Cakici, Nusret & Fieberg, Christian & Metko, Daniel & Zaremba, Adam, 2023. "Machine learning goes global: Cross-sectional return predictability in international stock markets," Journal of Economic Dynamics and Control, Elsevier, vol. 155(C).
    20. Hoang, Khoa & Huang, Ronghong & Truong, Helen, 2023. "Resurrecting the market factor: A case of data mining across international markets," Pacific-Basin Finance Journal, Elsevier, vol. 82(C).

    More about this item

    Keywords

    Market efficiency; Big data; Machine learning;

    JEL classification:

    • G10 - Financial Economics - - General Financial Markets - - - General (includes Measurement and Data)
    • G12 - Financial Economics - - General Financial Markets - - - Asset Pricing; Trading Volume; Bond Interest Rates
    • G14 - Financial Economics - - General Financial Markets - - - Information and Market Efficiency; Event Studies; Insider Trading
    • C11 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Bayesian Analysis: General
    • C12 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Hypothesis Testing: General
    • C58 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Financial Econometrics
