
Formalized Data Snooping Based on Generalized Error Rates

Author

Listed:
  • Joseph P. Romano
  • Azeem M. Shaikh
  • Michael Wolf

Abstract

It is common in econometric applications that several hypothesis tests are carried out at the same time. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. The classical approach is to control the familywise error rate (FWE), that is, the probability of one or more false rejections. But when the number of hypotheses under consideration is large, control of the FWE can become too demanding. As a result, the number of false hypotheses rejected may be small or even zero. This suggests replacing control of the FWE by a more liberal measure. To this end, we review a number of proposals from the statistical literature. We briefly discuss how these procedures apply to the general problem of model selection. A simulation study and two empirical applications illustrate the methods.
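
The procedures reviewed in the paper are bootstrap-based stepwise methods. As a rough, hypothetical illustration of why relaxing FWE control matters when many hypotheses are tested, the sketch below (in Python, and not the authors' method) contrasts classical FWE control via the Holm step-down procedure with the more liberal false discovery rate control of Benjamini and Hochberg on simulated one-sided p-values; the simulation settings (1,000 tests, 100 false null hypotheses, a mean shift of 3) are invented for illustration.

    # Illustrative sketch only -- not the bootstrap stepwise procedures of the paper.
    # Contrast classical FWE control (Holm step-down) with the more liberal
    # false discovery rate control (Benjamini-Hochberg) on simulated p-values.
    import numpy as np
    from scipy.stats import norm

    def holm(pvals, alpha=0.05):
        """Holm step-down procedure: controls the familywise error rate (FWE)."""
        m = len(pvals)
        order = np.argsort(pvals)
        reject = np.zeros(m, dtype=bool)
        for rank, idx in enumerate(order):
            if pvals[idx] <= alpha / (m - rank):
                reject[idx] = True
            else:
                break  # step down: stop at the first non-rejection
        return reject

    def benjamini_hochberg(pvals, alpha=0.05):
        """Benjamini-Hochberg step-up procedure: controls the false discovery rate (FDR)."""
        m = len(pvals)
        order = np.argsort(pvals)
        sorted_p = pvals[order]
        below = np.nonzero(sorted_p <= alpha * np.arange(1, m + 1) / m)[0]
        reject = np.zeros(m, dtype=bool)
        if below.size > 0:
            reject[order[:below[-1] + 1]] = True  # reject the k smallest p-values
        return reject

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        m, m_false = 1000, 100             # hypothetical: 1000 tests, 100 false nulls
        z = rng.normal(size=m)
        z[:m_false] += 3.0                 # shift the test statistics of the false nulls
        pvals = 1.0 - norm.cdf(z)          # one-sided p-values
        print("Holm (FWE) rejections:", int(holm(pvals).sum()))
        print("BH (FDR) rejections:  ", int(benjamini_hochberg(pvals).sum()))

With many hypotheses, the Holm thresholds alpha/m, alpha/(m-1), ... quickly become very stringent, so the FDR-based rule typically rejects far more of the false null hypotheses; this is the trade-off between strict FWE control and more liberal error measures that the abstract describes.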

Suggested Citation

  • Joseph P. Romano & Azeem M. Shaikh & Michael Wolf, 2005. "Formalized Data Snooping Based on Generalized Error Rates," IEW - Working Papers 259, Institute for Empirical Research in Economics - University of Zurich.
  • Handle: RePEc:zur:iewwpx:259

    Download full text from publisher

    File URL: https://www.econ.uzh.ch/apps/workingpapers/wp/iewwp259.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Joseph P. Romano & Michael Wolf, 2005. "Stepwise Multiple Testing as Formalized Data Snooping," Econometrica, Econometric Society, vol. 73(4), pages 1237-1282, July.
    2. Romano, Joseph P. & Wolf, Michael, 2001. "Improved nonparametric confidence intervals in time series regressions," DES - Working Papers. Statistics and Econometrics. WS ws010201, Universidad Carlos III de Madrid. Departamento de Estadística.
    3. Krolzig, Hans-Martin & Hendry, David F., 2001. "Computer automation of general-to-specific model selection procedures," Journal of Economic Dynamics and Control, Elsevier, vol. 25(6-7), pages 831-866, June.
    4. Hansen, Peter Reinhard, 2005. "A Test for Superior Predictive Ability," Journal of Business & Economic Statistics, American Statistical Association, vol. 23, pages 365-380, October.
    5. Ryan Sullivan & Allan Timmermann & Halbert White, 1999. "Data‐Snooping, Technical Trading Rule Performance, and the Bootstrap," Journal of Finance, American Finance Association, vol. 54(5), pages 1647-1691, October.
    6. Sullivan, Ryan & Timmermann, Allan & White, Halbert, 2001. "Dangers of data mining: The case of calendar effects in stock returns," Journal of Econometrics, Elsevier, vol. 105(1), pages 249-286, November.
    7. Joseph P. Romano & Michael Wolf, "undated". "Control of Generalized Error Rates in Multiple Testing," IEW - Working Papers 245, Institute for Empirical Research in Economics - University of Zurich.
    8. Joseph P. Romano & Michael Wolf, 2005. "Exact and Approximate Stepdown Methods for Multiple Hypothesis Testing," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 94-108, March.
    9. Timmermann, Allan, 2006. "Forecast Combinations," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 1, chapter 4, pages 135-196, Elsevier.
    10. Peter Reinhard Hansen & Asger Lunde & James M. Nason, 2005. "Model confidence sets for forecasting models," FRB Atlanta Working Paper 2005-07, Federal Reserve Bank of Atlanta.
    11. Andrews, Donald W K & Monahan, J Christopher, 1992. "An Improved Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimator," Econometrica, Econometric Society, vol. 60(4), pages 953-966, July.
    12. Xiaotong Shen & Hsin-Cheng Huang & Jimmy Ye, 2004. "Inference After Model Selection," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 751-762, January.
    13. Peter Reinhard Hansen & Asger Lunde & James M. Nason, 2003. "Choosing the Best Volatility Models: The Model Confidence Set Approach," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 65(s1), pages 839-861, December.
    14. Kabaila, Paul & Leeb, Hannes, 2006. "On the Large-Sample Minimal Coverage Probability of Confidence Intervals After Model Selection," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 619-629, June.
    15. Abramovich, Felix & Benjamini, Yoav, 1996. "Adaptive thresholding of wavelet coefficients," Computational Statistics & Data Analysis, Elsevier, vol. 22(4), pages 351-361, August.
    16. Hidetoshi Shimodaira, 1998. "An Application of Multiple Comparison Techniques to Model Selection," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 50(1), pages 1-13, March.
    17. Joseph P. Romano & Azeem M. Shaikh & Michael Wolf, 2010. "multiple testing," The New Palgrave Dictionary of Economics, Palgrave Macmillan.
    18. G. Elliott & C. Granger & A. Timmermann (ed.), 2006. "Handbook of Economic Forecasting," Handbook of Economic Forecasting, Elsevier, edition 1, volume 1, number 1.
    19. Halbert White, 2000. "A Reality Check for Data Snooping," Econometrica, Econometric Society, vol. 68(5), pages 1097-1126, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Davide De Gaetano, 2016. "Forecast Combinations For Realized Volatility In Presence Of Structural Breaks," Departmental Working Papers of Economics - University 'Roma Tre' 0208, Department of Economics - University Roma Tre.
    2. Dichtl, Hubert & Drobetz, Wolfgang & Neuhierl, Andreas & Wendt, Viktoria-Sophie, 2021. "Data snooping in equity premium prediction," International Journal of Forecasting, Elsevier, vol. 37(1), pages 72-94.
    3. Jin, Sainan & Corradi, Valentina & Swanson, Norman R., 2017. "Robust Forecast Comparison," Econometric Theory, Cambridge University Press, vol. 33(6), pages 1306-1351, December.
    4. Kuang, P. & Schröder, M. & Wang, Q., 2014. "Illusory profitability of technical analysis in emerging foreign exchange markets," International Journal of Forecasting, Elsevier, vol. 30(2), pages 192-205.
    5. Hsu, Po-Hsuan & Han, Qiheng & Wu, Wensheng & Cao, Zhiguang, 2018. "Asset allocation strategies, data snooping, and the 1 / N rule," Journal of Banking & Finance, Elsevier, vol. 97(C), pages 257-269.
    6. Oleg Rytchkov & Xun Zhong, 2020. "Information Aggregation and P-Hacking," Management Science, INFORMS, vol. 66(4), pages 1605-1626, April.
    7. Stephen A. Gorman & Frank J. Fabozzi, 2021. "The ABC’s of the alternative risk premium: academic roots," Journal of Asset Management, Palgrave Macmillan, vol. 22(6), pages 405-436, October.
    8. Joseph Romano & Azeem Shaikh & Michael Wolf, 2008. "Control of the false discovery rate under dependence using the bootstrap and subsampling," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 17(3), pages 417-442, November.
    9. Raffaella Giacomini & Barbara Rossi, 2013. "Forecasting in macroeconomics," Chapters, in: Nigar Hashimzade & Michael A. Thornton (ed.), Handbook of Research Methods and Applications in Empirical Macroeconomics, chapter 17, pages 381-408, Edward Elgar Publishing.
    10. Joseph P. Romano & Azeem M. Shaikh & Michael Wolf, 2010. "Hypothesis Testing in Econometrics," Annual Review of Economics, Annual Reviews, vol. 2(1), pages 75-104, September.
    11. Adriano Koshiyama & Nick Firoozye, 2019. "Avoiding Backtesting Overfitting by Covariance-Penalties: an empirical investigation of the ordinary and total least squares cases," Papers 1905.05023, arXiv.org.
    12. Yang, Junmin & Cao, Zhiguang & Han, Qiheng & Wang, Qiyu, 2019. "Tactical asset allocation on technical trading rules and data snooping," Pacific-Basin Finance Journal, Elsevier, vol. 57(C).
    13. Hassanniakalager, Arman & Sermpinis, Georgios & Stasinakis, Charalampos, 2021. "Trading the foreign exchange market with technical analysis and Bayesian Statistics," Journal of Empirical Finance, Elsevier, vol. 63(C), pages 230-251.
    14. Rapach, David & Zhou, Guofu, 2013. "Forecasting Stock Returns," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 328-383, Elsevier.
    15. Hubert Dichtl & Wolfgang Drobetz & Viktoria‐Sophie Wendt, 2021. "How to build a factor portfolio: Does the allocation strategy matter?," European Financial Management, European Financial Management Association, vol. 27(1), pages 20-58, January.
    16. Jin, Xiaoye, 2022. "Performance of intraday technical trading in China’s gold market," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 76(C).
    17. Bajgrowicz, Pierre & Scaillet, Olivier, 2012. "Technical trading revisited: False discoveries, persistence tests, and transaction costs," Journal of Financial Economics, Elsevier, vol. 106(3), pages 473-491.
    18. Todd E. Clark & Michael W. McCracken, 2010. "Averaging forecasts from VARs with uncertain instabilities," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 25(1), pages 5-29, January.
    19. Psaradellis, Ioannis & Laws, Jason & Pantelous, Athanasios A. & Sermpinis, Georgios, 2023. "Technical analysis, spread trading, and data snooping control," International Journal of Forecasting, Elsevier, vol. 39(1), pages 178-191.

    More about this item

    Keywords

    Data snooping; false discovery proportion; false discovery rate; generalized familywise error rate; model selection; multiple testing; stepwise methods;

    JEL classification:

    • C12 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Hypothesis Testing: General
    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C52 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Evaluation, Validation, and Selection

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:zur:iewwpx:259. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Severin Oswald (email available below).

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.