
Machine Learning Panel Data Regressions with Heavy-tailed Dependent Data: Theory and Application

Authors

  • Andrii Babii
  • Ryan T. Ball
  • Eric Ghysels
  • Jonas Striaukas

Abstract

The paper introduces structured machine learning regressions for heavy-tailed dependent panel data potentially sampled at different frequencies. We focus on the sparse-group LASSO regularization. This type of regularization can take advantage of mixed-frequency time series panel data structures and improve the quality of the estimates. We obtain oracle inequalities for the pooled and fixed effects sparse-group LASSO panel data estimators, recognizing that financial and economic data can have fat tails. To that end, we leverage a new Fuk-Nagaev concentration inequality for panel data consisting of heavy-tailed $\tau$-mixing processes.
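
As an illustration of the regularization named in the abstract, the following is a minimal sketch of a pooled sparse-group LASSO fit by proximal gradient descent. It is not the authors' implementation. It assumes the common form of the penalty, `alpha * ||b||_1 + (1 - alpha) * sum_G ||b_G||_2`, and a least-squares loss scaled by `1/(2n)`; the function names, the `groups` index layout, and the solver choice are all hypothetical and chosen only for this example.

```python
import numpy as np


def sgl_penalty(beta, groups, alpha):
    """Sparse-group LASSO penalty: alpha * l1 norm + (1 - alpha) * sum of group l2 norms."""
    l1 = np.abs(beta).sum()
    group_l2 = sum(np.linalg.norm(beta[g]) for g in groups)
    return alpha * l1 + (1.0 - alpha) * group_l2


def sgl_prox(beta, groups, alpha, step):
    """Proximal operator of step * penalty for non-overlapping groups.

    Element-wise soft-thresholding followed by group-wise shrinkage.
    """
    z = np.sign(beta) * np.maximum(np.abs(beta) - step * alpha, 0.0)
    out = np.zeros_like(z)
    for g in groups:
        norm = np.linalg.norm(z[g])
        if norm > 0:
            out[g] = max(0.0, 1.0 - step * (1.0 - alpha) / norm) * z[g]
    return out


def pooled_sgl(X, y, groups, lam, alpha, n_iter=500):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * penalty(b) by proximal gradient.

    X stacks all entity-period observations row-wise (pooled regression).
    """
    n, p = X.shape
    # Lipschitz constant of the gradient of the smooth least-squares part.
    lipschitz = np.linalg.norm(X, 2) ** 2 / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = sgl_prox(beta - grad / lipschitz, groups, alpha, lam / lipschitz)
    return beta
```

In a mixed-frequency setting, each entry of `groups` would typically collect the coefficients on the lags of one high-frequency covariate, so that `alpha` close to 0 selects whole covariates and `alpha` close to 1 behaves like the plain LASSO. How the groups are formed in practice is an assumption here and should be checked against the paper.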

Suggested Citation

  • Andrii Babii & Ryan T. Ball & Eric Ghysels & Jonas Striaukas, 2020. "Machine Learning Panel Data Regressions with Heavy-tailed Dependent Data: Theory and Application," Papers 2008.03600, arXiv.org, revised Nov 2021.
  • Handle: RePEc:arx:papers:2008.03600

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2008.03600
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Liangjun Su & Zhentao Shi & Peter C. B. Phillips, 2016. "Identifying Latent Structures in Panel Data," Econometrica, Econometric Society, vol. 84, pages 2215-2264, November.
    2. Lu, Xun & Su, Liangjun, 2016. "Shrinkage estimation of dynamic panel data models with interactive fixed effects," Journal of Econometrics, Elsevier, vol. 190(1), pages 148-175.
    3. Ghysels, Eric & Santa-Clara, Pedro & Valkanov, Rossen, 2006. "Predicting volatility: getting the most out of return data sampled at different frequencies," Journal of Econometrics, Elsevier, vol. 131(1-2), pages 59-95.
    4. Alexandre Belloni & Victor Chernozhukov & Christian Hansen & Damian Kozbur, 2016. "Inference in High-Dimensional Panel Models With an Application to Gun Control," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 590-605, October.
    5. Michael W. McCracken & Serena Ng, 2016. "FRED-MD: A Monthly Database for Macroeconomic Research," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 574-589, October.
    6. Victor Chernozhukov & Jerry Hausman & Whitney K. Newey, 2019. "Demand analysis with many prices," CeMMAP working papers CWP59/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    7. Elena Andreou & Eric Ghysels & Andros Kourtellos, 2013. "Should Macroeconomic Forecasters Use Daily Financial Data and How?," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 31(2), pages 240-251, April.
    8. Diebold, Francis X & Mariano, Roberto S, 2002. "Comparing Predictive Accuracy," Journal of Business & Economic Statistics, American Statistical Association, vol. 20(1), pages 134-144, January.
    9. Kock, Anders Bredahl, 2013. "Oracle Efficient Variable Selection In Random And Fixed Effects Panel Data Models," Econometric Theory, Cambridge University Press, vol. 29(1), pages 115-152, February.
    10. Andrii Babii, 2022. "High-Dimensional Mixed-Frequency IV Regression," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 40(4), pages 1470-1483, October.
    11. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    12. Koenker, Roger, 2004. "Quantile regression for longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 91(1), pages 74-89, October.
    13. Chiang, Harold D. & Rodrigue, Joel & Sasaki, Yuya, 2023. "Post-Selection Inference In Three-Dimensional Panel Data," Econometric Theory, Cambridge University Press, vol. 39(3), pages 623-658, June.
    14. Harding, Matthew & Lamarche, Carlos, 2019. "A panel quantile approach to attrition bias in Big Data: Evidence from a randomized experiment," Journal of Econometrics, Elsevier, vol. 211(1), pages 61-82.
    15. Dedecker, Jérôme & Doukhan, Paul, 2003. "A new covariance inequality and applications," Stochastic Processes and their Applications, Elsevier, vol. 106(1), pages 63-80, July.
    16. Lamarche, Carlos, 2010. "Robust penalized quantile regression estimation for panel data," Journal of Econometrics, Elsevier, vol. 157(2), pages 396-408, August.
    17. Ryan T. Ball & Eric Ghysels, 2018. "Automated Earnings Forecasts: Beat Analysts or Combine and Conquer?," Management Science, INFORMS, vol. 64(10), pages 4936-4952, October.
    18. Andrii Babii & Eric Ghysels & Jonas Striaukas, 2019. "High-Dimensional Granger Causality Tests with an Application to VIX and News," Papers 1912.06307, arXiv.org, revised Feb 2021.
    19. Arellano, Manuel, 2003. "Panel Data Econometrics," OUP Catalogue, Oxford University Press, number 9780199245291.

    Citations


    Cited by:

    1. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2023. "Machine learning advances for time series forecasting," Journal of Economic Surveys, Wiley Blackwell, vol. 37(1), pages 76-111, February.
    2. Knut Are Aastveit & Tuva Marie Fastbø & Eleonora Granziera & Kenneth Sæterhagen Paulsen & Kjersti Næss Torstensen, 2020. "Nowcasting Norwegian household consumption with debit card transaction data," Working Paper 2020/17, Norges Bank.
    3. Hafner, Christian & Wang, Linqi, 2020. "Dynamic portfolio selection with sector-specific regularization," LIDAM Discussion Papers ISBA 2020032, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    4. Hans Genberg & Özer Karagedikli, 2021. "Machine Learning and Central Banks: Ready for Prime Time?," Working Papers wp43, South East Asian Central Banks (SEACEN) Research and Training Centre.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Andrii Babii & Eric Ghysels & Jonas Striaukas, 2022. "Machine Learning Time Series Regressions With an Application to Nowcasting," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 40(3), pages 1094-1106, June.
    2. Lamarche, Carlos & Parker, Thomas, 2023. "Wild bootstrap inference for penalized quantile regression for longitudinal data," Journal of Econometrics, Elsevier, vol. 235(2), pages 1799-1826.
    3. Andrii Babii & Eric Ghysels & Jonas Striaukas, 2019. "High-Dimensional Granger Causality Tests with an Application to VIX and News," Papers 1912.06307, arXiv.org, revised Feb 2021.
    4. Benedikt Maas, 2020. "Short‐term forecasting of the US unemployment rate," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 39(3), pages 394-411, April.
    5. Chiang, Harold D. & Rodrigue, Joel & Sasaki, Yuya, 2023. "Post-Selection Inference In Three-Dimensional Panel Data," Econometric Theory, Cambridge University Press, vol. 39(3), pages 623-658, June.
    6. João C. Claudio & Katja Heinisch & Oliver Holtemöller, 2020. "Nowcasting East German GDP growth: a MIDAS approach," Empirical Economics, Springer, vol. 58(1), pages 29-54, January.
    7. Andrii Babii & Ryan T. Ball & Eric Ghysels & Jonas Striaukas, 2023. "Panel Data Nowcasting: The Case of Price-Earnings Ratios," Papers 2307.02673, arXiv.org.
    8. Frantisek Cech & Jozef Barunik, 2017. "Measurement of Common Risk Factors: A Panel Quantile Regression Model for Returns," Working Papers IES 2017/20, Charles University Prague, Faculty of Social Sciences, Institute of Economic Studies, revised Sep 2017.
    9. Degiannakis, Stavros & Filis, George, 2018. "Forecasting oil prices: High-frequency financial data are indeed useful," Energy Economics, Elsevier, vol. 76(C), pages 388-402.
    10. Gonçalves, Sílvia & McCracken, Michael W. & Perron, Benoit, 2017. "Tests of equal accuracy for nested models with estimated factors," Journal of Econometrics, Elsevier, vol. 198(2), pages 231-252.
    11. Jiaying Gu & Stanislav Volgushev, 2018. "Panel Data Quantile Regression with Grouped Fixed Effects," Papers 1801.05041, arXiv.org, revised Aug 2018.
    12. Stavros Degiannakis, 2023. "The D-model for GDP nowcasting," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-33, December.
    13. Knut Are Aastveit & Claudia Foroni & Francesco Ravazzolo, 2017. "Density Forecasts With Midas Models," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 32(4), pages 783-801, June.
    14. Roccazzella, Francesco & Gambetti, Paolo & Vrins, Frédéric, 2022. "Optimal and robust combination of forecasts via constrained optimization and shrinkage," International Journal of Forecasting, Elsevier, vol. 38(1), pages 97-116.
    15. Sarun Kamolthip, 2021. "Macroeconomic Forecasting with LSTM and Mixed Frequency Time Series Data," PIER Discussion Papers 165, Puey Ungphakorn Institute for Economic Research.
    16. Zhang, Yue-Jun & Wang, Jin-Li, 2019. "Do high-frequency stock market data help forecast crude oil prices? Evidence from the MIDAS models," Energy Economics, Elsevier, vol. 78(C), pages 192-201.
    17. Knotek, Edward S. & Zaman, Saeed, 2019. "Financial nowcasts and their usefulness in macroeconomic forecasting," International Journal of Forecasting, Elsevier, vol. 35(4), pages 1708-1724.
    18. Zhang, Yingying & Wang, Huixia Judy & Zhu, Zhongyi, 2019. "Quantile-regression-based clustering for panel data," Journal of Econometrics, Elsevier, vol. 213(1), pages 54-67.
    19. Andrii Babii & Jean-Pierre Florens, 2017. "Is completeness necessary? Estimation in nonidentified linear models," Papers 1709.03473, arXiv.org, revised Nov 2021.
    20. Giovanni Ballarin & Petros Dellaportas & Lyudmila Grigoryeva & Marcel Hirt & Sophie van Huellen & Juan-Pablo Ortega, 2022. "Reservoir Computing for Macroeconomic Forecasting with Mixed Frequency Data," Papers 2211.00363, arXiv.org, revised Jan 2024.
