IDEAS home Printed from https://ideas.repec.org/a/eee/csdana/v156y2021ics0167947320302218.html
   My bibliography  Save this article

A nonparametric empirical Bayes approach to large-scale multivariate regression

Author

Listed:
  • Wang, Yihe
  • Zhao, Sihai Dave

Abstract

Multivariate regression has many applications, ranging from time series prediction to genomics. Borrowing information across the outcomes can improve prediction error, even when outcomes are statistically independent. Many methods exist to implement this strategy, for example the multiresponse lasso, but choosing the optimal method for a given dataset is difficult. These issues are addressed by establishing a connection between multivariate linear regression and compound decision problems. A nonparametric empirical Bayes procedure that can learn the optimal regression method from the data itself is proposed. Furthermore, the proposed procedure is free of tuning parameters and performs well in simulations and in a multiple stock price prediction problem.

Suggested Citation

  • Wang, Yihe & Zhao, Sihai Dave, 2021. "A nonparametric empirical Bayes approach to large-scale multivariate regression," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
  • Handle: RePEc:eee:csdana:v:156:y:2021:i:c:s0167947320302218
    DOI: 10.1016/j.csda.2020.107130
    as

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947320302218
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2020.107130?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    as
    1. Feng, Long & Dicker, Lee H., 2018. "Approximate nonparametric maximum likelihood for mixture models: A convex optimization approach to fitting arbitrary multivariate mixing distributions," Computational Statistics & Data Analysis, Elsevier, vol. 122(C), pages 80-91.
    2. Jeffrey M Wooldridge, 2010. "Econometric Analysis of Cross Section and Panel Data," MIT Press Books, The MIT Press, edition 2, volume 1, number 0262232588, December.
    3. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    4. Yanming Li & Bin Nan & Ji Zhu, 2015. "Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure," Biometrics, The International Biometric Society, vol. 71(2), pages 354-363, June.
    5. Roger Koenker & Ivan Mizera, 2014. "Convex Optimization, Shape Constraints, Compound Decisions, and Empirical Bayes Rules," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(506), pages 674-685, June.
    6. Leo Breiman & Jerome H. Friedman, 1997. "Predicting Multivariate Responses in Multiple Linear Regression," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 59(1), pages 3-54.
    7. Kelly, Bryan & Pruitt, Seth, 2015. "The three-pass regression filter: A new approach to forecasting using many predictors," Journal of Econometrics, Elsevier, vol. 186(2), pages 294-316.
    8. Makridakis, Spyros & Spiliotis, Evangelos & Assimakopoulos, Vassilios, 2018. "The M4 Competition: Results, findings, conclusion and way forward," International Journal of Forecasting, Elsevier, vol. 34(4), pages 802-808.
    9. Tingni Sun & Cun-Hui Zhang, 2012. "Scaled sparse linear regression," Biometrika, Biometrika Trust, vol. 99(4), pages 879-898.
    10. Koenker, Roger & Mizera, Ivan, 2014. "Convex Optimization in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 60(i05).
    11. Jianqing Fan & Jinchi Lv, 2008. "Sure independence screening for ultrahigh dimensional feature space," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 849-911, November.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bai, Ray & Ghosh, Malay, 2018. "High-dimensional multivariate posterior consistency under global–local shrinkage priors," Journal of Multivariate Analysis, Elsevier, vol. 167(C), pages 157-170.
    2. Peter Bühlmann & Jacopo Mandozzi, 2014. "High-dimensional variable screening and bias in subsequent inference, with an empirical comparison," Computational Statistics, Springer, vol. 29(3), pages 407-430, June.
    3. Xin Wang & Lingchen Kong & Liqun Wang, 2022. "Estimation of Error Variance in Regularized Regression Models via Adaptive Lasso," Mathematics, MDPI, vol. 10(11), pages 1-19, June.
    4. Adel Javanmard & Jason D. Lee, 2020. "A flexible framework for hypothesis testing in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(3), pages 685-718, July.
    5. Jun Li & Serguei Netessine & Sergei Koulayev, 2018. "Price to Compete … with Many: How to Identify Price Competition in High-Dimensional Space," Management Science, INFORMS, vol. 64(9), pages 4118-4136, September.
    6. Hewamalage, Hansika & Bergmeir, Christoph & Bandara, Kasun, 2021. "Recurrent Neural Networks for Time Series Forecasting: Current status and future directions," International Journal of Forecasting, Elsevier, vol. 37(1), pages 388-427.
    7. Ahmed Ismaïl & Hartikainen Anna-Liisa & Järvelin Marjo-Riitta & Richardson Sylvia, 2011. "False Discovery Rate Estimation for Stability Selection: Application to Genome-Wide Association Studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-20, November.
    8. Meira, Erick & Cyrino Oliveira, Fernando Luiz & de Menezes, Lilian M., 2022. "Forecasting natural gas consumption using Bagging and modified regularization techniques," Energy Economics, Elsevier, vol. 106(C).
    9. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    10. Li, Peili & Jiao, Yuling & Lu, Xiliang & Kang, Lican, 2022. "A data-driven line search rule for support recovery in high-dimensional data analysis," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    11. Jianqing Fan & Yang Feng & Jiancheng Jiang & Xin Tong, 2016. "Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(513), pages 275-287, March.
    12. Victor Chernozhukov & Christian Hansen & Yuan Liao, 2015. "A lava attack on the recovery of sums of dense and sparse signals," CeMMAP working papers CWP56/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    13. Seokhyun Chung & Raed Al Kontar & Zhenke Wu, 2022. "Weakly Supervised Multi-output Regression via Correlated Gaussian Processes," INFORMS Joural on Data Science, INFORMS, vol. 1(2), pages 115-137, October.
    14. Jingxuan Luo & Lili Yue & Gaorong Li, 2023. "Overview of High-Dimensional Measurement Error Regression Models," Mathematics, MDPI, vol. 11(14), pages 1-22, July.
    15. Zeng, Yaohui & Yang, Tianbao & Breheny, Patrick, 2021. "Hybrid safe–strong rules for efficient optimization in lasso-type problems," Computational Statistics & Data Analysis, Elsevier, vol. 153(C).
    16. Kimia Keshanian & Daniel Zantedeschi & Kaushik Dutta, 2022. "Features Selection as a Nash-Bargaining Solution: Applications in Online Advertising and Information Systems," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2485-2501, September.
    17. Alexandre Belloni & Victor Chernozhukov & Lie Wang, 2013. "Pivotal estimation via square-root lasso in nonparametric regression," CeMMAP working papers CWP62/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    18. Malene Kallestrup-Lamb & Anders Bredahl Kock & Johannes Tang Kristensen, 2016. "Lassoing the Determinants of Retirement," Econometric Reviews, Taylor & Francis Journals, vol. 35(8-10), pages 1522-1561, December.
    19. Timothy B. Armstrong & Michal Kolesár & Mikkel Plagborg‐Møller, 2022. "Robust Empirical Bayes Confidence Intervals," Econometrica, Econometric Society, vol. 90(6), pages 2567-2602, November.
    20. Knippenberg, Erwin & Jensen, Nathaniel & Constas, Mark, 2019. "Quantifying household resilience with high frequency data: Temporal dynamics and methodological options," World Development, Elsevier, vol. 121(C), pages 1-15.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:csdana:v:156:y:2021:i:c:s0167947320302218. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/locate/csda .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.