Printed from https://ideas.repec.org/p/hal/journl/halshs-00917797.html

Variable selection and forecasting via automated methods for linear models: LASSO/adaLASSO and Autometrics

Author

Listed:
  • Camila Epprecht

    (CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique, PUC-Rio - Pontifícia Universidade Católica do Rio de Janeiro [Brasil] = Pontifical Catholic University of Rio de Janeiro [Brazil] = Université catholique pontificale de Rio de Janeiro [Brésil])

  • Dominique Guegan

    (CES - Centre d'économie de la Sorbonne - UP1 - Université Paris 1 Panthéon-Sorbonne - CNRS - Centre National de la Recherche Scientifique)

  • Álvaro Veiga

    (PUC-Rio - Pontifícia Universidade Católica do Rio de Janeiro [Brasil] = Pontifical Catholic University of Rio de Janeiro [Brazil] = Université catholique pontificale de Rio de Janeiro [Brésil])

  • Joel Correa da Rosa

    (MSSM - Icahn School of Medicine at Mount Sinai [New York])

Abstract

In this paper we compare two approaches to model selection for linear regression models: the classical approach, Autometrics (automatic general-to-specific selection), and the statistical learning approach, LASSO (ℓ1-norm regularization) and adaLASSO (adaptive LASSO). In a simulation experiment, considering a simple setup with orthogonal candidate variables and independent data, we compare the performance of the methods in terms of predictive power (out-of-sample forecast), selection of the correct model (variable selection) and parameter estimation. The case where the number of candidate variables exceeds the number of observations is considered as well. Finally, in an application using genomic data from a high-throughput experiment, we compare the power of the methods to predict epidermal thickness in psoriatic patients.
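As a rough illustration of the two statistical learning methods named in the abstract, the following Python sketch fits LASSO and adaptive LASSO by coordinate descent on simulated data. It is not the authors' code or experimental design: the sample size, the coefficient values, the penalty level `lam`, the exponent `gamma`, and the helper names (`soft_threshold`, `lasso_cd`, `adaptive_lasso`, `simulate`) are all assumptions made for this example. The candidate variables are drawn as independent Gaussian columns, so they are only approximately orthogonal. Autometrics is not reproduced here.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z towards zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, weights=None, n_iter=500, tol=1e-10):
    """Minimise (1/2n)||y - Xb||^2 + lam * sum_j w_j |b_j| by coordinate descent.

    With weights=None every w_j equals 1, which gives the plain LASSO.
    """
    n, p = X.shape
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # x_j'x_j / n for each column
    resid = y.astype(float).copy()      # residual for the starting value beta = 0
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j added back).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * w[j]) / col_sq[j]
            if new != beta[j]:
                resid -= X[:, j] * (new - beta[j])
                max_change = max(max_change, abs(new - beta[j]))
                beta[j] = new
        if max_change < tol:
            break
    return beta


def adaptive_lasso(X, y, lam, gamma=1.0, eps=1e-8):
    """Two-step adaptive LASSO: a first-stage LASSO fit sets the penalty weights.

    Using the LASSO as first stage, the same lam in both stages, and the small
    eps guard are choices made for this sketch only.
    """
    init = lasso_cd(X, y, lam)
    weights = 1.0 / (np.abs(init) + eps) ** gamma
    return lasso_cd(X, y, lam, weights=weights)


def simulate(seed=0, n=100, p=20):
    """Linear model with 3 relevant and p - 3 irrelevant candidate variables."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))     # independent columns, roughly orthogonal
    beta = np.zeros(p)
    beta[:3] = [2.0, -1.5, 1.0]         # hypothetical true coefficients
    y = X @ beta + rng.standard_normal(n)
    return X, y, beta


X, y, beta_true = simulate(seed=0)
b_lasso = lasso_cd(X, y, lam=0.3)
b_ada = adaptive_lasso(X, y, lam=0.3)
print("true support:    ", np.flatnonzero(beta_true))
print("LASSO support:   ", np.flatnonzero(b_lasso))
print("adaLASSO support:", np.flatnonzero(b_ada))
```

In this sketch a variable dropped by the first-stage LASSO receives a very large weight in the second stage, so the adaptive LASSO can only keep a subset of the variables the LASSO selected. In practice the penalty level would be chosen by a data-driven rule such as an information criterion or cross-validation rather than fixed by hand.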

Suggested Citation

  • Camila Epprecht & Dominique Guegan & Álvaro Veiga & Joel Correa da Rosa, 2017. "Variable selection and forecasting via automated methods for linear models: LASSO/adaLASSO and Autometrics," Post-Print halshs-00917797, HAL.
  • Handle: RePEc:hal:journl:halshs-00917797
    Note: View the original document on HAL open archive server: https://shs.hal.science/halshs-00917797v2

    Download full text from publisher

    File URL: https://shs.hal.science/halshs-00917797v2/document
    Download Restriction: no

    References listed on IDEAS

    1. Krolzig, Hans-Martin & Hendry, David F., 2001. "Computer automation of general-to-specific model selection procedures," Journal of Economic Dynamics and Control, Elsevier, vol. 25(6-7), pages 831-866, June.
    2. Teodosio Perez‐Amaral & Giampiero M. Gallo & Halbert White, 2003. "A Flexible Tool for Model Building: the Relevant Transformation of the Inputs Network Approach (RETINA)," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 65(s1), pages 821-838, December.
    3. Robert Tibshirani, 2011. "Regression shrinkage and selection via the lasso: a retrospective," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 73(3), pages 273-282, June.
    4. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    5. Hansheng Wang & Guodong Li & Chih‐Ling Tsai, 2007. "Regression coefficient and autoregressive order shrinkage and selection via the lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 69(1), pages 63-78, February.
    6. Julia Campos & David F. Hendry & Hans‐Martin Krolzig, 2003. "Consistent Model Selection by an Automatic Gets Approach," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 65(s1), pages 803-819, December.
    7. Suyan Tian & Mayte Suárez-Fariñas, 2013. "Multi-TGDR: A Regularization Method for Multi-Class Classification in Microarray Experiments," PLOS ONE, Public Library of Science, vol. 8(11), pages 1-12, November.
    8. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    9. David F. Hendry & Bent Nielsen, 2007. "Preface to Econometric Modeling: A Likelihood Approach," Introductory Chapters, in: Econometric Modeling: A Likelihood Approach, Princeton University Press.
    10. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    11. Zhang, Yiyun & Li, Runze & Tsai, Chih-Ling, 2010. "Regularization Parameter Selections via Generalized Information Criterion," Journal of the American Statistical Association, American Statistical Association, vol. 105(489), pages 312-323.
    12. David F. Hendry & Bent Nielsen, 2007. "The Bernoulli model, from Econometric Modeling: A Likelihood Approach," Introductory Chapters, in: Econometric Modeling: A Likelihood Approach, Princeton University Press.
    13. Suyan Tian & James G Krueger & Katherine Li & Ali Jabbari & Carrie Brodmerkel & Michelle A Lowes & Mayte Suárez-Fariñas, 2012. "Meta-Analysis Derived (MAD) Transcriptome of Psoriasis Defines the “Core” Pathogenesis of Disease," PLOS ONE, Public Library of Science, vol. 7(9), pages 1-15, September.
    14. Harvey, David & Leybourne, Stephen & Newbold, Paul, 1997. "Testing the equality of prediction mean squared errors," International Journal of Forecasting, Elsevier, vol. 13(2), pages 281-291, June.
    15. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    16. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    17. White, Halbert, 2006. "Approximate Nonlinear Forecasting Methods," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 1, chapter 9, pages 459-512, Elsevier.

    Citations

    Cited by:

    1. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Camila Epprecht & Dominique Guegan & Álvaro Veiga & Joel Correa da Rosa, 2017. "Variable selection and forecasting via automated methods for linear models: LASSO/adaLASSO and Autometrics," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-00917797, HAL.
    2. Camila Epprecht & Dominique Guegan & Álvaro Veiga, 2013. "Comparing variable selection techniques for linear regression: LASSO and Autometrics," Documents de travail du Centre d'Economie de la Sorbonne 13080, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    3. Peter Martey Addo & Dominique Guegan & Bertrand Hassani, 2018. "Credit Risk Analysis Using Machine and Deep Learning Models," Risks, MDPI, vol. 6(2), pages 1-20, April.
    4. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2023. "Machine learning advances for time series forecasting," Journal of Economic Surveys, Wiley Blackwell, vol. 37(1), pages 76-111, February.
    5. Tanin Sirimongkolkasem & Reza Drikvandi, 2019. "On Regularisation Methods for Analysis of High Dimensional Data," Annals of Data Science, Springer, vol. 6(4), pages 737-763, December.
    6. van Erp, Sara & Oberski, Daniel L. & Mulder, Joris, 2018. "Shrinkage priors for Bayesian penalized regression," OSF Preprints cg8fq, Center for Open Science.
    7. Sierra A. Bainter & Thomas G. McCauley & Mahmoud M. Fahmy & Zachary T. Goodman & Lauren B. Kupis & J. Sunil Rao, 2023. "Comparing Bayesian Variable Selection to Lasso Approaches for Applications in Psychology," Psychometrika, Springer;The Psychometric Society, vol. 88(3), pages 1032-1055, September.
    8. Zhixuan Fu & Chirag R. Parikh & Bingqing Zhou, 2017. "Penalized variable selection in competing risks regression," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(3), pages 353-376, July.
    9. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    10. Peter Bühlmann & Jacopo Mandozzi, 2014. "High-dimensional variable screening and bias in subsequent inference, with an empirical comparison," Computational Statistics, Springer, vol. 29(3), pages 407-430, June.
    11. Capanu, Marinela & Giurcanu, Mihai & Begg, Colin B. & Gönen, Mithat, 2023. "Subsampling based variable selection for generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 184(C).
    12. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    13. Zeyu Bian & Erica E. M. Moodie & Susan M. Shortreed & Sahir Bhatnagar, 2023. "Variable selection in regression‐based estimation of dynamic treatment regimes," Biometrics, The International Biometric Society, vol. 79(2), pages 988-999, June.
    14. Jingxuan Luo & Lili Yue & Gaorong Li, 2023. "Overview of High-Dimensional Measurement Error Regression Models," Mathematics, MDPI, vol. 11(14), pages 1-22, July.
    15. Zhang, Tonglin, 2024. "Variables selection using L0 penalty," Computational Statistics & Data Analysis, Elsevier, vol. 190(C).
    16. Zanhua Yin, 2020. "Variable selection for sparse logistic regression," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(7), pages 821-836, October.
    17. Xing, Li-Min & Zhang, Yue-Jun, 2022. "Forecasting crude oil prices with shrinkage methods: Can nonconvex penalty and Huber loss help?," Energy Economics, Elsevier, vol. 110(C).
    18. Florian Ziel, 2015. "Iteratively reweighted adaptive lasso for conditional heteroscedastic time series with applications to AR-ARCH type processes," Papers 1502.06557, arXiv.org, revised Dec 2015.
    19. Marcelo C. Medeiros & Eduardo F. Mendes, 2015. "l1-Regularization of High-Dimensional Time-Series Models with Flexible Innovations," Textos para discussão 636, Department of Economics PUC-Rio (Brazil).
    20. Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.

    More about this item

    Keywords

    model selection; general-to-specific; adaptive LASSO; sparse models; Monte Carlo simulation; genetic data;

