
Generalization error minimization: a new approach to model evaluation and selection with an application to penalized regression

Author

Listed:
  • Ning Xu
  • Jian Hong
  • Timothy C. G. Fisher

Abstract

We study model evaluation and model selection from the perspective of generalization ability (GA): the ability of a model to predict outcomes in new samples from the same population. We believe that GA is one way to formally address concerns about the external validity of a model. The GA of a model estimated on a sample can be measured by its empirical out-of-sample errors, called the generalization errors (GE). We derive upper bounds for the GE, which depend on the sample size, model complexity, and the distribution of the loss function. The upper bounds can be used to evaluate the GA of a model, ex ante. We propose using generalization error minimization (GEM) as a framework for model selection. Using GEM, we are able to unify a large class of penalized regression estimators, including the lasso, ridge, and bridge, under the same set of assumptions. We establish finite-sample and asymptotic properties (including $\mathcal{L}_2$-consistency) of the GEM estimator for both the $n \geqslant p$ and the $n < p$ cases.
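
For intuition, GE upper bounds of the kind described above typically resemble the classical VC bound from statistical learning theory (a standard result, not the paper's specific bound): for a bounded loss, with probability at least $1-\eta$,

$$\mathrm{GE}(f) \;\leqslant\; \widehat{\mathrm{err}}_n(f) + \sqrt{\frac{h\left(\ln(2n/h)+1\right) + \ln(4/\eta)}{n}},$$

where $\widehat{\mathrm{err}}_n(f)$ is the in-sample error and $h$ is the model's VC dimension, so the bound grows with model complexity and shrinks with sample size. The GEM selection rule itself can be sketched as follows: fit each candidate penalized regression on a training sample and keep the candidate with the smallest empirical out-of-sample error on a held-out sample from the same population. The synthetic data, penalty levels, and scikit-learn estimators below are illustrative assumptions, not the authors' implementation.

    # Generalization error minimization (GEM), sketched: choose the model
    # with the smallest empirical out-of-sample (generalization) error.
    # Data, penalty levels, and estimators are illustrative only.
    import numpy as np
    from sklearn.linear_model import Lasso, Ridge
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n, p = 200, 50
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:5] = 1.0                                # sparse true coefficients
    y = X @ beta + rng.standard_normal(n)

    # Hold out half the sample to estimate the generalization error (GE).
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5,
                                              random_state=0)

    candidates = {"lasso(0.1)": Lasso(alpha=0.1), "ridge(1.0)": Ridge(alpha=1.0)}
    ge = {name: mean_squared_error(y_te, m.fit(X_tr, y_tr).predict(X_te))
          for name, m in candidates.items()}

    best = min(ge, key=ge.get)                    # GEM: minimize empirical GE
    print(ge, "->", best)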

Suggested Citation

  • Ning Xu & Jian Hong & Timothy C. G. Fisher, 2016. "Generalization error minimization: a new approach to model evaluation and selection with an application to penalized regression," Papers 1610.05448, arXiv.org.
  • Handle: RePEc:arx:papers:1610.05448

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1610.05448
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. James J. Heckman & Edward J. Vytlacil, 2007. "Econometric Evaluation of Social Programs, Part II: Using the Marginal Treatment Effect to Organize Alternative Econometric Estimators to Evaluate Social Programs, and to Forecast their Effects in New Environments," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 71, Elsevier.
    2. Francesco Guala & Luigi Mittone, 2005. "Experiments in economics: External validity and the robustness of phenomena," Journal of Economic Methodology, Taylor & Francis Journals, vol. 12(4), pages 495-515.
    3. Peter Hall & Jeff Racine & Qi Li, 2004. "Cross-Validation and the Estimation of Conditional Probability Densities," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 1015-1026, December.
    4. repec:feb:artefa:0110 is not listed on IDEAS
    5. John A. List, 2011. "Why Economists Should Conduct Field Experiments and 14 Tips for Pulling One Off," Journal of Economic Perspectives, American Economic Association, vol. 25(3), pages 3-16, Summer.
    6. Hal R. Varian, 2014. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 3-28, Spring.
    7. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    8. Jens Ludwig & Jeffrey R. Kling & Sendhil Mullainathan, 2011. "Mechanism Experiments and Policy Evaluations," Journal of Economic Perspectives, American Economic Association, vol. 25(3), pages 17-38, Summer.
    9. Brian E. Roe & David R. Just, 2009. "Internal and External Validity in Economics Research: Tradeoffs between Experiments, Field Experiments, Natural Experiments, and Field Data," American Journal of Agricultural Economics, Agricultural and Applied Economics Association, vol. 91(5), pages 1266-1271.
    10. Caner, Mehmet, 2009. "Lasso-Type GMM Estimator," Econometric Theory, Cambridge University Press, vol. 25(1), pages 270-290, February.
    11. Joshua Angrist & Ivan Fernandez-Val, 2010. "ExtrapoLATE-ing: External Validity and Overidentification in the LATE Framework," NBER Working Papers 16566, National Bureau of Economic Research, Inc.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ning Xu & Jian Hong & Timothy C. G. Fisher, 2016. "Model selection consistency from the perspective of generalization ability and VC theory with an application to Lasso," Papers 1606.00142, arXiv.org.
    2. Muller, Sean, 2014. "Randomised trials for policy: a review of the external validity of treatment effects," SALDRU Working Papers 127, Southern Africa Labour and Development Research Unit, University of Cape Town.
    3. Caner, Mehmet & Fan, Qingliang, 2015. "Hybrid generalized empirical likelihood estimators: Instrument selection with adaptive lasso," Journal of Econometrics, Elsevier, vol. 187(1), pages 256-274.
    4. Ning Xu & Jian Hong & Timothy C. G. Fisher, 2016. "Finite-sample and asymptotic analysis of generalization ability with an application to penalized regression," Papers 1609.03344, arXiv.org, revised Sep 2016.
    5. Costa, Alexandre Bonnet R. & Ferreira, Pedro Cavalcanti G. & Gaglianone, Wagner P. & Guillén, Osmani Teixeira C. & Issler, João Victor & Lin, Yihao, 2021. "Machine learning and oil price point and density forecasting," Energy Economics, Elsevier, vol. 102(C).
    6. Götz, Thomas B. & Knetsch, Thomas A., 2019. "Google data in bridge equation models for German GDP," International Journal of Forecasting, Elsevier, vol. 35(1), pages 45-66.
    7. Lee, Ji Hyung & Shi, Zhentao & Gao, Zhan, 2022. "On LASSO for predictive regression," Journal of Econometrics, Elsevier, vol. 229(2), pages 322-349.
    8. Mark F. J. Steel, 2020. "Model Averaging and Its Use in Economics," Journal of Economic Literature, American Economic Association, vol. 58(3), pages 644-719, September.
    9. Mona Aghdaee & Bonny Parkinson & Kompal Sinha & Yuanyuan Gu & Rajan Sharma & Emma Olin & Henry Cutler, 2022. "An examination of machine learning to map non‐preference based patient reported outcome measures to health state utility values," Health Economics, John Wiley & Sons, Ltd., vol. 31(8), pages 1525-1557, August.
    10. Achim Ahrens & Christian B. Hansen & Mark E. Schaffer, 2020. "lassopack: Model selection and prediction with regularized regression in Stata," Stata Journal, StataCorp LP, vol. 20(1), pages 176-235, March.
    11. Fan, Jianqing & Liao, Yuan, 2012. "Endogeneity in ultrahigh dimension," MPRA Paper 38698, University Library of Munich, Germany.
    12. Susan Athey & Guido W. Imbens, 2017. "The State of Applied Econometrics: Causality and Policy Evaluation," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 3-32, Spring.
    13. Xu Cheng & Zhipeng Liao, 2012. "Select the Valid and Relevant Moments: A One-Step Procedure for GMM with Many Moments," PIER Working Paper Archive 12-045, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.
    14. Byron Botha & Rulof Burger & Kevin Kotzé & Neil Rankin & Daan Steenkamp, 2023. "Big data forecasting of South African inflation," Empirical Economics, Springer, vol. 65(1), pages 149-188, July.
    15. Belot, Michèle & James, Jonathan, 2016. "Partner selection into policy relevant field experiments," Journal of Economic Behavior & Organization, Elsevier, vol. 123(C), pages 31-56.
    16. Ivan Korolev, 2018. "LM-BIC Model Selection in Semiparametric Models," Papers 1811.10676, arXiv.org.
    17. Carroll, Kathryn A. & Samek, Anya, 2018. "Field experiments on food choice in grocery stores: A ‘how-to’ guide," Food Policy, Elsevier, vol. 79(C), pages 331-340.
    18. Elena Ivona Dumitrescu & Sullivan Hue & Christophe Hurlin & Sessi Tokpavi, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    19. Michael C. Knaus & Michael Lechner & Anthony Strittmatter, 2022. "Heterogeneous Employment Effects of Job Search Programs: A Machine Learning Approach," Journal of Human Resources, University of Wisconsin Press, vol. 57(2), pages 597-636.
    20. Liesbeth Colen & Sergio Gomez y Paloma & Uwe Latacz-Lohmann & Marianne Lefebvre & Raphaële Préget & Sophie Thoyer, 2016. "Economic Experiments as a Tool for Agricultural Policy Evaluation: Insights from the European CAP," Canadian Journal of Agricultural Economics/Revue canadienne d'agroeconomie, Canadian Agricultural Economics Society/Societe canadienne d'agroeconomie, vol. 64(4), pages 667-694, December.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:1610.05448. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.