How To Pick The Best Regression Equation: A Review And Comparison Of Model Selection Algorithms
This paper reviews and compares twenty-one different model selection algorithms (MSAs) representing a diversity of approaches, including (i) information criteria such as AIC and SIC; (ii) selection of a “portfolio” or best subset of models; (iii) general-to-specific algorithms; (iv) forward-stepwise regression approaches; (v) Bayesian Model Averaging; and (vi) inclusion of all variables. We use coefficient unconditional mean-squared error (UMSE) as the basis for our measure of MSA performance. Our main goal is to identify the factors that determine MSA performance. Toward this end, we conduct Monte Carlo experiments across a variety of data environments. Our experiments show that MSAs differ substantially in their performance on relevant and irrelevant variables. We relate this to their associated penalty functions and to a bias-variance tradeoff in coefficient estimates. It follows that no MSA will dominate under all conditions. However, when we restrict our analysis to conditions where automatic variable selection is likely to be of greatest value, we find that two general-to-specific MSAs, Autometrics, do as well as or better than all others in over 90% of the experiments.
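As a rough illustration of how an information-criterion MSA penalizes model size, the sketch below runs an exhaustive best-subset search under AIC and SIC on simulated data. It is a minimal sketch and not the paper's experimental design: the function names, the simulated data, and the coefficient values are all hypothetical, and the criteria are written in their common Gaussian form, n·ln(RSS/n) plus a penalty of 2 (AIC) or ln(n) (SIC) per estimated coefficient.

```python
from itertools import combinations

import numpy as np


def rss(design, y):
    """Residual sum of squares from an OLS fit of y on the design matrix."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)


def criterion(rss_value, n, k, kind):
    """Gaussian-form AIC or SIC, up to an additive constant.

    k counts every estimated coefficient, including the intercept.
    """
    penalty = 2.0 if kind == "aic" else np.log(n)
    return n * np.log(rss_value / n) + penalty * k


def best_subset(X, y, kind):
    """Search all subsets of the columns of X; return the lowest-scoring one."""
    n, p = X.shape
    const = np.ones((n, 1))  # the intercept is always included
    best_score, best_vars = np.inf, ()
    for size in range(p + 1):
        for subset in combinations(range(p), size):
            cols = np.array(subset, dtype=int)
            design = np.hstack([const, X[:, cols]])
            score = criterion(rss(design, y), n, design.shape[1], kind)
            if score < best_score:
                best_score, best_vars = score, subset
    return best_vars


# Hypothetical data: six candidate regressors, only the first two relevant.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.standard_normal(200)

aic_vars = best_subset(X, y, "aic")
sic_vars = best_subset(X, y, "sic")
print("AIC selects columns:", aic_vars)
print("SIC selects columns:", sic_vars)
```

Because ln(n) exceeds 2 for any sample of eight or more observations, SIC charges more per coefficient than AIC, so on the same data it never selects a larger model. That is one concrete sense in which differing penalty functions trade off retaining relevant variables against excluding irrelevant ones.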
Date of creation: 01 Oct 2009
Contact details of provider:
Postal: Private Bag 4800, Christchurch, New Zealand
Phone: 64 3 369 3123 (Administrator)
Fax: 64 3 364 2635
Web page: http://www.econ.canterbury.ac.nz
Handle: RePEc:cbt:econwp:09/13