Model selection and model averaging after multiple imputation

Listed author(s):
  • Schomaker, Michael
  • Heumann, Christian

    Model selection and model averaging are two important techniques to obtain practical and useful models in applied research. However, it is now well-known that many complex issues arise, especially in the context of model selection, when the stochastic nature of the selection process is ignored and estimates, standard errors, and confidence intervals are calculated as if the selected model was known a priori. While model averaging aims to incorporate the uncertainty associated with the model selection process by combining estimates over a set of models, there is still some debate over appropriate interpretation and confidence interval construction. These problems become even more complex in the presence of missing data and it is currently not entirely clear how to proceed. To deal with such situations, a framework for model selection and model averaging in the context of missing data is proposed. The focus lies on multiple imputation as a strategy to deal with the missingness: a consequent combination with model averaging aims to incorporate both the uncertainty associated with the model selection and with the imputation process. Furthermore, the performance of bootstrapping as a flexible extension to our framework is evaluated. Monte Carlo simulations are used to reveal the nature of the proposed estimators in the context of the linear regression model. The practical implications of our approach are illustrated by means of a recent survival study on sputum culture conversion in pulmonary tuberculosis.
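The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea: compute a model-averaged estimate on each imputed dataset, then average those estimates over the imputations. Several pieces are assumptions made for illustration and are not taken from the paper: the AIC-based weights, the all-subsets candidate set, the helper names (`fit_ols`, `model_average`, `mi_model_average`), and the toy imputation step, which draws missing values from the observed marginal distribution and is not a proper imputation model. Standard errors and confidence intervals are deliberately left out, since the abstract notes that their construction is still debated.

```python
import numpy as np
from itertools import combinations


def fit_ols(X, y):
    """Return OLS coefficients and AIC (up to a constant) for a Gaussian linear model."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    # k regression coefficients plus one error-variance parameter
    aic = n * np.log(rss / n) + 2 * (k + 1)
    return beta, aic


def model_average(X, y, candidate_sets):
    """AIC-weighted average of coefficients over candidate column subsets (assumed weighting)."""
    betas = np.zeros((len(candidate_sets), X.shape[1]))
    aics = np.empty(len(candidate_sets))
    for i, cols in enumerate(candidate_sets):
        cols = list(cols)
        b, aics[i] = fit_ols(X[:, cols], y)
        betas[i, cols] = b  # variables excluded from a model keep coefficient 0
    w = np.exp(-0.5 * (aics - aics.min()))
    w /= w.sum()
    return w @ betas, w


def mi_model_average(imputed, candidate_sets):
    """Average the per-imputation model-averaged estimates over all imputed datasets."""
    estimates = np.array([model_average(X, y, candidate_sets)[0] for X, y in imputed])
    return estimates.mean(axis=0)


# Toy data: y depends on x1 only; x2 is an irrelevant covariate with missing values.
rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)
missing = rng.random(n) < 0.2
observed = x2[~missing]

# Hypothetical, deliberately crude imputation: draw from the observed marginal of x2.
imputed = []
for _ in range(5):
    x2_m = x2.copy()
    x2_m[missing] = rng.normal(observed.mean(), observed.std(), size=missing.sum())
    imputed.append((np.column_stack([np.ones(n), x1, x2_m]), y))

# Candidate models: intercept always included, every subset of the two covariates.
candidate_sets = [(0,) + c for r in range(3) for c in combinations((1, 2), r)]
beta_hat = mi_model_average(imputed, candidate_sets)
print(beta_hat)  # order: intercept, x1, x2
```

In real use the imputed datasets would come from a proper multiple imputation procedure, and the weighting scheme would be whichever one the analysis calls for; only the two-level averaging structure is the point here.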

    File URL: http://www.sciencedirect.com/science/article/pii/S016794731300073X
    Download Restriction: Full text for ScienceDirect subscribers only.

    Article provided by Elsevier in its journal Computational Statistics & Data Analysis.

    Volume (Year): 71 (2014)
    Issue (Month): C
    Pages: 758-770

    Handle: RePEc:eee:csdana:v:71:y:2014:i:c:p:758-770
    DOI: 10.1016/j.csda.2013.02.017
    Contact details of provider: Web page: http://www.elsevier.com/locate/csda

    References listed on IDEAS

    1. Yan, Jun, 2007. "Enjoy the Joy of Copulas: With a Package copula," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 21(i04).
    2. Magnus, Jan R. & Powell, Owen & Prüfer, Patricia, 2010. "A comparison of two model averaging techniques with an application to growth empirics," Journal of Econometrics, Elsevier, vol. 154(2), pages 139-153, February.
    3. Horton, Nicholas J. & Kleinman, Ken P., 2007. "Much Ado About Nothing: A Comparison of Missing Data Methods and Software to Fit Incomplete Data Regression Models," The American Statistician, American Statistical Association, vol. 61, pages 79-90, February.
    4. Leeb, Hannes & Pötscher, Benedikt M., 2008. "Can One Estimate The Unconditional Distribution Of Post-Model-Selection Estimators?," Econometric Theory, Cambridge University Press, vol. 24(02), pages 338-376, April.
    5. Kabaila, Paul & Leeb, Hannes, 2006. "On the Large-Sample Minimal Coverage Probability of Confidence Intervals After Model Selection," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 619-629, June.
    6. Schomaker, Michael & Heumann, Christian, 2011. "Model Averaging in Factor Analysis: An Analysis of Olympic Decathlon Data," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 7(1), pages 1-15, January.
    7. Magnus, Jan R. & Wan, Alan T.K. & Zhang, Xinyu, 2011. "Weighted average least squares estimation with nonspherical disturbances and an application to the Hong Kong housing market," Computational Statistics & Data Analysis, Elsevier, vol. 55(3), pages 1331-1341, March.
    8. Wan, Alan T.K. & Zhang, Xinyu & Zou, Guohua, 2010. "Least squares model averaging by Mallows criterion," Journal of Econometrics, Elsevier, vol. 156(2), pages 277-283, June.
    9. Leeb, Hannes & Pötscher, Benedikt M., 2005. "Model Selection And Inference: Facts And Fiction," Econometric Theory, Cambridge University Press, vol. 21(01), pages 21-59, February.
    10. Hansen, Bruce E. & Racine, Jeffrey S., 2012. "Jackknife model averaging," Journal of Econometrics, Elsevier, vol. 167(1), pages 38-46.
    11. Liang, Hua & Zou, Guohua & Wan, Alan T. K. & Zhang, Xinyu, 2011. "Optimal Weight Choice for Frequentist Model Average Estimators," Journal of the American Statistical Association, American Statistical Association, vol. 106(495), pages 1053-1066.
    12. Schomaker, Michael & Wan, Alan T.K. & Heumann, Christian, 2010. "Frequentist Model Averaging with missing observations," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3336-3347, December.
    13. Schomaker, Michael, 2012. "Shrinkage averaging estimation," Statistical Papers, Springer, vol. 53(4), pages 1015-1034, November.
    14. Hjort, N.L. & Claeskens, G., 2003. "Frequentist Model Average Estimators," Journal of the American Statistical Association, American Statistical Association, vol. 98, pages 879-899, January.
    15. repec:taf:jnlbes:v:30:y:2012:i:1:p:132-142 is not listed on IDEAS
    16. Pötscher, Benedikt M., 2006. "The Distribution of Model Averaging Estimators and an Impossibility Result Regarding Its Estimation," MPRA Paper 73, University Library of Munich, Germany, revised Jul 2006.
    17. Fletcher, David & Dillingham, Peter W., 2011. "Model-averaged confidence intervals for factorial experiments," Computational Statistics & Data Analysis, Elsevier, vol. 55(11), pages 3041-3048, November.
    18. Turek, Daniel & Fletcher, David, 2012. "Model-averaged Wald confidence intervals," Computational Statistics & Data Analysis, Elsevier, vol. 56(9), pages 2809-2815.
    19. Hansen, Bruce E., 2007. "Least Squares Model Averaging," Econometrica, Econometric Society, vol. 75(4), pages 1175-1189, July.

    This information is provided to you by IDEAS at the Research Division of the Federal Reserve Bank of St. Louis using RePEc data.