IDEAS home Printed from https://ideas.repec.org/a/pab/rmcpee/v7y2009i1p3-30.html

Métodos de imputación para el tratamiento de datos faltantes: aplicación mediante R/Splus = Imputation methods to handle the problem of missing data: an application using R/Splus

Author

Listed:
  • Muñoz Rosas, Juan Francisco

    (Departamento de Métodos Cuantitativos para la Economía y la Empresa. Universidad de Granada)

  • Alvarez Verdejo, Encarnación

    (Departamento de Métodos Cuantitativos para la Economía y la Empresa. Universidad de Granada)

Abstract

Missing values are a common problem in many sample surveys, and imputation is usually employed to compensate for non-response. Most studies of imputation methods focus on estimating the mean and its variance, and they assume simple sampling designs such as simple random sampling without replacement. In this paper we describe the best-known imputation methods and define them under a general sampling design and for different response mechanisms. Using Monte Carlo simulations based on real data from economics and business, we analyze the properties of several imputation methods when estimating other parameters that are also frequently used in practice, such as distribution functions and quantiles. To make the methods easier to implement and use, code is provided in the R and Splus programming languages.

Suggested Citation

  • Muñoz Rosas, Juan Francisco & Alvarez Verdejo, Encarnación, 2009. "Métodos de imputación para el tratamiento de datos faltantes: aplicación mediante R/Splus = Imputation methods to handle the problem of missing data: an application using R/Splus," Revista de Métodos Cuantitativos para la Economía y la Empresa = Journal of Quantitative Methods for Economics and Business Administration, Universidad Pablo de Olavide, Department of Quantitative Methods for Economics and Business Administration, vol. 7(1), pages 3-30, June.
  • Handle: RePEc:pab:rmcpee:v:7:y:2009:i:1:p:3-30

    Download full text from publisher

    File URL: http://www.upo.es/RevMetCuant/art25.pdf
    Download Restriction: no

    File URL: http://www.upo.es/RevMetCuant/art25.txt
    Download Restriction: no

    References listed on IDEAS

    1. Yves G. Berger & J. N. K. Rao, 2006. "Adjusted jackknife for imputation under unequal probability sampling without replacement," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(3), pages 531-547, June.
    2. Yves G. Berger & Chris J. Skinner, 2003. "Variance estimation for a low income proportion," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 52(4), pages 457-468, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tim Goedemé & Karel Van den Bosch & Lina Salanauskaite & Gerlinde Verbist, 2013. "Testing the Statistical Significance of Microsimulation Results: Often Easier than You Think. A Technical Note," ImPRovE Working Papers 13/10, Herman Deleeck Centre for Social Policy, University of Antwerp.
    2. Tim Goedemé & Lorena Zardo Trindade & Frank Vandenbroucke, 2017. "A Pan-European Perspective on Low-Income Dynamics in the EU," Working Papers 1703, Herman Deleeck Centre for Social Policy, University of Antwerp.
    3. Tim Goedemé, 2013. "How much Confidence can we have in EU-SILC? Complex Sample Designs and the Standard Error of the Europe 2020 Poverty Indicators," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 110(1), pages 89-110, January.
    4. Frank A. Cowell & Emmanuel Flachaire, 2014. "Statistical Methods for Distributional Analysis," Working Papers halshs-01115996, HAL.
    5. Tim Goedemé & Diego Collado, 2016. "The EU Convergence Machine at Work. To the Benefit of the EU's Poorest Citizens?," Journal of Common Market Studies, Wiley Blackwell, vol. 54(5), pages 1142-1158, September.
    6. Tim Goedemé & Karel Van den Bosch & Lina Salanauskaite & Gerlinde Verbist, 2013. "Testing the Statistical Significance of Microsimulation Results: A Plea," International Journal of Microsimulation, International Microsimulation Association, vol. 6(3), pages 50-77.
    7. Berger, Yves G. & Muñoz, Juan F. & Rancourt, Eric, 2009. "Variance estimation of survey estimates calibrated on estimated control totals--An application to the extended regression estimator and the regression composite estimator," Computational Statistics & Data Analysis, Elsevier, vol. 53(7), pages 2596-2604, May.
    8. Rebecca R. Andridge & Roderick J. A. Little, 2010. "A Review of Hot Deck Imputation for Survey Non‐response," International Statistical Review, International Statistical Institute, vol. 78(1), pages 40-64, April.
    9. Mike Brewer & Liam Wren-Lewis, 2016. "Accounting for Changes in Income Inequality: Decomposition Analyses for the UK, 1978–2008," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 78(3), pages 289-322, June.
    10. Gianni Betti & Francesca Gagliardi, 2018. "Extension of JRR Method for Variance Estimation of Net Changes in Inequality Measures," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 137(1), pages 45-60, May.
    11. Michal Brzezinski, 2011. "Variance Estimation for Richness Measures," LWS Working papers 11, LIS Cross-National Data Center in Luxembourg.
    12. J. Muñoz & E. Álvarez-Verdejo & R. García-Fernández & L. Barroso, 2015. "Efficient Estimation of the Headcount Index," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 123(3), pages 713-732, September.
    13. Yuyin Shi & Bing Liu & Gengsheng Qin, 2020. "Influence function-based empirical likelihood and generalized confidence intervals for the Lorenz curve," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 29(3), pages 427-446, September.
    14. Goedemé, Tim & Decerf, Benoit & Van den Bosch, Karel, 2020. "A new poverty indicator for Europe: the extended headcount ratio," INET Oxford Working Papers 2020-26, Institute for New Economic Thinking at the Oxford Martin School, University of Oxford.
    15. Yves G. Berger, 2020. "An empirical likelihood approach under cluster sampling with missing observations," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 72(1), pages 91-121, February.
    16. Vijay Verma & Gianni Betti, 2011. "Taylor linearization sampling errors and design effects for poverty measures and other complex statistics," Journal of Applied Statistics, Taylor & Francis Journals, vol. 38(8), pages 1549-1576, August.

    More about this item

    Keywords

    auxiliary information; survey; inclusion probabilities; response mechanism

    JEL classification:

    • C13 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Estimation: General
    • C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
    • C80 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - General

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.