From start to finish: a framework for the production of small area official statistics

Authors

  • Nikos Tzavidis
  • Li‐Chun Zhang
  • Angela Luna
  • Timo Schmid
  • Natalia Rojas‐Perilla

Abstract

Small area estimation is a research area in official and survey statistics of great practical relevance for national statistical institutes and related organizations. Despite rapid developments in methodology and software, researchers and users would benefit from having practical guidelines for the process of small area estimation. We propose a general framework for the production of small area statistics that is governed by the principle of parsimony and is based on three broadly defined stages, namely specification, analysis and adaptation, and evaluation. Emphasis is given to the interaction between a user of small area statistics and the statistician in specifying the target geography and parameters in the light of the available data. Model‐free and model‐dependent methods are described with a focus on model selection and testing, model diagnostics and adaptations such as use of data transformations. Uncertainty measures and the use of model and design‐based simulations for method evaluation are also at the centre of the paper. We illustrate the application of the proposed framework by using real data for the estimation of non‐linear deprivation indicators. Linear statistics, e.g. averages, are included as special cases of the general framework.

Suggested Citation

  • Nikos Tzavidis & Li‐Chun Zhang & Angela Luna & Timo Schmid & Natalia Rojas‐Perilla, 2018. "From start to finish: a framework for the production of small area official statistics," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 181(4), pages 927-979, October.
  • Handle: RePEc:bla:jorssa:v:181:y:2018:i:4:p:927-979
    DOI: 10.1111/rssa.12364



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tzavidis, Nikos & Zhang, Li-Chun & Luna Hernandez, Angela & Schmid, Timo & Rojas-Perilla, Natalia, 2016. "From start to finish: A framework for the production of small area official statistics," Discussion Papers 2016/13, Free University Berlin, School of Business & Economics.
    2. Paul Walter & Marcus Groß & Timo Schmid & Nikos Tzavidis, 2021. "Domain prediction with grouped income data," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(4), pages 1501-1523, October.
    3. J. N. K. Rao, 2015. "Inferential issues in model-based small area estimation: some new developments," Statistics in Transition new series, Główny Urząd Statystyczny (Polska), vol. 16(4), pages 491-510, December.
    4. J. N. K. Rao, 2015. "Inferential Issues In Model-Based Small Area Estimation: Some New Developments," Statistics in Transition New Series, Polish Statistical Association, vol. 16(4), pages 491-510, December.
    5. Sugasawa, Shonosuke & Kubokawa, Tatsuya, 2017. "Transforming response values in small area prediction," Computational Statistics & Data Analysis, Elsevier, vol. 114(C), pages 47-60.
    6. Isabel Molina & Paul Corral & Minh Nguyen, 2022. "Estimation of poverty and inequality in small areas: review and discussion," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(4), pages 1143-1166, December.
    7. Patrick Krennmair & Timo Schmid, 2022. "Flexible domain prediction using mixed effects random forests," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(5), pages 1865-1894, November.
    8. Sugasawa, Shonosuke & Kubokawa, Tatsuya, 2015. "Parametric transformed Fay–Herriot model for small area estimation," Journal of Multivariate Analysis, Elsevier, vol. 139(C), pages 295-311.
    9. María José Lombardía & Esther López-Vizcaíno & Cristina Rueda, 2021. "Selection model for domains across time: application to labour force survey by economic activities," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(1), pages 228-254, March.
    10. Malay Ghosh, 2020. "Small area estimation: its evolution in five decades," Statistics in Transition New Series, Polish Statistical Association, vol. 21(4), pages 1-22, August.
    11. Natalia Rojas‐Perilla & Sören Pannier & Timo Schmid & Nikos Tzavidis, 2020. "Data‐driven transformations in small area estimation," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(1), pages 121-148, January.
    12. Rao, J. N. K., 2015. "Inferential Issues in Model-Based Small Area Estimation: Some New Developments," Statistics in Transition New Series, Polish Statistical Association, vol. 16(4), pages 491-510, December.
    13. Zhang, Junni L. & Bryant, John, 2020. "Fully Bayesian Benchmarking of Small Area Estimation Models," Journal of Official Statistics, Sciendo, vol. 36(1), pages 197-223, March.
    14. Rebecca Steorts & M. Ugarte, 2014. "Comments on: “Single and two-stage cross-sectional and time series benchmarking procedures for small area estimation”," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 23(4), pages 680-685, December.
    15. Bijlsma, Ineke & van den Brakel, Jan & van der Velden, Rolf & Allen, Jim, 2020. "Estimating Literacy Levels at a Detailed Regional Level: an Application Using Dutch Data," Journal of Official Statistics, Sciendo, vol. 36(2), pages 251-274, June.
    16. Benavent, Roberto & Morales, Domingo, 2016. "Multivariate Fay–Herriot models for small area estimation," Computational Statistics & Data Analysis, Elsevier, vol. 94(C), pages 372-390.
    17. María José Lombardía & Esther López‐Vizcaíno & Cristina Rueda, 2017. "Mixed generalized Akaike information criterion for small area models," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 180(4), pages 1229-1252, October.
    18. Stefano Marchetti & Maciej Beręsewicz & Nicola Salvati & Marcin Szymkowiak & Łukasz Wawrowski, 2018. "The use of a three‐level M‐quantile model to map poverty at local administrative unit 1 in Poland," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 181(4), pages 1077-1104, October.
    19. Militino, A.F. & Goicoa, T. & Ugarte, M.D., 2012. "Estimating the percentage of food expenditure in small areas using bias-corrected P-spline based estimators," Computational Statistics & Data Analysis, Elsevier, vol. 56(10), pages 2934-2948.
    20. Chakraborty, Adrijo & Datta, Gauri Sankar & Mandal, Abhyuday, 2016. "A Two-Component Normal Mixture Alternative to the Fay-Herriot Model," Statistics in Transition New Series, Polish Statistical Association, vol. 17(1), pages 67-90, March.
