Sample design for analysis using high‐influence probability sampling

Author

Listed:
  • Robert G. Clark
  • David G. Steel

Abstract

Sample designs are typically developed to estimate summary statistics such as means, proportions and prevalences. Analytical outputs may also be a priority but there are fewer methods and results on how to efficiently design samples for the fitting and estimation of statistical models. This paper develops a general approach for determining efficient sampling designs for probability‐weighted maximum likelihood estimators and considers application to generalized linear models. We allow for non‐ignorable sampling, including outcome‐dependent sampling. The new designs have probabilities of selection closely related to influence statistics such as dfbeta and Cook's distance. The new approach is shown to perform well in a simulation based on data from the New Zealand Health Survey.
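As a rough illustration of the idea in the abstract, the sketch below computes a dfbeta-type influence measure for a logistic regression, sets selection probabilities proportional to its norm, draws a Poisson sample, and fits the model by probability-weighted maximum likelihood. It is not the authors' algorithm: the abstract only says the new designs have probabilities "closely related to" influence statistics. The function names, the simulated population, the `min_prob` floor and the use of population outcomes as planning values are all assumptions made for illustration.

```python
import numpy as np


def fit_weighted_logistic(X, y, w, n_iter=50, tol=1e-10):
    """Weighted ML for logistic regression via Newton-Raphson (w = 1 / pi)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        score = X.T @ (w * (y - mu))                       # weighted score
        info = X.T @ (X * (w * mu * (1.0 - mu))[:, None])  # weighted information
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def influence_probabilities(X, y, beta, n_expected, min_prob=1e-3):
    """Selection probabilities proportional to a dfbeta-type influence norm.

    Assumption: one-step approximation dfbeta_i ~ info^{-1} x_i (y_i - mu_i).
    Clipping to [min_prob, 1] means the expected sample size is only
    approximately n_expected.
    """
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    info = X.T @ (X * (mu * (1.0 - mu))[:, None])
    scores = X * (y - mu)[:, None]               # N x p unit-level scores
    dfbeta = np.linalg.solve(info, scores.T).T   # N x p approximate dfbeta
    size = np.linalg.norm(dfbeta, axis=1)
    raw = n_expected * size / size.sum()
    return np.clip(raw, min_prob, 1.0)


def run_demo(seed=0, N=5000, n_expected=500):
    """Simulated population (hypothetical), outcome-dependent Poisson sampling."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(N), rng.normal(size=N)])
    beta_true = np.array([-1.0, 0.5])
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta_true)))).astype(float)

    # Planning values: here the full population is used, which a real design
    # would replace with pilot data or anticipated values.
    beta_plan = fit_weighted_logistic(X, y, np.ones(N))
    pi = influence_probabilities(X, y, beta_plan, n_expected)

    sampled = rng.random(N) < pi                 # Poisson sampling
    beta_hat = fit_weighted_logistic(X[sampled], y[sampled], 1.0 / pi[sampled])
    return pi, beta_hat


pi, beta_hat = run_demo()
```

Because the probabilities depend on the outcome, the sampling here is non-ignorable, so the inverse-probability weights are what keep the fitted coefficients consistent for the population model.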

Suggested Citation

  • Robert G. Clark & David G. Steel, 2022. "Sample design for analysis using high‐influence probability sampling," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 1733-1756, October.
  • Handle: RePEc:bla:jorssa:v:185:y:2022:i:4:p:1733-1756
    DOI: 10.1111/rssa.12916

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssa.12916
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Woojin Chung & Roeul Kim, 2020. "A Reversal of the Association between Education Level and Obesity Risk during Ageing: A Gender-Specific Longitudinal Study in South Korea," IJERPH, MDPI, vol. 17(18), pages 1-19, September.
    2. Patricia Dörr & Jan Pablo Burgard, 2019. "Data-driven transformations and survey-weighting for linear mixed models," Research Papers in Economics 2019-16, University of Trier, Department of Economics.
    3. Joseph L Dieleman & Tara Templin, 2014. "Random-Effects, Fixed-Effects and the within-between Specification for Clustered Data in Observational Health Studies: A Simulation Study," PLOS ONE, Public Library of Science, vol. 9(10), pages 1-17, October.
    4. Woojin Chung & Roeul Kim, 2020. "Differential Risk of Cognitive Impairment across Paid and Unpaid Occupations in the Middle-Age Population: Evidence from the Korean Longitudinal Study of Aging, 2006–2016," IJERPH, MDPI, vol. 17(9), pages 1-14, April.
    5. Laura M. Stapleton & Yoonjeong Kang, 2018. "Design Effects of Multilevel Estimates From National Probability Samples," Sociological Methods & Research, vol. 47(3), pages 430-457, August.
    6. Woojin Chung & Roeul Kim, 2020. "Which Occupation is Highly Associated with Cognitive Impairment? A Gender-Specific Longitudinal Study of Paid and Unpaid Occupations in South Korea," IJERPH, MDPI, vol. 17(21), pages 1-17, October.
    7. Nora Würz & Timo Schmid & Nikos Tzavidis, 2022. "Estimating regional income indicators under transformations and access to limited population auxiliary information," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 1679-1706, October.
    8. Francesco Schirripa Spagnolo & Nicola Salvati & Antonella D’Agostino & Ides Nicaise, 2020. "The use of sampling weights in M‐quantile random‐effects regression: an application to Programme for International Student Assessment mathematics scores," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 69(4), pages 991-1012, August.
    9. Bowen, Mary Elizabeth, 2009. "Childhood socioeconomic status and racial differences in disability: Evidence from the Health and Retirement Study (1998-2006)," Social Science & Medicine, Elsevier, vol. 69(3), pages 433-441, August.
    10. Ana Maria Osorio & Catalina Bolancé & Nyovane Madise & Katharina Rathmann, 2013. "Social Determinants of Child Health in Colombia: Can Community Education Moderate the Effect of Family Characteristics?," Working Papers XREAP2013-02, Xarxa de Referència en Economia Aplicada (XREAP), revised Mar 2013.
    11. Amini, Chiara & Nivorozhkin, Eugene, 2015. "The urban–rural divide in educational outcomes: Evidence from Russia," International Journal of Educational Development, Elsevier, vol. 44(C), pages 118-133.
    12. Yergeau, Marie-Eve, 2020. "Tourism and local welfare: A multilevel analysis in Nepal’s protected areas," World Development, Elsevier, vol. 127(C).
    13. Glen McGee & Jonathan Schildcrout & Sharon‐Lise Normand & Sebastien Haneuse, 2020. "Outcome‐dependent sampling in cluster‐correlated data settings with application to hospital profiling," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(1), pages 379-402, January.
    14. Oǧuz-Alper, Melike & Berger, Yves G., 2020. "Modelling multilevel data under complex sampling designs: An empirical likelihood approach," Computational Statistics & Data Analysis, Elsevier, vol. 145(C).
    15. Jae Kwang Kim & J.N.K. Rao & Yonghyun Kwon, 2022. "Analysis of clustered survey data based on two‐stage informative sampling and associated two‐level models," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 1522-1540, October.
    16. Carla Cristina Rosa de Almeida & João Policarpo Rodrigues Lima & Maria Fernanda Freire Gatto, 2020. "Expenditure on cultural events: preferences or opportunities? An analysis of Brazilian consumer data," Journal of Cultural Economics, Springer;The Association for Cultural Economics International, vol. 44(3), pages 451-480, September.
    17. Joelle Abramowitz & Brett O'Hara & Darcy Steeg Morris, 2017. "Risking Life and Limb: Estimating a Measure of Medical Care Economic Risk and Considering its Implications," Health Economics, John Wiley & Sons, Ltd., vol. 26(4), pages 469-485, April.
    18. Yue Jia & Lynne Stokes & Ian Harris & Yan Wang, 2011. "Performance of Random Effects Model Estimators Under Complex Sampling Designs," Journal of Educational and Behavioral Statistics, vol. 36(1), pages 6-32, February.
    19. Jan Pablo Burgard & Patricia Dörr, 2018. "Survey-weighted Generalized Linear Mixed Models," Research Papers in Economics 2018-01, University of Trier, Department of Economics.
    20. Per Strömblad & Gunnar Myrberg, 2013. "Urban Inequality and Political Recruitment," Urban Studies, Urban Studies Journal Limited, vol. 50(5), pages 1049-1065, April.
