IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0158120.html

How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?

Author

Listed:
  • Brady T West
  • Joseph W Sakshaug
  • Guy Alain S Aurelien

Abstract

Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting analysis. Unfortunately, many secondary analysts from fields outside of statistics, biostatistics, and survey methodology do not have adequate training in this area, and as a result may apply incorrect statistical methods when analyzing these survey data sets. This in turn could lead to the publication of incorrect inferences based on the survey data that effectively negate the resources dedicated to these surveys. In this article, we build on the results of a preliminary meta-analysis of 100 peer-reviewed journal articles presenting analyses of data from a variety of national health surveys, which suggested that analytic errors may be extremely prevalent in these types of investigations. We first perform a meta-analysis of a stratified random sample of 145 additional research products analyzing survey data from the Scientists and Engineers Statistical Data System (SESTAT), which describes features of the U.S. Science and Engineering workforce, and examine trends in the prevalence of analytic error across the decades used to stratify the sample. We once again find that analytic errors appear to be quite prevalent in these studies. Next, we present several example analyses of real SESTAT data, and demonstrate that a failure to perform these analyses correctly can result in substantially biased estimates with standard errors that do not adequately reflect complex sample design features. Collectively, the results of this investigation suggest that reviewers of this type of research need to pay much closer attention to the analytic methods employed by researchers attempting to publish or present secondary analyses of survey data.
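
The contrast the abstract draws between a naive analysis and a design-based analysis can be illustrated with a minimal sketch. It is not the authors' SESTAT analysis. The toy values and the names `naive_mean_se` and `design_mean_se` are hypothetical, and the variance estimator is the common with-replacement "ultimate cluster" Taylor linearization for a weighted mean, assumed here for illustration only.

```python
import math
from collections import defaultdict


def naive_mean_se(y):
    """Unweighted mean and SE that treat the data as a simple random sample."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / (n - 1)
    return mean, math.sqrt(var / n)


def design_mean_se(y, w, stratum, cluster):
    """Weighted mean with a linearization SE that uses strata and clusters.

    Assumes clusters are sampled with replacement within strata
    (the usual ultimate-cluster approximation).
    """
    total_w = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Linearized score for each observation, summed to the cluster level.
    totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, c in zip(y, w, stratum, cluster):
        totals[h][c] += wi * (yi - mean) / total_w

    var = 0.0
    for h, clusters in totals.items():
        n_h = len(clusters)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least two clusters")
        z_bar = sum(clusters.values()) / n_h
        ss = sum((z - z_bar) ** 2 for z in clusters.values())
        var += n_h / (n_h - 1) * ss
    return mean, math.sqrt(var)


# Hypothetical toy data: two strata, two clusters per stratum.
# Stratum 2 is under-sampled, so its cases carry larger weights.
y = [10, 12, 20, 22, 30, 32, 40, 42]
w = [1, 1, 1, 1, 3, 3, 3, 3]
stratum = [1, 1, 1, 1, 2, 2, 2, 2]
cluster = [1, 1, 2, 2, 3, 3, 4, 4]

naive_mean, naive_se = naive_mean_se(y)
design_mean, design_se = design_mean_se(y, w, stratum, cluster)
print("naive: ", naive_mean, naive_se)
print("design:", design_mean, design_se)
```

In this toy example the unweighted mean is 26.0 and the weighted mean is 31.0, so ignoring the weights shifts the point estimate, and the two standard errors differ because only the second one reflects the strata and clusters. In practice, dedicated survey software would be used instead of hand-rolled code.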

Suggested Citation

  • Brady T West & Joseph W Sakshaug & Guy Alain S Aurelien, 2016. "How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?," PLOS ONE, Public Library of Science, vol. 11(6), pages 1-29, June.
  • Handle: RePEc:plo:pone00:0158120
    DOI: 10.1371/journal.pone.0158120

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0158120
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0158120&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Little R.J., 2004. "To Model or Not To Model? Competing Modes of Inference for Finite Population Sampling," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 546-556, January.
    2. Sakshaug, J.W. & West, B.T., 2014. "Important considerations when analyzing health survey data collected using a complex sample design," American Journal of Public Health, American Public Health Association, vol. 104(1), pages 15-16.
    3. D. Pfeffermann & C. J. Skinner & D. J. Holmes & H. Goldstein & J. Rasbash, 1998. "Weighting for unequal selection probabilities in multilevel models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(1), pages 23-40.
    4. Brady T. West & Sean Esteban McCabe, 2012. "Incorporating complex sample design effects when only final survey weights are available," Stata Journal, StataCorp LLC, vol. 12(4), pages 718-725, December.

    Citations

    Cited by:

    1. West Brady T. & Sakshaug Joseph W. & Aurelien Guy Alain S., 2018. "Accounting for Complex Sampling in Survey Estimation: A Review of Current Software Tools," Journal of Official Statistics, Sciendo, vol. 34(3), pages 721-752, September.
    2. Kenneth Owusu Ansah & Nutifafa Eugene Yaw Dey & Abigail Esinam Adade & Pascal Agbadi, 2022. "Determinants of life satisfaction among Ghanaians aged 15 to 49 years: A further analysis of the 2017/2018 Multiple Cluster Indicator Survey," PLOS ONE, Public Library of Science, vol. 17(1), pages 1-18, January.
    3. Brady T. West & Joseph W. Sakshaug, 2017. "The Need to Account for Complex Sampling Features when Analyzing Establishment Survey Data: An Illustration using the 2013 Business Research and Development and Innovation Survey (BRDIS)," Working Papers 17-62, Center for Economic Studies, U.S. Census Bureau.
    4. Yasmin S. Cypel & Shira Maguen & Paul A. Bernhard & William J. Culpepper & Aaron I. Schneiderman, 2024. "Prevalence and Correlates of Food and/or Housing Instability among Men and Women Post-9/11 US Veterans," IJERPH, MDPI, vol. 21(3), pages 1-16, March.
    5. Rat für Sozial- und Wirtschaftsdaten RatSWD (ed.), 2023. "Erhebung und Nutzung unstrukturierter Daten in den Sozial-, Verhaltens- und Wirtschaftswissenschaften," RatSWD Output Series, German Data Forum (RatSWD), volume 7, number 7-2de, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. West Brady T. & Sakshaug Joseph W. & Aurelien Guy Alain S., 2018. "Accounting for Complex Sampling in Survey Estimation: A Review of Current Software Tools," Journal of Official Statistics, Sciendo, vol. 34(3), pages 721-752, September.
    2. Kunihama, T. & Herring, A.H. & Halpern, C.T. & Dunson, D.B., 2016. "Nonparametric Bayes modeling with sample survey weights," Statistics & Probability Letters, Elsevier, vol. 113(C), pages 41-48.
    3. Marivoet, Wim & De Herdt, Tom, 2017. "From figures to facts: making sense of socio-economic surveys in the Democratic Republic of the Congo (DRC)," IOB Analyses & Policy Briefs 23, Universiteit Antwerpen, Institute of Development Policy (IOB).
    4. J. Andrew Royle, 2009. "Analysis of Capture–Recapture Models with Individual Covariates Using Data Augmentation," Biometrics, The International Biometric Society, vol. 65(1), pages 267-274, March.
    5. Carrington C. J. Shepherd & Holly D. Clifford & Francis Mitrou & Shannon M. Melody & Ellen J. Bennett & Fay H. Johnston & Luke D. Knibbs & Gavin Pereira & Janessa L. Pickering & Teck H. Teo & Lea-Ann , 2019. "The Contribution of Geogenic Particulate Matter to Lung Disease in Indigenous Children," IJERPH, MDPI, vol. 16(15), pages 1-12, July.
    6. David Kaplan & Chansoon Lee, 2018. "Optimizing Prediction Using Bayesian Model Averaging: Examples Using Large-Scale Educational Assessments," Evaluation Review, vol. 42(4), pages 423-457, August.
    7. Patricia Dörr & Jan Pablo Burgard, 2019. "Data-driven transformations and survey-weighting for linear mixed models," Research Papers in Economics 2019-16, University of Trier, Department of Economics.
    8. Bijak Jakub & Bryant Johan & Gołata Elżbieta & Smallwood Steve, 2021. "Preface," Journal of Official Statistics, Sciendo, vol. 37(3), pages 533-541, September.
    9. Shepherd, Carrington CJ & Li, Jianghong & Mitrou, Francis & Zubrick, Stephen R., 2012. "Socioeconomic disparities in the mental health of Indigenous children in Western Australia," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 12, pages 1-1.
    10. Jorge Walter & Daniel Z. Levin & J. Keith Murnighan, 2015. "Reconnection Choices: Selecting the Most Valuable (vs. Most Preferred) Dormant Ties," Organization Science, INFORMS, vol. 26(5), pages 1447-1465, October.
    11. Jennings, Jacky M. & Hensel, Devon J. & Tanner, Amanda E. & Reilly, Meredith L. & Ellen, Jonathan M., 2014. "Are social organizational factors independently associated with a current bacterial sexually transmitted infection among urban adolescents and young adults?," Social Science & Medicine, Elsevier, vol. 118(C), pages 52-60.
    12. Joseph L Dieleman & Tara Templin, 2014. "Random-Effects, Fixed-Effects and the within-between Specification for Clustered Data in Observational Health Studies: A Simulation Study," PLOS ONE, Public Library of Science, vol. 9(10), pages 1-17, October.
    13. Elbers,Chris & Roy Van der Weide, 2025. "Non-Normal Empirical Bayes Prediction of Local Welfare," Policy Research Working Paper Series 11107, The World Bank.
    14. Laura M. Stapleton & Yoonjeong Kang, 2018. "Design Effects of Multilevel Estimates From National Probability Samples," Sociological Methods & Research, vol. 47(3), pages 430-457, August.
    15. Woojin Chung & Roeul Kim, 2020. "Which Occupation is Highly Associated with Cognitive Impairment? A Gender-Specific Longitudinal Study of Paid and Unpaid Occupations in South Korea," IJERPH, MDPI, vol. 17(21), pages 1-17, October.
    16. Tenan, Simone & Rotger Vallespir, Andreu & Igual, José Manuel & Moya, Óscar & Royle, J. Andrew & Tavecchia, Giacomo, 2013. "Population abundance, size structure and sex-ratio in an insular lizard," Ecological Modelling, Elsevier, vol. 267(C), pages 39-47.
    17. Nora Würz & Timo Schmid & Nikos Tzavidis, 2022. "Estimating regional income indicators under transformations and access to limited population auxiliary information," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 1679-1706, October.
    18. Shixiao Zhang & Peisong Han & Changbao Wu, 2023. "Calibration Techniques Encompassing Survey Sampling, Missing Data Analysis and Causal Inference," International Statistical Review, International Statistical Institute, vol. 91(2), pages 165-192, August.
    19. Hwanhee Hong & Kara E. Rudolph & Elizabeth A. Stuart, 2017. "Bayesian Approach for Addressing Differential Covariate Measurement Error in Propensity Score Methods," Psychometrika, Springer;The Psychometric Society, vol. 82(4), pages 1078-1096, December.
    20. Geoffrey Jones & Wesley O. Johnson, 2014. "Prior Elicitation: Interactive Spreadsheet Graphics With Sliders Can Be Fun, and Informative," The American Statistician, Taylor & Francis Journals, vol. 68(1), pages 42-51, February.
