Printed from https://ideas.repec.org/a/spr/sankhb/v83y2021i1d10.1007_s13571-020-00227-w.html

On Making Valid Inferences by Integrating Data from Surveys and Other Sources

Author

  • J. N. K. Rao (Carleton University)

Abstract

Survey samplers have long been using probability samples from one or more sources in conjunction with census and administrative data to make valid and efficient inferences on finite population parameters. This topic has received a lot of attention more recently in the context of data from non-probability samples such as transaction data, web surveys and social media data. In this paper, I will provide a brief overview of probability sampling methods first and then discuss some recent methods, based on models for the non-probability samples, which could lead to useful inferences from a non-probability sample by itself or when combined with a probability sample. I will also explain how big data may be used as predictors in small area estimation, a topic of current interest because of the growing demand for reliable local area statistics.

Suggested Citation

  • J. N. K. Rao, 2021. "On Making Valid Inferences by Integrating Data from Surveys and Other Sources," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 83(1), pages 242-272, May.
  • Handle: RePEc:spr:sankhb:v:83:y:2021:i:1:d:10.1007_s13571-020-00227-w
    DOI: 10.1007/s13571-020-00227-w



    Citations



    Cited by:

    1. Medous, Estelle & Goga, Camelia & Ruiz-Gazen, Anne & Beaumont, Jean-François & Dessertaine, Alain & Puech, Pauline, 2022. "QR Prediction for Statistical Data Integration," TSE Working Papers 22-1344, Toulouse School of Economics (TSE).
    2. Ieva Burakauskaitė & Andrius Čiginas, 2023. "An Approach to Integrating a Non-Probability Sample in the Population Census," Mathematics, MDPI, vol. 11(8), pages 1-14, April.
    3. Camilla Salvatore, 2023. "Inference with non-probability samples and survey data integration: a science mapping study," METRON, Springer;Sapienza Università di Roma, vol. 81(1), pages 83-107, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bijlsma Ineke & van den Brakel Jan & van der Velden Rolf & Allen Jim, 2020. "Estimating Literacy Levels at a Detailed Regional Level: an Application Using Dutch Data," Journal of Official Statistics, Sciendo, vol. 36(2), pages 251-274, June.
    2. J. N. K. Rao, 2015. "Inferential issues in model-based small area estimation: some new developments," Statistics in Transition new series, Główny Urząd Statystyczny (Polska), vol. 16(4), pages 491-510, December.
    3. van Berkel Kees & van der Doef Suzanne & Schouten Barry, 2020. "Implementing Adaptive Survey Design With an Application to the Dutch Health Survey," Journal of Official Statistics, Sciendo, vol. 36(3), pages 609-629, September.
    4. Denis Devaud & Yves Tillé, 2019. "Deville and Särndal’s calibration: revisiting a 25-years-old successful optimization problem," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(4), pages 1033-1065, December.
    5. Stephanie Coffey & Jaya Damineni & John Eltinge & Anup Mathur & Kayla Varela & Allison Zotti, 2023. "Some Open Questions on Multiple-Source Extensions of Adaptive-Survey Design Concepts and Methods," Working Papers 23-03, Center for Economic Studies, U.S. Census Bureau.
    6. Wagner James & West Brady T. & Elliott Michael R. & Coffey Stephanie, 2020. "Comparing the Ability of Regression Modeling and Bayesian Additive Regression Trees to Predict Costs in a Responsive Survey Design Context," Journal of Official Statistics, Sciendo, vol. 36(4), pages 907-931, December.
    7. Tobias Gummer & Pablo Christmann & Sascha Verhoeven & Christof Wolf, 2022. "Using a responsive survey design to innovate self‐administered mixed‐mode surveys," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(3), pages 916-932, July.
    8. Changbao Wu & Shixiao Zhang, 2019. "Comments on: Deville and Särndal’s calibration: revisiting a 25 years old successful optimization problem," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(4), pages 1082-1086, December.
    9. J. N. K. Rao, 2015. "Inferential Issues In Model-Based Small Area Estimation: Some New Developments," Statistics in Transition New Series, Polish Statistical Association, vol. 16(4), pages 491-510, December.
    10. Särndal Carl-Erik & Lundquist Peter, 2017. "Inconsistent Regression and Nonresponse Bias: Exploring Their Relationship as a Function of Response Imbalance," Journal of Official Statistics, Sciendo, vol. 33(3), pages 709-734, September.
    11. Peisong Han & Linglong Kong & Jiwei Zhao & Xingcai Zhou, 2019. "A general framework for quantile estimation with incomplete data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 305-333, April.
    12. Isabel Molina & Paul Corral & Minh Nguyen, 2022. "Estimation of poverty and inequality in small areas: review and discussion," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(4), pages 1143-1166, December.
    13. Seho Park & Jae Kwang Kim & Diana Stukel, 2017. "A measurement error model approach to survey data integration: combining information from two surveys," METRON, Springer;Sapienza Università di Roma, vol. 75(3), pages 345-357, December.
    14. Brick J. Michael & Tourangeau Roger, 2017. "Responsive Survey Designs for Reducing Nonresponse Bias," Journal of Official Statistics, Sciendo, vol. 33(3), pages 735-752, September.
    15. John L. Czajka & Amy Beyler, "undated". "Declining Response Rates in Federal Surveys: Trends and Implications (Background Paper)," Mathematica Policy Research Reports a714f76e878f4a74a6ad9f15d, Mathematica Policy Research.
    16. Alleva Giorgio & Petrarca Francesca & Falorsi Piero Demetrio & Righi Paolo, 2021. "Measuring the Accuracy of Aggregates Computed from a Statistical Register," Journal of Official Statistics, Sciendo, vol. 37(2), pages 481-503, June.
    17. Burger Joep & Perryck Koen & Schouten Barry, 2017. "Robustness of Adaptive Survey Designs to Inaccuracy of Design Parameters," Journal of Official Statistics, Sciendo, vol. 33(3), pages 687-708, September.
    18. Alessio Guandalini & Claudio Ceccarelli, 2022. "Impact measurement and dimension reduction of auxiliary variables in calibration estimator using the Shapley decomposition," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 31(4), pages 759-784, October.
    19. Chen, Sixia & Haziza, David, 2023. "A unified framework of multiply robust estimation approaches for handling incomplete data," Computational Statistics & Data Analysis, Elsevier, vol. 179(C).
    20. Debashis Ghosh & Michael S. Sabel, 2022. "A Weighted Sample Framework to Incorporate External Calculators for Risk Modeling," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 14(3), pages 363-379, December.
