IDEAS home Printed from https://ideas.repec.org/p/tse/wpaper/127048.html

QR Prediction for Statistical Data Integration

Author

Listed:
  • Medous, Estelle
  • Goga, Camelia
  • Ruiz-Gazen, Anne
  • Beaumont, Jean-François
  • Dessertaine, Alain
  • Puech, Pauline

Abstract

In this paper, we investigate how a big non-probability database can be used to improve estimates from a small probability sample through data integration techniques. In the situation where the study variable is observed in both data sources, Kim and Tam (2021) proposed two design-consistent estimators that can be justified through dual frame survey theory. First, we provide conditions ensuring that these estimators are more efficient than the Horvitz-Thompson estimator when the probability sample is selected using either Poisson sampling or simple random sampling without replacement. Then, we study the class of QR predictors, proposed by Särndal and Wright (1984) to handle the case where the non-probability database contains auxiliary variables but no study variable. We provide conditions ensuring that the QR predictor is asymptotically design-unbiased. Assuming the probability sampling design is not informative, the QR predictor is also model-unbiased regardless of the validity of those conditions. We compare the design properties of different predictors, in the class of QR predictors, through a simulation study. They include a model-based predictor, a model-assisted estimator and a cosmetic estimator. In our simulation setups, the cosmetic estimator performed slightly better than the model-assisted estimator. As expected, the model-based predictor did not perform well when the underlying model was misspecified.
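
The comparison with the Horvitz-Thompson estimator can be illustrated with a minimal sketch. It is not the authors' code, and the estimator shown is only one simple form in the spirit of the data integration idea: the exactly known total over the big database plus a Horvitz-Thompson estimate of the remainder. The population, the selection mechanism into the big database, the inclusion probabilities, and all function names are assumptions made for illustration. Only Poisson sampling is covered, where the Horvitz-Thompson variance has the closed form sum of (1 - pi_i) / pi_i * y_i^2.

```python
import random


def poisson_sample(pi, rng):
    """Poisson sampling: include unit i independently with probability pi[i]."""
    return [i for i, p in enumerate(pi) if rng.random() < p]


def ht_total(y, pi, sample):
    """Horvitz-Thompson estimator of the population total."""
    return sum(y[i] / pi[i] for i in sample)


def integrated_total(y, pi, sample, in_big):
    """Known big-database total plus an HT estimate over units outside it.

    Illustrative form only (an assumption of this sketch), not necessarily
    either of the two estimators studied in the paper.
    """
    known = sum(yi for yi, b in zip(y, in_big) if b)
    remainder = sum(y[i] / pi[i] for i in sample if not in_big[i])
    return known + remainder


def poisson_ht_variance(y, pi, keep):
    """Exact HT variance under Poisson sampling, restricted to units with keep=True."""
    return sum((1 - p) / p * yi * yi for yi, p, k in zip(y, pi, keep) if k)


# Hypothetical finite population (all values below are made up).
rng = random.Random(0)
N = 2000
y = [rng.gauss(50.0, 10.0) for _ in range(N)]
# Non-probability selection: units with large y are more likely to be in the database.
in_big = [rng.random() < (0.7 if yi > 50.0 else 0.3) for yi in y]
pi = [0.05] * N
true_total = sum(y)

# Exact design variances: the integrated form only carries sampling error
# for the units that are not covered by the big database.
var_ht = poisson_ht_variance(y, pi, [True] * N)
var_int = poisson_ht_variance(y, pi, [not b for b in in_big])

# Small Monte Carlo check that both estimators are centred on the true total.
reps = 500
ht_draws, int_draws = [], []
for _ in range(reps):
    s = poisson_sample(pi, rng)
    ht_draws.append(ht_total(y, pi, s))
    int_draws.append(integrated_total(y, pi, s, in_big))
mean_ht = sum(ht_draws) / reps
mean_int = sum(int_draws) / reps

print("true total:", round(true_total))
print("HT mean:", round(mean_ht), "variance:", round(var_ht))
print("integrated mean:", round(mean_int), "variance:", round(var_int))
```

In this simplified setting the variance reduction is automatic, because the covered units contribute no sampling error. The sketch does not reproduce the conditions the paper derives for its estimators, nor the simple random sampling without replacement case.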

Suggested Citation

  • Medous, Estelle & Goga, Camelia & Ruiz-Gazen, Anne & Beaumont, Jean-François & Dessertaine, Alain & Puech, Pauline, 2022. "QR Prediction for Statistical Data Integration," TSE Working Papers 22-1344, Toulouse School of Economics (TSE).
  • Handle: RePEc:tse:wpaper:127048

    Download full text from publisher

    File URL: https://www.tse-fr.eu/sites/default/files/TSE/documents/doc/wp/2022/wp_tse_1344.pdf
    File Function: Full Text
    Download Restriction: no

    References listed on IDEAS

    1. J. N. K. Rao, 2021. "On Making Valid Inferences by Integrating Data from Surveys and Other Sources," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 83(1), pages 242-272, May.
    2. Jae‐Kwang Kim & Siu‐Ming Tam, 2021. "Data Integration by Combining Big Data and Survey Sample Data for Finite Population Inference," International Statistical Review, International Statistical Institute, vol. 89(2), pages 382-401, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ieva Burakauskaitė & Andrius Čiginas, 2023. "An Approach to Integrating a Non-Probability Sample in the Population Census," Mathematics, MDPI, vol. 11(8), pages 1-14, April.
    2. Chien-Min Huang & F. Jay Breidt, 2023. "A dual-frame approach for estimation with respondent-driven samples," METRON, Springer;Sapienza Università di Roma, vol. 81(1), pages 65-81, April.
    3. Camilla Salvatore, 2023. "Inference with non-probability samples and survey data integration: a science mapping study," METRON, Springer;Sapienza Università di Roma, vol. 81(1), pages 83-107, April.

    More about this item

    Keywords

    cosmetic estimator; dual-frame; GREG estimator; non-probability sample; probability sample

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

