
Estimation Based on Case-Control Designs with Known Prevalence Probability

Author

  • van der Laan Mark J. (University of California, Berkeley)

Abstract

Regular case-control sampling is an extremely common design used to generate data to estimate effects of exposures or treatments on a binary outcome of interest when the proportion of cases (i.e., binary outcome equal to 1) in the population of interest is low. Case-control sampling represents a biased sample of a target population of interest by sampling a disproportionate number of cases. Case-control studies are also commonly employed to estimate the effects of genetic markers or biomarkers on binary phenotypes. In this article we present a general method of estimation relying on knowing the prevalence probability, conditional on the matching variable if matching is used. Our general proposed methodology, involving a simple weighting scheme of cases and controls, maps any estimation method for a parameter developed for prospective sampling from the population of interest into an estimation method based on case-control sampling from this population. We show that this case-control weighting of an efficient estimator for a prospective sample from the target population of interest maps into an efficient estimator for matched and unmatched case-control sampling. In particular, we show how application of this generic methodology provides us with double robust locally efficient targeted maximum likelihood estimators of the causal relative risk and causal odds ratio for regular case-control sampling and matched case-control sampling. Various extensions and generalizations of our methods are discussed.
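The abstract does not spell out the weights, so the following is only a minimal sketch of the general idea. It assumes a simple scheme in which the cases together receive total weight q0 (the known prevalence) and the controls together receive total weight 1 - q0. The function names, the toy data, and the value q0 = 0.10 are all hypothetical, and the estimate is a crude unadjusted relative risk rather than the targeted maximum likelihood estimator described in the article.

```python
# Illustrative sketch only: the weighting scheme below is an assumption,
# not quoted from the abstract. q0 is the known prevalence P(Y = 1).

def case_control_weights(y, q0):
    """Return one weight per subject: cases share total mass q0,
    controls share total mass 1 - q0, so the weights sum to 1."""
    n_cases = sum(1 for yi in y if yi == 1)
    n_controls = len(y) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    return [q0 / n_cases if yi == 1 else (1.0 - q0) / n_controls for yi in y]


def weighted_risk(a, y, w, level):
    """Weighted estimate of P(Y = 1 | A = level)."""
    num = sum(wi for ai, yi, wi in zip(a, y, w) if ai == level and yi == 1)
    den = sum(wi for ai, wi in zip(a, w) if ai == level)
    return num / den


# Toy case-control sample (made-up data): 4 cases followed by 4 controls.
y = [1, 1, 1, 1, 0, 0, 0, 0]   # binary outcome
a = [1, 1, 1, 0, 0, 0, 1, 0]   # binary exposure
q0 = 0.10                      # assumed known prevalence

# Prospective-style estimator applied to the weighted sample.
w = case_control_weights(y, q0)
relative_risk = weighted_risk(a, y, w, 1) / weighted_risk(a, y, w, 0)

# Same estimator with equal weights, i.e. ignoring the biased sampling.
naive = [1.0 / len(y)] * len(y)
naive_relative_risk = weighted_risk(a, y, naive, 1) / weighted_risk(a, y, naive, 0)

print(relative_risk, naive_relative_risk)
```

On this toy sample the weighted relative risk is 7, while the equal-weight version gives 3, because the latter treats the 50/50 case-control split as if it were the population. The odds ratio (9 here) is the same under both weightings, which is why the weighting matters mainly for parameters such as the relative risk.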

Suggested Citation

  • van der Laan Mark J., 2008. "Estimation Based on Case-Control Designs with Known Prevalence Probability," The International Journal of Biostatistics, De Gruyter, vol. 4(1), pages 1-59, September.
  • Handle: RePEc:bpj:ijbist:v:4:y:2008:i:1:n:17
    DOI: 10.2202/1557-4679.1114


    Citations

    Cited by:

    1. van der Laan Mark J., 2010. "Targeted Maximum Likelihood Based Causal Inference: Part I," The International Journal of Biostatistics, De Gruyter, vol. 6(2), pages 1-45, February.
    2. van der Laan Mark J., 2014. "Targeted Estimation of Nuisance Parameters to Obtain Valid Statistical Inference," The International Journal of Biostatistics, De Gruyter, vol. 10(1), pages 29-57, May.
    3. van der Laan Mark J., 2014. "Causal Inference for a Population of Causally Connected Units," Journal of Causal Inference, De Gruyter, vol. 2(1), pages 13-74, March.
    4. Wei Zhao & Ying Qing Chen & Li Hsu, 2017. "On estimation of time-dependent attributable fraction from population-based case-control studies," Biometrics, The International Biometric Society, vol. 73(3), pages 866-875, September.
    5. Sjolander Arvid & Vansteelandt Stijn & Humphreys Keith, 2010. "A Principal Stratification Approach to Assess the Differences in Prognosis between Cancers Caused by Hormone Replacement Therapy and by Other Factors," The International Journal of Biostatistics, De Gruyter, vol. 6(1), pages 1-37, June.
    6. Brooks Jordan C. & van der Laan Mark J. & Singer Daniel E. & Go Alan S., 2013. "Targeted Minimum Loss-Based Estimation of Causal Effects in Right-Censored Survival Data with Time-Dependent Covariates: Warfarin, Stroke, and Death in Atrial Fibrillation," Journal of Causal Inference, De Gruyter, vol. 1(2), pages 235-254, November.
    7. van der Laan Mark J. & Gruber Susan, 2012. "Targeted Minimum Loss Based Estimation of Causal Effects of Multiple Time Point Interventions," The International Journal of Biostatistics, De Gruyter, vol. 8(1), pages 1-41, May.
    8. Brooks Jordan & van der Laan Mark J. & Go Alan S., 2012. "Targeted Maximum Likelihood Estimation for Prediction Calibration," The International Journal of Biostatistics, De Gruyter, vol. 8(1), pages 1-35, October.
    9. Amanda Coston & Edward H. Kennedy, 2022. "The role of the geometric mean in case-control studies," Papers 2207.09016, arXiv.org.
    10. Rose Sherri & van der Laan Mark J., 2008. "Simple Optimal Weighting of Cases and Controls in Case-Control Studies," The International Journal of Biostatistics, De Gruyter, vol. 4(1), pages 1-26, September.
    11. van der Laan Mark J. & Petersen Maya & Zheng Wenjing, 2013. "Estimating the Effect of a Community-Based Intervention with Two Communities," Journal of Causal Inference, De Gruyter, vol. 1(1), pages 83-106, June.
    12. Adel Hussein Elduma & Kourosh Holakouie-Naieni & Amir Almasi-Hashiani & Abbas Rahimi Foroushani & Hamdan Mustafa Hamdan Ali & Muatsim Ahmed Mohammed Adam & Asma Elsony & Mohammad Ali Mansournia, 2023. "The Targeted Maximum Likelihood estimation to estimate the causal effects of the previous tuberculosis treatment in Multidrug-resistant tuberculosis in Sudan," PLOS ONE, Public Library of Science, vol. 18(1), pages 1-12, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Esmeralda Ramalho, 2004. "Covariate Measurement Error in Endogenous Stratified Samples," Economics Working Papers 2_2004, University of Évora, Department of Economics (Portugal).
    2. Steven Berry & James Levinsohn & Ariel Pakes, 2004. "Differentiated Products Demand Systems from a Combination of Micro and Macro Data: The New Car Market," Journal of Political Economy, University of Chicago Press, vol. 112(1), pages 68-105, February.
    3. Esmeralda A. Ramalho & Richard Smith, 2003. "Discrete choice non-response," CeMMAP working papers 07/03, Institute for Fiscal Studies.
    4. Imbens, Guido W. & Lancaster, Tony, 1996. "Efficient estimation and stratified sampling," Journal of Econometrics, Elsevier, vol. 74(2), pages 289-318, October.
    5. Daniel McFadden, 2001. "Economic Choices," American Economic Review, American Economic Association, vol. 91(3), pages 351-378, June.
    6. Prokhorov, Artem & Schmidt, Peter, 2009. "GMM redundancy results for general missing data problems," Journal of Econometrics, Elsevier, vol. 151(1), pages 47-55, July.
    7. Norman E. Breslow, 2003. "Are Statistical Contributions to Medicine Undervalued?," Biometrics, The International Biometric Society, vol. 59(1), pages 1-8, March.
    8. Ramalho Esmeralda A., 2010. "Covariate Measurement Error: Bias Reduction under Response-Based Sampling," Studies in Nonlinear Dynamics & Econometrics, De Gruyter, vol. 14(4), pages 1-34, September.
    9. Abay, Kibrom A., 2015. "Investigating the nature and impact of reporting bias in road crash data," Transportation Research Part A: Policy and Practice, Elsevier, vol. 71(C), pages 31-45.
    10. Robbennolt, Dale & Pendyala, Ram M. & Bhat, Chandra R., 2026. "Data collection, weighting, and modeling techniques to estimate consistent population parameters," Transportation Research Part B: Methodological, Elsevier, vol. 203(C).
    11. Nevo, Aviv, 2003. "Using Weights to Adjust for Sample Selection When Auxiliary Information Is Available," Journal of Business & Economic Statistics, American Statistical Association, vol. 21(1), pages 43-52, January.
    12. Hindsley, Paul & Landry, Craig E. & Gentner, Brad, 2011. "Addressing onsite sampling in recreation site choice models," Journal of Environmental Economics and Management, Elsevier, vol. 62(1), pages 95-110, July.
    13. Bhattacharya, Debopam, 2005. "Asymptotic inference from multi-stage samples," Journal of Econometrics, Elsevier, vol. 126(1), pages 145-171, May.
    14. Büchel, Konstantin & Ehrlich, Maximilian V. & Puga, Diego & Viladecans-Marsal, Elisabet, 2020. "Calling from the outside: The role of networks in residential mobility," Journal of Urban Economics, Elsevier, vol. 119(C).
    15. Büchel, Konstantin & Ehrlich, Maximilian v., 2020. "Cities and the structure of social interactions: Evidence from mobile phone data," Journal of Urban Economics, Elsevier, vol. 119(C).
    16. Butler, J. S., 2000. "Efficiency results of MLE and GMM estimation with sampling weights," Journal of Econometrics, Elsevier, vol. 96(1), pages 25-37, May.
    17. Kyungchul Song, 2009. "Efficient Estimation of Average Treatment Effects under Treatment-Based Sampling," PIER Working Paper Archive 09-011, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.
    18. Tripathi, Gautam, 2011. "Generalized method of moments (GMM) based inference with stratified samples when the aggregate shares are known," Journal of Econometrics, Elsevier, vol. 165(2), pages 258-265.
    19. Ramalho, Esmeralda A., 2002. "Regression models for choice-based samples with misclassification in the response variable," Journal of Econometrics, Elsevier, vol. 106(1), pages 171-201, January.
    20. Kitamura, Ryuichi & Yamamoto, Toshiyuki & Sakai, Hiromu, 2003. "A methodology for weighting observations from complex endogenous sampling," Transportation Research Part B: Methodological, Elsevier, vol. 37(4), pages 387-401, May.
