
Nonparametric Estimation for Censored Mixture Data With Application to the Cooperative Huntington’s Observational Research Trial


  • Yuanjia Wang
  • Tanya P. Garcia
  • Yanyuan Ma


This work presents methods for estimating genotype-specific outcome distributions from genetic epidemiology studies where the event times are subject to right censoring, the genotypes are not directly observed, and the data arise from a mixture of scientifically meaningful subpopulations. Examples of such studies include kin-cohort studies and quantitative trait locus (QTL) studies. Current methods for analyzing censored mixture data include two types of nonparametric maximum likelihood estimators (NPMLEs; Type I and Type II) that do not make parametric assumptions on the genotype-specific density functions. Although both NPMLEs are commonly used, we show that one is inefficient and the other inconsistent. To overcome these deficiencies, we propose three classes of consistent nonparametric estimators that do not assume parametric density models and are easy to implement. They are based on inverse probability weighting (IPW), augmented IPW (AIPW), and nonparametric imputation (IMP). AIPW achieves the efficiency bound without additional modeling assumptions. Extensive simulation experiments demonstrate satisfactory performance of these estimators even when the data are heavily censored. We apply these estimators to the Cooperative Huntington’s Observational Research Trial (COHORT), and provide age-specific estimates of the effect of mutation in the Huntington gene on mortality using a sample of family members. The close approximation of the estimated noncarrier survival rates to that of the U.S. population indicates small ascertainment bias in the COHORT family sample. Our analyses underscore an elevated risk of death in Huntington gene mutation carriers compared with that in noncarriers for a wide age range, and suggest that the mutation equally affects survival rates in both genders. 
The estimated survival rates are useful in genetic counseling for providing guidelines on interpreting the risk of death associated with a positive genetic test, and in helping future subjects at risk to make informed decisions on whether to undergo genetic mutation testing. Technical details and additional numerical results are provided in the online supplementary materials.
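The abstract names inverse probability weighting (IPW) as one route to a consistent estimator but gives no formulas. As a rough illustration only, the sketch below shows one generic way an IPW estimator for censored mixture data can be written; it is not taken from the paper and is not the authors' implementation. It assumes that each subject i has a known vector q_i of probabilities of belonging to each genotype group, so that the subject's distribution function is q_i' F(t); that censoring is independent of the event time and of genotype; and that there are no tied observation times. The function names `km_censoring_left_limit` and `ipw_mixture_cdf` are hypothetical.

```python
import numpy as np


def km_censoring_left_limit(y, delta):
    """Kaplan-Meier estimate of G(y_i-) = P(C >= y_i) for every subject.

    Censoring (delta == 0) is treated as the "event" of interest.
    Assumption: no tied observation times.
    """
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta, dtype=int)
    n = len(y)
    order = np.argsort(y, kind="stable")
    censored_sorted = 1 - delta[order]
    at_risk = n - np.arange(n)                  # subjects with Y >= y_(k)
    factors = 1.0 - censored_sorted / at_risk   # KM factors for censoring
    surv = np.cumprod(factors)                  # G at each sorted time
    surv_left = np.concatenate(([1.0], surv[:-1]))  # left limits G(y_(k)-)
    out = np.empty(n)
    out[order] = surv_left                      # back to the original order
    return out


def ipw_mixture_cdf(t, y, delta, q):
    """IPW least-squares estimate of (F_1(t), ..., F_p(t)).

    y     : observed times min(T, C), shape (n,)
    delta : 1 if the event was observed, 0 if censored, shape (n,)
    q     : known mixture probabilities, shape (n, p) (an assumption)
    """
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta, dtype=float)
    q = np.asarray(q, dtype=float)
    weights = delta / km_censoring_left_limit(y, delta)
    response = weights * (y <= t)               # weighted event indicator
    # Solve the normal equations (q'q) F = q' response.
    return np.linalg.solve(q.T @ q, q.T @ response)


if __name__ == "__main__":
    # Toy data: four subjects, one censored, a single population (p = 1).
    y = np.array([1.0, 2.0, 3.0, 4.0])
    delta = np.array([1, 0, 1, 1])
    q = np.ones((4, 1))
    print(ipw_mixture_cdf(3.0, y, delta, q))
```

With a single population (q equal to a column of ones) this weighting reproduces the usual Kaplan-Meier estimate of the distribution function, which is a convenient sanity check. The augmented (AIPW) and imputation (IMP) estimators mentioned in the abstract are not sketched here, since the abstract does not describe their form.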

Suggested Citation

  • Yuanjia Wang & Tanya P. Garcia & Yanyuan Ma, 2012. "Nonparametric Estimation for Censored Mixture Data With Application to the Cooperative Huntington’s Observational Research Trial," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(500), pages 1324-1338, December.
  • Handle: RePEc:taf:jnlasa:v:107:y:2012:i:500:p:1324-1338
    DOI: 10.1080/01621459.2012.699353




    Cited by:

    1. repec:bla:scjsta:v:44:y:2017:i:1:p:112-129 is not listed on IDEAS
    2. Xie, Shangyu & Wan, Alan T.K. & Zhou, Yong, 2015. "Quantile regression methods with varying-coefficient models for censored data," Computational Statistics & Data Analysis, Elsevier, vol. 88(C), pages 154-172.
    3. Yanyuan Ma & Yuanjia Wang, 2014. "Estimating disease onset distribution functions in mutation carriers with censored mixture data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 63(1), pages 1-23, January.
    4. repec:bla:jorssc:v:66:y:2017:i:4:p:833-846 is not listed on IDEAS
