
Variance estimation in the analysis of microarray data

Authors

  • Yuedong Wang
  • Yanyuan Ma
  • Raymond J. Carroll

Abstract

Summary. Microarrays are one of the most widely used high throughput technologies. One of the main problems in the area is that conventional estimates of the variances that are required in the t‐statistic and other statistics are unreliable owing to the small number of replications. Various methods have been proposed in the literature to overcome this lack of degrees of freedom problem. In this context, it is commonly observed that the variance increases proportionally with the intensity level, which has led many researchers to assume that the variance is a function of the mean. Here we concentrate on estimation of the variance as a function of an unknown mean in two models: the constant coefficient of variation model and the quadratic variance–mean model. Because the means are unknown and estimated with few degrees of freedom, naive methods that use the sample mean in place of the true mean are generally biased because of the errors‐in‐variables phenomenon. We propose three methods for overcoming this bias. The first two are variations on the theme of the so‐called heteroscedastic simulation–extrapolation estimator, modified to estimate the variance function consistently. The third class of estimators is entirely different, being based on semiparametric information calculations. Simulations show the power of our methods and their lack of bias compared with the naive method that ignores the measurement error. The methodology is illustrated by using microarray data from leukaemia patients.
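
The errors-in-variables bias described in the summary can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: normally distributed replicates, a hypothetical uniform range for the true mean intensities, and a simple ratio-of-moments estimator standing in for the "naive method". In the constant coefficient of variation model, Var(Y | X) = theta * X^2, substituting the squared sample mean for X^2 inflates the denominator by a factor of about (1 + theta / m) with m replicates, so the naive estimate is pulled towards zero. The `moment_correct` helper is a plain method-of-moments adjustment included only to show where the bias comes from; it is neither the SIMEX-type estimator nor the semiparametric estimator that the authors propose.

```python
import random
import statistics


def simulate_naive_theta(theta, m, n_genes, seed=0):
    """Naive ratio estimate of theta in the model Var(Y | X) = theta * X**2.

    Each simulated gene has an unknown true mean X and m replicate
    measurements. The naive estimator plugs the sample mean in for X.
    """
    rng = random.Random(seed)
    sum_var = 0.0
    sum_mean_sq = 0.0
    for _ in range(n_genes):
        x = rng.uniform(5.0, 15.0)  # true mean intensity (hypothetical range)
        sd = (theta ** 0.5) * x     # constant coefficient of variation
        y = [rng.gauss(x, sd) for _ in range(m)]
        sum_var += statistics.variance(y)        # unbiased sample variance
        sum_mean_sq += statistics.fmean(y) ** 2  # squared sample mean
    return sum_var / sum_mean_sq


def moment_correct(theta_naive, m):
    """Undo the (1 + theta / m) inflation of the denominator.

    Illustrative method-of-moments fix only, not the paper's estimators.
    """
    return theta_naive / (1.0 - theta_naive / m)


theta_true, m = 0.25, 3
naive = simulate_naive_theta(theta_true, m, n_genes=20000)
corrected = moment_correct(naive, m)
print(f"true={theta_true:.3f} naive={naive:.3f} corrected={corrected:.3f}")
```

With few replicates per gene the naive estimate settles below the true value, and increasing `m` shrinks the gap, which matches the summary's point that the bias stems from estimating the means with few degrees of freedom.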

Suggested Citation

  • Yuedong Wang & Yanyuan Ma & Raymond J. Carroll, 2009. "Variance estimation in the analysis of microarray data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(2), pages 425-445, April.
  • Handle: RePEc:bla:jorssb:v:71:y:2009:i:2:p:425-445
    DOI: 10.1111/j.1467-9868.2008.00690.x

    Download full text from publisher

    File URL: https://doi.org/10.1111/j.1467-9868.2008.00690.x
    Download Restriction: no

    Citations

    Cited by:

    1. Michela Battauz & Ruggero Bellio, 2011. "Structural Modeling of Measurement Error in Generalized Linear Models with Rasch Measures as Covariates," Psychometrika, Springer;The Psychometric Society, vol. 76(1), pages 40-56, January.
    2. Aurore Delaigle & Peter Hall, 2016. "Methodology for non-parametric deconvolution when the error distribution is unknown," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(1), pages 231-252, January.
    3. Yun Fang & Li-Xing Zhu, 2012. "Asymptotics of SIMEX-based variance estimation," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 75(3), pages 329-345, April.
    4. Garcia, Tanya P. & Ma, Yanyuan, 2017. "Simultaneous treatment of unspecified heteroskedastic model error distribution and mismeasured covariates for restricted moment models," Journal of Econometrics, Elsevier, vol. 200(2), pages 194-206.
    5. Dazard, Jean-Eudes & Sunil Rao, J., 2012. "Joint adaptive mean–variance regularization and variance stabilization of high dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 56(7), pages 2317-2333.
    6. Xiao Min & Chen Ting & Ming Ruixing & Huang Kunpeng, 2020. "Optimal Estimation for Power of Variance with Application to Gene-Set Testing," Journal of Systems Science and Information, De Gruyter, vol. 8(6), pages 549-564, December.
    7. Andrew L. Rukhin, 2017. "Estimation of the common mean from heterogeneous normal observations with unknown variances," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(5), pages 1601-1618, November.
    8. J. R. Lockwood & Daniel F. McCaffrey, 2017. "Simulation-Extrapolation with Latent Heteroskedastic Error Variance," Psychometrika, Springer;The Psychometric Society, vol. 82(3), pages 717-736, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yanyuan Ma & Marc G. Genton, 2010. "Explicit estimating equations for semiparametric generalized linear latent variable models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(4), pages 475-495, September.
    2. Li, Mengyan & Ma, Yanyuan & Li, Runze, 2019. "Semiparametric regression for measurement error model with heteroscedastic error," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 320-338.
    3. Yanyuan Ma & Jeffrey D. Hart & Ryan Janicki & Raymond J. Carroll, 2011. "Local and omnibus goodness‐of‐fit tests in classical measurement error models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 73(1), pages 81-98, January.
    4. Mijeong Kim & Yanyuan Ma, 2012. "The efficiency of the second-order nonlinear least squares estimator and its extension," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 64(4), pages 751-764, August.
    5. Bo Zhang & Eric J. Tchetgen Tchetgen, 2022. "A semi‐parametric approach to model‐based sensitivity analysis in observational studies," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(S2), pages 668-691, December.
    6. Xiaohong Chen & Andres Santos, 2018. "Overidentification in Regular Models," Econometrica, Econometric Society, vol. 86(5), pages 1771-1817, September.
    7. Ichimura, Hidehiko & Todd, Petra E., 2007. "Implementing Nonparametric and Semiparametric Estimators," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 74, Elsevier.
    8. Sant’Anna, Pedro H.C. & Zhao, Jun, 2020. "Doubly robust difference-in-differences estimators," Journal of Econometrics, Elsevier, vol. 219(1), pages 101-122.
    9. Inanoglu, Hulusi & Jacobs, Michael, Jr. & Liu, Junrong & Sickles, Robin, 2015. "Analyzing Bank Efficiency: Are "Too-Big-to-Fail" Banks Efficient?," Working Papers 15-016, Rice University, Department of Economics.
    10. Parrish, Rudolph S. & Spencer III, Horace J. & Xu, Ping, 2009. "Distribution modeling and simulation of gene expression data," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1650-1660, March.
    11. Sung Jae Jun & Sokbae Lee, 2020. "Causal Inference under Outcome-Based Sampling with Monotonicity Assumptions," Papers 2004.08318, arXiv.org, revised Oct 2023.
    12. Li-Xuan Qin & Steven G. Self, 2006. "The Clustering of Regression Models Method with Applications in Gene Expression Data," Biometrics, The International Biometric Society, vol. 62(2), pages 526-533, June.
    13. Linton, Oliver, 1995. "Second Order Approximation in the Partially Linear Regression Model," Econometrica, Econometric Society, vol. 63(5), pages 1079-1112, September.
    14. Oliver Linton & Pedro Gozalo, 1996. "Conditional Independence Restrictions: Testing and Estimation," Cowles Foundation Discussion Papers 1140, Cowles Foundation for Research in Economics, Yale University.
    15. Chen, Xiaohong & Pouzo, Demian, 2009. "Efficient estimation of semiparametric conditional moment models with possibly nonsmooth residuals," Journal of Econometrics, Elsevier, vol. 152(1), pages 46-60, September.
    16. Lauber, Verena & Thomas, Lampert, 2014. "The Effect of Early Universal Daycare on Child Weight Problems," VfS Annual Conference 2014 (Hamburg): Evidence-based Economic Policy 100399, Verein für Socialpolitik / German Economic Association.
    17. Alberto Abadie, 2000. "Semiparametric Estimation of Instrumental Variable Models for Causal Effects," NBER Technical Working Papers 0260, National Bureau of Economic Research, Inc.
    18. Waverly Wei & Maya Petersen & Mark J van der Laan & Zeyu Zheng & Chong Wu & Jingshen Wang, 2023. "Efficient targeted learning of heterogeneous treatment effects for multiple subgroups," Biometrics, The International Biometric Society, vol. 79(3), pages 1934-1946, September.
    19. Timothy B. Armstrong & Michal Kolesár, 2021. "Sensitivity analysis using approximate moment condition models," Quantitative Economics, Econometric Society, vol. 12(1), pages 77-108, January.
    20. Drost, F.C. & Klaasens, C.A.J. & Werker, B.J.M., 1994. "Adaptive Estimation in Time Series Models," Papers 9488, Tilburg - Center for Economic Research.
