Christina Kendziorski (Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison)
Rafael Irizarry (Johns Hopkins Bloomberg School of Public Health)
K. Chen (McArdle Laboratory for Cancer Research, University of Wisconsin-Madison)
J.D. Haag (McArdle Laboratory for Cancer Research, University of Wisconsin-Madison)
M.N. Gould (McArdle Laboratory for Cancer Research, University of Wisconsin-Madison)
Abstract
Over 10% of the data sets catalogued in the Gene Expression Omnibus Database involve messenger RNA samples that have been pooled prior to hybridization. Pooling affects data quality and inference, but the exact effects are not yet known, as pooling has not been systematically studied in the context of microarray experiments. Here we report the results of an experiment designed to evaluate the utility of pooling and its impact on identifying differentially expressed genes. We find that inference for most genes is not adversely affected by pooling, and we recommend that pooling be done when fewer than three arrays are used in each condition. For larger designs, pooling does not significantly improve inferences if few subjects are pooled; the realized benefits in this case do not outweigh the price paid for the loss of individual-specific information. Pooling is beneficial when many subjects are pooled, provided independent samples contribute to multiple pools.