
Query Large Scale Microarray Compendium Datasets Using a Model-Based Bayesian Approach with Variable Selection

Author

Listed:
  • Ming Hu
  • Zhaohui S Qin

Abstract

In microarray gene expression data analysis, it is often of interest to identify genes that share similar expression profiles with a particular gene such as a key regulatory protein. Multiple studies have been conducted using various correlation measures to identify co-expressed genes. While working well for small datasets, the heterogeneity introduced from increased sample size inevitably reduces the sensitivity and specificity of these approaches. This is because most co-expression relationships do not extend to all experimental conditions. With the rapid increase in the size of microarray datasets, identifying functionally related genes from large and diverse microarray gene expression datasets is a key challenge. We develop a model-based gene expression query algorithm built under the Bayesian model selection framework. It is capable of detecting co-expression profiles under a subset of samples/experimental conditions. In addition, it allows linearly transformed expression patterns to be recognized and is robust against sporadic outliers in the data. Both features are critically important for increasing the power of identifying co-expressed genes in large scale gene expression datasets. Our simulation studies suggest that this method outperforms existing correlation coefficients or mutual information-based query tools. When we apply this new method to the Escherichia coli microarray compendium data, it identifies a majority of known regulons as well as novel potential target genes of numerous key transcription factors.
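
The motivation stated in the abstract, that a co-expression relationship confined to a subset of conditions is diluted when correlation is computed over the whole compendium, can be illustrated with a small sketch. This is not the authors' Bayesian model: it uses only the ordinary Pearson coefficient on synthetic data, and the compendium size, the subset size, the linear transform (2x + 1), and the noise levels are all assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(0)   # fixed seed so the run is repeatable
n_conditions = 200       # hypothetical compendium size (assumption)
n_coexpressed = 40       # conditions where the relationship holds (assumption)

# Synthetic query gene profile across all conditions.
query = [rng.gauss(0.0, 1.0) for _ in range(n_conditions)]

# Synthetic target gene: a linear transform of the query (2x + 1, plus small
# noise) in the first 40 conditions, unrelated noise in the remaining 160.
target = [
    2.0 * q + 1.0 + rng.gauss(0.0, 0.1) if i < n_coexpressed else rng.gauss(0.0, 1.0)
    for i, q in enumerate(query)
]

r_all = pearson(query, target)
r_subset = pearson(query[:n_coexpressed], target[:n_coexpressed])

print(f"Pearson r, all {n_conditions} conditions: {r_all:.3f}")
print(f"Pearson r, co-expressed subset only:  {r_subset:.3f}")
```

In this sketch the co-expressed subset is known in advance, so the second coefficient is easy to compute. In a real query the subset is unknown, and identifying it is the task the abstract assigns to Bayesian variable selection; the sketch does not attempt that step.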

Suggested Citation

  • Ming Hu & Zhaohui S Qin, 2009. "Query Large Scale Microarray Compendium Datasets Using a Model-Based Bayesian Approach with Variable Selection," PLOS ONE, Public Library of Science, vol. 4(2), pages 1-11, February.
  • Handle: RePEc:plo:pone00:0004495
    DOI: 10.1371/journal.pone.0004495

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0004495
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0004495&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0004495?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Jeremiah J Faith & Boris Hayete & Joshua T Thaden & Ilaria Mogno & Jamey Wierzbowski & Guillaume Cottarel & Simon Kasif & James J Collins & Timothy S Gardner, 2007. "Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles," PLOS Biology, Public Library of Science, vol. 5(1), pages 1-13, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hossein Zare & Mostafa Kaveh & Arkady Khodursky, 2011. "Inferring a Transcriptional Regulatory Network from Gene Expression Data Using Nonlinear Manifold Embedding," PLOS ONE, Public Library of Science, vol. 6(8), pages 1-7, August.
    2. Diambra, L., 2011. "Coarse-grain reconstruction of genetic networks from expression levels," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 390(11), pages 2198-2207.
    3. Marco Grimaldi & Roberto Visintainer & Giuseppe Jurman, 2011. "RegnANN: Reverse Engineering Gene Networks Using Artificial Neural Networks," PLOS ONE, Public Library of Science, vol. 6(12), pages 1-19, December.
    4. Ruonan Wu & Michelle R. Davison & William C. Nelson & Montana L. Smith & Mary S. Lipton & Janet K. Jansson & Ryan S. McClure & Jason E. McDermott & Kirsten S. Hofmockel, 2023. "Hi-C metagenome sequencing reveals soil phage–host interactions," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    5. repec:jss:jstsof:37:i01 is not listed on IDEAS
    6. Joeri Ruyssinck & Vân Anh Huynh-Thu & Pierre Geurts & Tom Dhaene & Piet Demeester & Yvan Saeys, 2014. "NIMEFI: Gene Regulatory Network Inference using Multiple Ensemble Feature Importance Algorithms," PLOS ONE, Public Library of Science, vol. 9(3), pages 1-13, March.
    7. Tom Wilderjans & Dirk Depril & Iven Van Mechelen, 2013. "Additive Biclustering: A Comparison of One New and Two Existing ALS Algorithms," Journal of Classification, Springer;The Classification Society, vol. 30(1), pages 56-74, April.
    8. Shuhei Kimura & Masanao Sato & Mariko Okada-Hatakeyama, 2013. "Inference of Vohradský's Models of Genetic Networks by Solving Two-Dimensional Function Optimization Problems," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-11, December.
    9. Xiaomeng Zhang & Bin Shao & Yangle Wu & Ouyang Qi, 2013. "A Reverse Engineering Approach to Optimize Experiments for the Construction of Biological Regulatory Networks," PLOS ONE, Public Library of Science, vol. 8(9), pages 1-9, September.
    10. Takanori Hasegawa & Rui Yamaguchi & Masao Nagasaki & Satoru Miyano & Seiya Imoto, 2014. "Inference of Gene Regulatory Networks Incorporating Multi-Source Biological Knowledge via a State Space Model with L1 Regularization," PLOS ONE, Public Library of Science, vol. 9(8), pages 1-19, August.
    11. Kannan Venkateshan & Tegner Jesper, 2016. "Adaptive input data transformation for improved network reconstruction with information theoretic algorithms," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 15(6), pages 507-520, December.
    12. Fei Liu & Shao-Wu Zhang & Wei-Feng Guo & Ze-Gang Wei & Luonan Chen, 2016. "Inference of Gene Regulatory Network Based on Local Bayesian Networks," PLOS Computational Biology, Public Library of Science, vol. 12(8), pages 1-17, August.
    13. Hirose, Kei & Fujisawa, Hironori & Sese, Jun, 2017. "Robust sparse Gaussian graphical modeling," Journal of Multivariate Analysis, Elsevier, vol. 161(C), pages 172-190.
    14. Benafsh Husain & F Alex Feltus, 2019. "EdgeScaping: Mapping the spatial distribution of pairwise gene expression intensities," PLOS ONE, Public Library of Science, vol. 14(8), pages 1-15, August.
    15. Zhen Yang & Yen‐Yi Ho, 2022. "Modeling dynamic correlation in zero‐inflated bivariate count data with applications to single‐cell RNA sequencing data," Biometrics, The International Biometric Society, vol. 78(2), pages 766-776, June.
    16. Mingyi Wang & Jerome Verdier & Vagner A Benedito & Yuhong Tang & Jeremy D Murray & Yinbing Ge & Jörg D Becker & Helena Carvalho & Christian Rogers & Michael Udvardi & Ji He, 2013. "LegumeGRN: A Gene Regulatory Network Prediction Server for Functional and Comparative Studies," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-7, July.
    17. Scott Christley & Qing Nie & Xiaohui Xie, 2009. "Incorporating Existing Network Information into Gene Network Inference," PLOS ONE, Public Library of Science, vol. 4(8), pages 1-13, August.
    18. Maghsoodi, Masoume, 2016. "A New Method to Build Gene Regulation Network Based on Fuzzy Hierarchical Clustering Methods," MPRA Paper 79743, University Library of Munich, Germany.
    19. Ambroise Jérôme & Robert Annie & Macq Benoit & Gala Jean-Luc, 2012. "Transcriptional Network Inference from Functional Similarity and Expression Data: A Global Supervised Approach," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(1), pages 1-24, January.
    20. Shiori Sagawa & Morgan N Price & Adam M Deutschbauer & Adam P Arkin, 2017. "Validating regulatory predictions from diverse bacteria with mutant fitness data," PLOS ONE, Public Library of Science, vol. 12(5), pages 1-14, May.
    21. Guibo Ye & Mengfan Tang & Jian-Feng Cai & Qing Nie & Xiaohui Xie, 2013. "Low-Rank Regularization for Learning Gene Expression Programs," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-9, December.
