
Genetic Co-Occurrence Network across Sequenced Microbes

Author

Listed:
  • Pan-Jun Kim
  • Nathan D Price

Abstract

The phenotype of any organism on earth is, in large part, the consequence of interplay between numerous gene products encoded in the genome, and this interplay affects the evolutionary fate of the genome itself through the resulting phenotype. In this regard, contemporary genomes can be used as molecular records that reveal associations among genes as they function in the organisms' natural lifestyles. By analyzing thousands of orthologs across ∼600 bacterial species, we constructed a map of gene-gene co-occurrence across much of the sequenced biome. Genes that preferentially co-occur in the same organisms are termed correlogs here; genes that tend not to co-occur are termed anti-correlogs. To quantify correlogy and anti-correlogy, we reduced the contribution of indirect correlations between genes by adapting ideas developed for reverse engineering of transcriptional regulatory networks. The resulting correlogous associations are highly enriched for physically interacting proteins and for co-expressed transcripts, clearly differentiating a subgroup of functionally obligatory protein interactions from conditional or transient interactions. Other biochemical and phylogenetic properties were also found to be reflected in correlogous and anti-correlogous relationships. Additionally, our study elucidates the global organization of the gene association map, in which various modules of correlogous genes are strikingly interconnected by anti-correlogous crosstalk between the modules. We then demonstrate the effectiveness of such associations across different domains of life and environmental microbial communities. These phylogenetic profiling approaches infer functional coupling of genes regardless of mechanistic details, and may be useful to guide exogenous gene import in synthetic biology.

Author Summary: Genes in organisms have a number of interactions with one another in their biological contexts. For example, proteins produced from one gene may interact with proteins produced from another gene to jointly perform a particular biological task, and such pairs of cooperative genes may often reside together in the same organisms. We analyzed thousands of genes across ∼600 bacterial species, and found genes with favored co-occurrence in the same organisms (termed correlogs) or disfavored co-occurrence (termed anti-correlogs). These co-occurrence patterns significantly reflect actual biochemical interplay between genes, and distinct cliques of correlogous genes are seamlessly interrelated through anti-correlogous links between the cliques. The ‘sociology’ of genes inferred by this approach provides useful information on how to engineer a cell, such as for production of a desired byproduct. For example, an important gene in cellobiose digestion for biofuel production, bglB, is suggested to function better in a cell factory when co-activated with another gene, rhaM, the correlogous partner we found in our analysis.
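
The idea of separating direct from indirect co-occurrence can be illustrated with a small sketch. The abstract does not spell out the exact procedure, so the code below is an assumption-laden stand-in: it scores co-occurrence with a plain Pearson correlation over presence/absence profiles and removes the influence of a third gene with the textbook first-order partial correlation. The gene profiles (`gene_a` to `gene_d`) and the helper names `pearson` and `partial_corr` are hypothetical and made up for illustration; they are not data or functions from the paper.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length presence/absence profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical profiles over 16 genomes: 1 = ortholog present, 0 = absent.
# gene_a and gene_c each track gene_b, but are not tied to each other directly.
gene_b = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
gene_a = [1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
gene_c = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0]
# gene_d is present exactly where gene_b is absent (an extreme anti-correlation).
gene_d = [1 - v for v in gene_b]

print(pearson(gene_a, gene_b))               # 0.5  (co-occurrence favored)
print(pearson(gene_a, gene_c))               # 0.25 (looks like a weak association)
print(partial_corr(gene_a, gene_c, gene_b))  # 0.0  (vanishes once gene_b is controlled for)
print(pearson(gene_b, gene_d))               # -1.0 (co-occurrence disfavored)
```

In this toy case the raw correlation between `gene_a` and `gene_c` is fully explained by their shared association with `gene_b`, which is the kind of indirect link the abstract describes discounting. A genome-scale analysis would need to control for many genes at once and would likely require a regularized estimator, which this sketch does not attempt.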

Suggested Citation

  • Pan-Jun Kim & Nathan D Price, 2011. "Genetic Co-Occurrence Network across Sequenced Microbes," PLOS Computational Biology, Public Library of Science, vol. 7(12), pages 1-9, December.
  • Handle: RePEc:plo:pcbi00:1002340
    DOI: 10.1371/journal.pcbi.1002340

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002340
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1002340&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1002340?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Citations


    Cited by:

    1. Wade, M.J. & Harmand, J. & Benyahia, B. & Bouchez, T. & Chaillou, S. & Cloez, B. & Godon, J.-J. & Moussa Boudjemaa, B. & Rapaport, A. & Sari, T. & Arditi, R. & Lobry, C., 2016. "Perspectives in mathematical modelling for microbial ecology," Ecological Modelling, Elsevier, vol. 321(C), pages 64-74.

