
SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks

Author

Listed:
  • Rong Zhang
  • Zhao Ren
  • Wei Chen

Abstract

Gene co-expression network analysis is extremely useful for interpreting complex biological processes. Recent droplet-based single-cell technology routinely generates much larger gene expression data sets, with thousands of samples and tens of thousands of genes. To analyze such large-scale gene-gene networks, remarkable progress has been made in rigorous statistical inference for the high-dimensional Gaussian graphical model (GGM). These approaches provide a formal confidence interval or a p-value, rather than only a single point estimate, for the conditional dependence of a gene pair, and are therefore more desirable for identifying reliable gene networks. To promote their widespread use, we introduce an extensive and efficient R package named SILGGM (Statistical Inference of Large-scale Gaussian Graphical Model) that includes four main approaches to statistical inference for high-dimensional GGMs. Unlike existing tools, SILGGM provides statistically efficient inference on both individual gene pairs and the full set of gene pairs. It offers a novel and consistent false discovery rate (FDR) procedure across all four methodologies. With a user-friendly design, it provides outputs compatible with multiple platforms for interactive network visualization. Furthermore, comparisons in simulation illustrate that SILGGM is several orders of magnitude faster than the existing MATLAB implementation and further improves on the speed of the already very efficient R package FastGGM. Results on simulated data confirm the validity of all the approaches in SILGGM, even in very large-scale settings where the number of variables (genes) is on the order of ten thousand. We have also applied the package to a novel single-cell RNA-seq data set of pan T cells. The results show that the approaches in SILGGM significantly outperform conventional ones in a biological sense. The package is freely available via CRAN at https://cran.r-project.org/package=SILGGM.
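
To make "conditional dependence of a gene pair" and FDR-controlled edge selection concrete, the following is a minimal sketch in Python. It is not SILGGM's code and not one of its four approaches: it assumes a low-dimensional setting (more samples than genes), estimates partial correlations from the inverse sample covariance, tests each pair with the classical Fisher z-transform, and applies the standard Benjamini-Hochberg procedure in place of the package's own FDR procedure. All function names (`chain_precision`, `partial_correlations`, `edge_p_values`, `benjamini_hochberg`) and the simulated chain network are hypothetical and introduced only for illustration.

```python
import math
import numpy as np


def chain_precision(p, diag=2.0, off=-0.8):
    """Hypothetical test network: tridiagonal precision matrix (a chain graph)."""
    omega = np.eye(p) * diag
    for i in range(p - 1):
        omega[i, i + 1] = omega[i + 1, i] = off
    return omega


def partial_correlations(x):
    """Partial correlations from the inverse sample covariance (requires n > p)."""
    omega = np.linalg.inv(np.cov(x, rowvar=False))
    d = np.sqrt(np.diag(omega))
    rho = -omega / np.outer(d, d)  # rho_ij = -omega_ij / sqrt(omega_ii * omega_jj)
    np.fill_diagonal(rho, 1.0)
    return rho


def edge_p_values(rho, n):
    """Two-sided p-values per gene pair via the Fisher z-transform.

    Each partial correlation conditions on the other p - 2 variables,
    so the z statistic is scaled by sqrt(n - (p - 2) - 3).
    """
    p = rho.shape[0]
    pairs = np.triu_indices(p, k=1)
    r = np.clip(rho[pairs], -0.999999, 0.999999)
    z = np.arctanh(r) * math.sqrt(n - (p - 2) - 3)
    pvals = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return pairs, pvals


def benjamini_hochberg(pvals, alpha=0.05):
    """Standard Benjamini-Hochberg step-up procedure; returns a rejection mask."""
    m = len(pvals)
    order = np.argsort(pvals)
    passed = pvals[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest rank meeting its threshold
        reject[order[: k + 1]] = True
    return reject


if __name__ == "__main__":
    p, n = 5, 2000
    rng = np.random.default_rng(0)
    sigma = np.linalg.inv(chain_precision(p))
    x = rng.multivariate_normal(np.zeros(p), sigma, size=n)

    rho = partial_correlations(x)
    (rows, cols), pvals = edge_p_values(rho, n)
    reject = benjamini_hochberg(pvals, alpha=0.05)
    for i, j, pv, keep in zip(rows, cols, pvals, reject):
        print(f"genes {i}-{j}: partial corr = {rho[i, j]: .3f}, "
              f"p = {pv:.3g}, edge = {bool(keep)}")
```

Inverting the sample covariance breaks down once the number of genes approaches or exceeds the number of samples, which is the high-dimensional regime the abstract describes; the sketch is meant only to show what a per-pair p-value and an FDR-thresholded edge list look like.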

Suggested Citation

  • Rong Zhang & Zhao Ren & Wei Chen, 2018. "SILGGM: An extensive R package for efficient statistical inference in large-scale gene networks," PLOS Computational Biology, Public Library of Science, vol. 14(8), pages 1-14, August.
  • Handle: RePEc:plo:pcbi00:1006369
    DOI: 10.1371/journal.pcbi.1006369

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006369
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1006369&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Andreas Gerasch & Daniel Faber & Jan Küntzer & Peter Niermann & Oliver Kohlbacher & Hans-Peter Lenhof & Michael Kaufmann, 2014. "BiNA: A Visual Analytics Tool for Biological Network Data," PLOS ONE, Public Library of Science, vol. 9(2), pages 1-11, February.
    2. Bochao Jia & Suwa Xu & Guanghua Xiao & Vishal Lamba & Faming Liang, 2017. "Learning gene regulatory networks from next generation sequencing data," Biometrics, The International Biometric Society, vol. 73(4), pages 1221-1230, December.
    3. Jana Janková & Sara Geer, 2017. "Honest confidence regions and optimality in high-dimensional precision matrix estimation," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 26(1), pages 143-162, March.
    4. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    5. Ming Yuan & Yi Lin, 2007. "Model selection and estimation in the Gaussian graphical model," Biometrika, Biometrika Trust, vol. 94(1), pages 19-35.
    6. Ting Wang & Zhao Ren & Ying Ding & Zhou Fang & Zhe Sun & Matthew L MacDonald & Robert A Sweet & Jieru Wang & Wei Chen, 2016. "FastGGM: An Efficient Algorithm for the Inference of Gaussian Graphical Model in Biological Networks," PLOS Computational Biology, Public Library of Science, vol. 12(2), pages 1-16, February.

    Citations



    Cited by:

    1. Zhou, Jia & Li, Yang & Zheng, Zemin & Li, Daoji, 2022. "Reproducible learning in large-scale graphical models," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    2. Shilu Zhang & Saptarshi Pyne & Stefan Pietrzak & Spencer Halberg & Sunnie Grace McCalla & Alireza Fotuhi Siahpirani & Rupa Sridharan & Sushmita Roy, 2023. "Inference of cell type-specific gene regulatory networks on cell lineages from single cell omic datasets," Nature Communications, Nature, vol. 14(1), pages 1-25, December.
    3. Jinzhou Li & Marloes H. Maathuis, 2021. "GGM knockoff filter: False discovery rate control for Gaussian graphical models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(3), pages 534-558, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dong Liu & Changwei Zhao & Yong He & Lei Liu & Ying Guo & Xinsheng Zhang, 2023. "Simultaneous cluster structure learning and estimation of heterogeneous graphs for matrix‐variate fMRI data," Biometrics, The International Biometric Society, vol. 79(3), pages 2246-2259, September.
    2. Byol Kim & Song Liu & Mladen Kolar, 2021. "Two‐sample inference for high‐dimensional Markov networks," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(5), pages 939-962, November.
    3. S Klaassen & J Kueck & M Spindler & V Chernozhukov, 2023. "Uniform inference in high-dimensional Gaussian graphical models," Biometrika, Biometrika Trust, vol. 110(1), pages 51-68.
    4. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    5. Laurenţiu Cătălin Hinoveanu & Fabrizio Leisen & Cristiano Villa, 2020. "A loss‐based prior for Gaussian graphical models," Australian & New Zealand Journal of Statistics, Australian Statistical Publishing Association Inc., vol. 62(4), pages 444-466, December.
    6. Xiao Guo & Hai Zhang, 2020. "Sparse directed acyclic graphs incorporating the covariates," Statistical Papers, Springer, vol. 61(5), pages 2119-2148, October.
    7. Yang, Yuehan & Xia, Siwei & Yang, Hu, 2023. "Multivariate sparse Laplacian shrinkage for joint estimation of two graphical structures," Computational Statistics & Data Analysis, Elsevier, vol. 178(C).
    8. Rieser, Christopher & Filzmoser, Peter, 2023. "Extending compositional data analysis from a graph signal processing perspective," Journal of Multivariate Analysis, Elsevier, vol. 198(C).
    9. Murat Genç, 2022. "A new double-regularized regression using Liu and lasso regularization," Computational Statistics, Springer, vol. 37(1), pages 159-227, March.
    10. Runmin Shi & Faming Liang & Qifan Song & Ye Luo & Malay Ghosh, 2018. "A Blockwise Consistency Method for Parameter Estimation of Complex Models," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 80(1), pages 179-223, December.
    11. Siwei Xia & Yuehan Yang & Hu Yang, 2022. "Sparse Laplacian Shrinkage with the Graphical Lasso Estimator for Regression Problems," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 255-277, March.
    12. Ning Zhang & Jin Yang, 2023. "Sparse precision matrix estimation with missing observations," Computational Statistics, Springer, vol. 38(3), pages 1337-1355, September.
    13. Pan, Yuqing & Mai, Qing, 2020. "Efficient computation for differential network analysis with applications to quadratic discriminant analysis," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    14. Zhang Haixiang & Zheng Yinan & Zhang Zhou & Gao Tao & Joyce Brian & Zhang Wei & Hou Lifang & Liu Lei & Yoon Grace & Schwartz Joel & Vokonas Pantel & Colicino Elena & Baccarelli Andrea, 2017. "Regularized estimation in sparse high-dimensional multivariate regression, with application to a DNA methylation study," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 16(3), pages 159-171, August.
    15. Liu, Weidong & Luo, Xi, 2015. "Fast and adaptive sparse precision matrix estimation in high dimensions," Journal of Multivariate Analysis, Elsevier, vol. 135(C), pages 153-162.
    16. Laura Freijeiro‐González & Manuel Febrero‐Bande & Wenceslao González‐Manteiga, 2022. "A Critical Review of LASSO and Its Derivatives for Variable Selection Under Dependence Among Covariates," International Statistical Review, International Statistical Institute, vol. 90(1), pages 118-145, April.
    17. Ting Wang & Zhao Ren & Ying Ding & Zhou Fang & Zhe Sun & Matthew L MacDonald & Robert A Sweet & Jieru Wang & Wei Chen, 2016. "FastGGM: An Efficient Algorithm for the Inference of Gaussian Graphical Model in Biological Networks," PLOS Computational Biology, Public Library of Science, vol. 12(2), pages 1-16, February.
    18. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    19. Ernesto Carrella & Richard M. Bailey & Jens Koed Madsen, 2018. "Indirect inference through prediction," Papers 1807.01579, arXiv.org.
    20. Rui Wang & Naihua Xiu & Kim-Chuan Toh, 2021. "Subspace quadratic regularization method for group sparse multinomial logistic regression," Computational Optimization and Applications, Springer, vol. 79(3), pages 531-559, July.
