
SEMdag: Fast learning of Directed Acyclic Graphs via node or layer ordering

Author

Listed:
  • Mario Grassi
  • Barbara Tarantino

Abstract

A Directed Acyclic Graph (DAG) offers a simple way to define causal structure among a set of variables: causal links are represented by arrows between the variables, leading from cause to effect. Recently, both industry and academia have paid close attention to DAG structure learning from observational data, and many techniques have been proposed to address the problem. We present a two-step approach, named SEMdag(), for quickly learning high-dimensional linear structural equation models (SEMs). It is included in the R package SEMgraph and performs a two-stage order-based search that uses either prior knowledge (knowledge-based, KB) or a data-driven method (bottom-up, BU), under the assumption of a linear SEM with equal-variance error terms. We evaluated our framework's ability to find plausible DAGs against six well-known causal discovery techniques (ARGES, GES, PC, LiNGAM, CAM, NOTEARS). We conducted a series of experiments on observed expression (or RNA-seq) data, using a pair of training and testing datasets for each of four distinct diseases: Amyotrophic Lateral Sclerosis (ALS), Breast cancer (BRCA), Coronavirus disease (COVID-19) and ST-elevation myocardial infarction (STEMI). The results show that the SEMdag() procedure can recover a graph structure with good disease prediction performance, as evaluated by a conventional supervised learning algorithm (random forest, RF). When the initial graph is sparse, the BU approach may be a better choice than the KB one; when the graph is denser, both BU and KB report high performance, with the highest score for the KB approach based on topological layers. Besides its superior disease prediction performance compared to previous research, SEMdag() gives the user the flexibility to choose among distinct structure learning algorithms and can handle high-dimensional problems with a lower computational load. The SEMdag() function is implemented in the R package SEMgraph, available at https://CRAN.R-project.org/package=SEMgraph.
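
The data-driven (bottom-up) ordering mentioned in the abstract can be illustrated with a small sketch. This is not the SEMdag() implementation. It only shows the general idea behind bottom-up ordering under the equal-variance assumption: a node with no children has the smallest diagonal entry of the precision (inverse covariance) matrix, so it can be placed last in the order, removed, and the step repeated. The function names `invert` and `bottom_up_order` and the toy covariance matrix are hypothetical, chosen for illustration, and the sketch is written in Python rather than R.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment with the identity: [A | I] is reduced to [I | A^-1].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def bottom_up_order(cov, names):
    """Return a causal order (causes first) for an equal-variance linear SEM.

    Assumption for this sketch: cov is the covariance matrix of the variables.
    At each step the node with the smallest precision-matrix diagonal is taken
    as a terminal (childless) node among the remaining variables.
    """
    remaining = list(range(len(names)))
    sinks_first = []
    while remaining:
        sub = [[cov[i][j] for j in remaining] for i in remaining]
        precision = invert(sub)
        k = min(range(len(remaining)), key=lambda t: precision[t][t])
        sinks_first.append(names[remaining.pop(k)])
    return sinks_first[::-1]


# Toy example: chain a -> b -> c with unit coefficients and unit error variances,
# so Var(a)=1, Var(b)=2, Var(c)=3, Cov(a,b)=1, Cov(a,c)=1, Cov(b,c)=2.
# Variables are deliberately listed out of causal order.
names = ["c", "a", "b"]
cov = [
    [3, 1, 2],
    [1, 1, 1],
    [2, 1, 2],
]
order = bottom_up_order(cov, names)
print(order)  # ['a', 'b', 'c']
```

In an order-based search, a second stage would then select each node's parents from among its predecessors in the recovered order; that stage is not shown here.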

Suggested Citation

  • Mario Grassi & Barbara Tarantino, 2025. "SEMdag: Fast learning of Directed Acyclic Graphs via node or layer ordering," PLOS ONE, Public Library of Science, vol. 20(1), pages 1-24, January.
  • Handle: RePEc:plo:pone00:0317283
    DOI: 10.1371/journal.pone.0317283

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0317283
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0317283&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. J. Peters & P. Bühlmann, 2014. "Identifiability of Gaussian structural equation models with equal error variances," Biometrika, Biometrika Trust, vol. 101(1), pages 219-228.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fangting Zhou & Kejun He & Yang Ni, 2023. "Individualized causal discovery with latent trajectory embedded Bayesian networks," Biometrics, The International Biometric Society, vol. 79(4), pages 3191-3202, December.
    2. Luo, Lan & Shi, Chengchun & Wang, Jitao & Wu, Zhenke & Li, Lexin, 2025. "Multivariate dynamic mediation analysis under a reinforcement learning framework," LSE Research Online Documents on Economics 127112, London School of Economics and Political Science, LSE Library.
    3. Federico Castelletti & Guido Consonni, 2021. "Bayesian inference of causal effects from observational data in Gaussian graphical models," Biometrics, The International Biometric Society, vol. 77(1), pages 136-149, March.
    4. Fangting Zhou & Kejun He & Kunbo Wang & Yanxun Xu & Yang Ni, 2023. "Functional Bayesian networks for discovering causality from multivariate functional data," Biometrics, The International Biometric Society, vol. 79(4), pages 3279-3293, December.
    5. Xiao Guo & Hai Zhang, 2020. "Sparse directed acyclic graphs incorporating the covariates," Statistical Papers, Springer, vol. 61(5), pages 2119-2148, October.
    6. C Schultheiss & P Bühlmann, 2023. "Ancestor regression in linear structural equation models," Biometrika, Biometrika Trust, vol. 110(4), pages 1117-1124.
    7. Castelletti, Federico & Peluso, Stefano, 2021. "Equivalence class selection of categorical graphical models," Computational Statistics & Data Analysis, Elsevier, vol. 164(C).
    8. Park, Gunwoong & Kim, Yesool, 2021. "Learning high-dimensional Gaussian linear structural equation models with heterogeneous error variances," Computational Statistics & Data Analysis, Elsevier, vol. 154(C).
    9. Li, Lexin & Shi, Chengchun & Guo, Tengfei & Jagust, William J., 2022. "Sequential pathway inference for multimodal neuroimaging analysis," LSE Research Online Documents on Economics 111904, London School of Economics and Political Science, LSE Library.
    10. Wang, Bingling & Zhou, Qing, 2021. "Causal network learning with non-invertible functional relationships," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    11. Nikolaos Petrakis & Stefano Peluso & Dimitris Fouskakis & Guido Consonni, 2020. "Objective methods for graphical structural learning," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 74(3), pages 420-438, August.
    12. Choi, Semin & Kim, Yesool & Park, Gunwoong, 2023. "Densely connected sub-Gaussian linear structural equation model learning via ℓ1- and ℓ2-regularized regressions," Computational Statistics & Data Analysis, Elsevier, vol. 181(C).
    13. Ying Zhou, 2025. "Causal Inference with Secondary Outcomes," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 17(1), pages 3-16, April.
    14. Jonas Peters & Peter Bühlmann & Nicolai Meinshausen, 2016. "Causal inference by using invariant prediction: identification and confidence intervals," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(5), pages 947-1012, November.
    15. Federico Castelletti & Guido Consonni, 2020. "Discovering causal structures in Bayesian Gaussian directed acyclic graph models," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(4), pages 1727-1745, October.
    16. Shi, Chengchun & Li, Lexin, 2022. "Testing mediation effects using logic of Boolean matrices," LSE Research Online Documents on Economics 108881, London School of Economics and Political Science, LSE Library.
    17. Federico Castelletti, 2020. "Bayesian Model Selection of Gaussian Directed Acyclic Graph Structures," International Statistical Review, International Statistical Institute, vol. 88(3), pages 752-775, December.
    18. Aapo Hyvärinen & Ilyes Khemakhem & Ricardo Monti, 2024. "Identifiability of latent-variable and structural-equation models: from linear to nonlinear," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 76(1), pages 1-33, February.
