
AWGE-ESPCA: An edge sparse PCA model based on adaptive noise elimination regularization and weighted gene network for Hermetia illucens genomic data analysis

Author

Listed:
  • Rui Miao
  • Hao-Yang Yu
  • Bing-Jie Zhong
  • Hong-Xia Sun
  • Qiang Xia

Abstract

Hermetia illucens is an important insect resource. Studies have shown that exploring how Cu2+ stress affects the growth and development of Hermetia illucens at the genome level holds significant scientific importance. Current studies of Hermetia illucens genomic data face three major challenges. First, the lack of available genomic data limits what researchers can analyze. Second, to the best of our knowledge, no Artificial Intelligence (AI) feature selection model has been designed specifically for the Hermetia illucens genome, and noise is a more serious problem in Hermetia illucens data than in human genomic data. Third, it is unclear how to select the genes located in pathway enrichment regions: existing models assume that every gene probe has the same prior weight, whereas researchers usually pay more attention to gene probes in pathway enrichment regions. To address these challenges, we first design experiments and establish a new Cu2+-stressed Hermetia illucens growth genome dataset. We then propose AWGE-ESPCA, an edge Sparse PCA model based on adaptive noise elimination regularization and a weighted gene network. The AWGE-ESPCA model introduces an adaptive noise elimination regularization method that addresses the noise challenge in Hermetia illucens genomic data. We also integrate known quantitative gene-pathway information into the Sparse PCA (SPCA) framework as prior knowledge, which allows the model to select as many gene probes in pathway-rich regions as possible. Finally, this study conducts five independent experiments and compares the model against four recent Sparse PCA models as well as representative supervised and unsupervised baseline models to validate its performance. The experimental results demonstrate the superior pathway and gene selection capabilities of the AWGE-ESPCA model. Ablation experiments validate the role of the adaptive regularizer and the network weighting module. To summarize, this paper presents an innovative unsupervised model for Hermetia illucens genome analysis, which can effectively help researchers identify potential biomarkers. In addition, we provide working code for the AWGE-ESPCA model at https://github.com/yhyresearcher/AWGE_ESPCA.

Author summary: Hermetia illucens is an insect of high economic value that is widely used in the feed industry. Existing research suggests that Cu2+ stress can significantly affect the growth of Hermetia illucens. Therefore, identifying the genetic targets that affect the growth and development of Hermetia illucens is crucial for food safety. However, owing to the lack of high-quality datasets, high data noise, and small sample sizes, none of the existing genomic analysis models handles Hermetia illucens data well. To address these problems, this paper proposes a novel unsupervised Hermetia illucens genomic analysis model (AWGE-ESPCA). The AWGE-ESPCA model introduces an adaptive noise elimination regularization to handle noise in the data and uses a weighted gene network to enhance the biological interpretability of the model. The experimental results show that the AWGE-ESPCA model selects potential target genes and key pathways well. In addition, we demonstrate that the AWGE-ESPCA model can be extended to other insect genome analysis tasks.
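
The general idea of combining sparse PCA with prior gene weights can be illustrated with a small sketch. The code below is not the AWGE-ESPCA algorithm and is not taken from the authors' repository; it is a generic rank-one sparse PCA by truncated power iteration, in which a hypothetical per-gene prior weight vector (`weights`) biases which genes are kept. The function name, the parameters, and the plain top-k truncation (standing in for the paper's adaptive noise elimination regularizer, whose form the abstract does not specify) are all assumptions made for illustration.

```python
import numpy as np


def weighted_sparse_pc(X, weights, k, n_iter=50, seed=0):
    """Illustrative rank-one sparse PCA with prior gene weights.

    X       : (n_samples, n_genes) expression matrix
    weights : (n_genes,) non-negative prior weights (hypothetical; e.g. larger
              for genes that appear in more known pathways)
    k       : number of genes allowed a non-zero loading
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                 # center each gene
    v = rng.standard_normal(Xc.shape[1])    # random starting loading vector
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        z = Xc.T @ (Xc @ v)                 # power-iteration step on Xc^T Xc
        score = weights * np.abs(z)         # prior weights rescale gene scores
        keep = np.argsort(score)[-k:]       # k genes with the largest scores
        v_new = np.zeros_like(z)
        v_new[keep] = z[keep]               # hard truncation: drop all others
        norm = np.linalg.norm(v_new)
        if norm == 0.0:                     # degenerate case: nothing selected
            break
        v = v_new / norm                    # renormalize to unit length
    return v


# Toy usage on random data (assumed shapes: 20 samples, 50 genes).
rng = np.random.default_rng(1)
X_demo = rng.standard_normal((20, 50))
w_demo = np.ones(50)
w_demo[:10] = 2.0                           # pretend genes 0-9 are pathway-rich
loading = weighted_sparse_pc(X_demo, w_demo, k=5)
print("selected genes:", np.flatnonzero(loading))
```

In this sketch a gene with a larger prior weight needs a smaller raw loading to survive truncation, which is one simple way to favor pathway-enriched probes; a gene with weight zero is never selected as long as at least `k` other genes have a positive score.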

Suggested Citation

  • Rui Miao & Hao-Yang Yu & Bing-Jie Zhong & Hong-Xia Sun & Qiang Xia, 2025. "AWGE-ESPCA: An edge sparse PCA model based on adaptive noise elimination regularization and weighted gene network for Hermetia illucens genomic data analysis," PLOS Computational Biology, Public Library of Science, vol. 21(2), pages 1-23, February.
  • Handle: RePEc:plo:pcbi00:1012773
    DOI: 10.1371/journal.pcbi.1012773

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1012773
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1012773&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1012773?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
