
Machine learning-based microarray analyses indicate low-expression genes might collectively influence PAH disease

Author

Listed:
  • Song Cui
  • Qiang Wu
  • James West
  • Jiangping Bai

Abstract

Accurately predicting and testing the type of pulmonary arterial hypertension (PAH) in each patient using cost-effective microarray-based expression data and machine learning algorithms could greatly help either in identifying the most targeted medicine or in adopting other therapeutic measures that could correct or restore defective genetic signaling at an early stage. Furthermore, the process of constructing prediction models can also help identify highly informative genes controlling PAH, leading to an enhanced understanding of the disease etiology and molecular pathways. In this study, we used several different gene filtering methods based on microarray expression data obtained from a high-quality PAH patient dataset. Following that, we proposed a novel feature selection and refinement algorithm in conjunction with well-known machine learning methods to identify a small set of highly informative genes. Results indicated that clusters of low-expression genes could be extremely informative for predicting and differentiating different forms of PAH. Additionally, our proposed feature refinement algorithm could lead to significant enhancement in model performance. In summary, when integrated with state-of-the-art machine learning and the novel feature refining algorithm, the most accurate models could provide near-perfect classification accuracies using very few (close to ten) low-expression genes.

Author summary: Pulmonary arterial hypertension (PAH) is a serious and progressive disease, with a 5-year survival rate of only roughly 50% even with the best available therapies. Accurately detecting and differentiating the different forms of PAH and developing drugs that directly target the genes involved in PAH pathogenesis are essential. We proposed a computational approach using low-cost microarray data collected from a clinical trial and accurately predicted each PAH group. In particular, we considered that there might exist some low-expression genes that are usually discarded by researchers but might function collectively and significantly control the disease in each case. Therefore, we developed different filtering algorithms that intentionally selected those low-expression genes for constructing prediction models. Using a few highly informative low-expression genes that had never been extensively investigated before, our systematic approach produced models that could offer perfect accuracy in predicting PAH. Additionally, our analysis found that the composition of gene factors controlling the PAH etiology differs considerably between forms.
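
The general idea described above (keep the low-expression genes instead of discarding them, then classify on that subset) can be illustrated with a minimal sketch. This is not the authors' filtering or feature refinement algorithm, which the abstract does not specify. It assumes a simple mean-expression quantile cutoff, substitutes a nearest-centroid classifier with leave-one-out validation for the paper's machine learning methods, and runs on synthetic data. All function names, the `form_A`/`form_B` labels, and the numeric settings are hypothetical.

```python
import random
from statistics import mean


def low_expression_indices(samples, quantile=0.5):
    """Return indices of genes whose mean expression is at or below a quantile cutoff.

    Assumption: 'low expression' is defined by mean expression across samples.
    """
    gene_means = [mean(column) for column in zip(*samples)]
    cutoff = sorted(gene_means)[int(quantile * (len(gene_means) - 1))]
    return [j for j, m in enumerate(gene_means) if m <= cutoff]


def nearest_centroid_predict(train_x, train_y, query):
    """Predict the label whose class centroid is closest to the query sample."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [row for row, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(column) for column in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], query))


def leave_one_out_accuracy(samples, labels):
    """Hold out each sample once, train on the rest, and report the hit rate."""
    hits = 0
    for i, (query, truth) in enumerate(zip(samples, labels)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += nearest_centroid_predict(train_x, train_y, query) == truth
    return hits / len(samples)


# Synthetic data (not from the study): genes 0-2 are weakly expressed but
# differ between the two forms; genes 3-5 are highly expressed noise.
rng = random.Random(0)
labels = ["form_A"] * 10 + ["form_B"] * 10
samples = []
for label in labels:
    shift = 1.0 if label == "form_A" else 2.0
    low = [rng.gauss(shift, 0.1) for _ in range(3)]
    high = [rng.gauss(100.0, 10.0) for _ in range(3)]
    samples.append(low + high)

kept = low_expression_indices(samples)
low_only = [[row[j] for j in kept] for row in samples]
accuracy = leave_one_out_accuracy(low_only, labels)
print(kept, accuracy)
```

In this toy setup the informative signal sits entirely in the low-expression genes, so a filter that discarded them would leave only noise. Real microarray data would need normalization and a properly validated classifier.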

Suggested Citation

  • Song Cui & Qiang Wu & James West & Jiangping Bai, 2019. "Machine learning-based microarray analyses indicate low-expression genes might collectively influence PAH disease," PLOS Computational Biology, Public Library of Science, vol. 15(8), pages 1-25, August.
  • Handle: RePEc:plo:pcbi00:1007264
    DOI: 10.1371/journal.pcbi.1007264

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007264
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1007264&type=printable
    Download Restriction: no



