
Identifying prognostic subgroups of luminal-A breast cancer using deep autoencoders and gene expressions

Author

Listed:
  • Seunghyun Wang
  • Doheon Lee

Abstract

Luminal-A breast cancer is the most frequently occurring subtype and is characterized by high expression levels of hormone receptors. However, some luminal-A breast cancer patients suffer from intrinsic and/or acquired resistance to endocrine therapies, which are considered first-line treatments for luminal-A breast cancer. This heterogeneity within luminal-A breast cancer calls for a more precise stratification method. Hence, our study aims to identify prognostic subgroups of luminal-A breast cancer. In this study, we discovered two prognostic subgroups of luminal-A breast cancer (BPS-LumA and WPS-LumA) using deep autoencoders and gene expression profiles. The deep autoencoders were trained using gene expression profiles of 679 luminal-A breast cancer samples in the METABRIC dataset. Then, the latent features of each sample generated by the deep autoencoders were used for K-Means clustering to divide the samples into two subgroups, and Kaplan-Meier survival analysis was performed to compare prognosis (recurrence-free survival) between them. As a result, the prognosis of the two subgroups was significantly different (p-value = 5.82E-05; log-rank test). This prognostic difference between the two subgroups was validated using gene expression profiles of 415 luminal-A breast cancer samples in the TCGA BRCA dataset (p-value = 0.004; log-rank test). Notably, the latent features were superior to the gene expression profiles and a traditional dimensionality reduction method in terms of discovering the prognostic subgroups. Lastly, using differentially expressed genes and co-expression network analysis, we discovered that ribosome-related biological functions could potentially be associated with the prognostic difference between the subgroups. Our stratification method can contribute to understanding the complexity of luminal-A breast cancer and to providing personalized medicine.

Author summary: Luminal-A breast cancer is the most frequently occurring breast cancer subtype. However, it shows high variability in prognosis, and more precise stratification is needed. In this paper, we identified two prognostic subgroups of luminal-A breast cancer, BPS-LumA and WPS-LumA. To this end, we used deep autoencoders, which automatically generate informative latent features that represent essential properties of gene expression. We found that the two subgroups clustered using the latent features differ significantly in prognosis. This prognostic difference was validated with an external luminal-A breast cancer cohort. We showed that only the latent features, and not the gene expression profiles themselves, were able to reveal the prognostic subgroups. In addition, we compared our results with two previous luminal-A breast cancer stratification methods, which are complementary to each other. Finally, we suggested biological functions associated with the differentially expressed genes between the two subgroups as potential molecular mechanisms that result in the differences in prognosis. We expect that our method could be used for personalized medicine in luminal-A breast cancer.
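
The clustering and survival-comparison steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the autoencoder training is omitted, the latent features, follow-up times, and event flags are made-up toy values, and the helper names `kmeans_two` and `logrank_test` are hypothetical. It uses only the Python standard library, so it shows the idea (two-cluster K-Means on latent features, then a two-group log-rank test) rather than a production workflow.

```python
import math
import random


def kmeans_two(points, iters=50, seed=0):
    """Plain Lloyd's algorithm with k=2 on a list of feature vectors."""
    rng = random.Random(seed)
    centers = rng.sample(points, 2)  # two distinct samples as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each sample to its nearest center (squared Euclidean distance).
        labels = [
            min((0, 1), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its current members.
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def logrank_test(times, events, groups):
    """Two-group log-rank test. events[i] is 1 for an observed recurrence
    and 0 for a censored sample. Returns (chi2, p_value)."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n = len(at_risk)                       # samples still at risk at time t
        n1 = sum(1 for g in at_risk if g == 1)  # of which in group 1
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g == 1)
        obs_minus_exp += d1 - d * n1 / n       # observed minus expected events
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Toy stand-ins for autoencoder latent features (assumption: 2-D, 12 samples).
latent = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0], [0.1, 0.1], [0.15, 0.05], [0.05, 0.15],
          [2.0, 2.1], [2.1, 1.9], [1.9, 2.0], [2.0, 2.0], [2.05, 1.95], [1.95, 2.05]]
# Toy recurrence-free survival times (months) and event indicators.
times = [60, 72, 80, 90, 95, 100, 5, 8, 12, 15, 20, 25]
events = [0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]

labels = kmeans_two(latent)
chi2, p = logrank_test(times, events, labels)
print("cluster labels:", labels)
print("log-rank chi2 = %.3f, p = %.5f" % (chi2, p))
```

In practice, established libraries would normally replace these hand-written helpers; the sketch only makes explicit what "cluster the latent features into two groups, then compare recurrence-free survival with a log-rank test" means computationally.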

Suggested Citation

  • Seunghyun Wang & Doheon Lee, 2023. "Identifying prognostic subgroups of luminal-A breast cancer using deep autoencoders and gene expressions," PLOS Computational Biology, Public Library of Science, vol. 19(5), pages 1-18, May.
  • Handle: RePEc:plo:pcbi00:1011197
    DOI: 10.1371/journal.pcbi.1011197

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011197
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011197&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1011197?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Peter Eirew & Ciara O’Flanagan & Jerome Ting & Sohrab Salehi & Jazmine Brimhall & Beixi Wang & Justina Biele & Teresa Algara & So Ra Lee & Corey Hoang & Damian Yap & Steven McKinney & Cherie Bates & E, 2022. "Accurate determination of CRISPR-mediated gene fitness in transplantable tumours," Nature Communications, Nature, vol. 13(1), pages 1-19, December.
    2. Anjana Mondal & Somesh Kumar, 2025. "Testing for trend in two-way heteroscedastic ANCOVA models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 34(3), pages 714-741, September.
    3. Zifeng Wang & Fang Liu & Nana Chen & Jingjing Wu & Xinhao Li & Mouxiang Fang & Min Yan & Ji Zhang & Bing Deng & Lulu Wang & Xuan Wang & Meiling Liu & Deshun Zeng & Zhengzhi Zou & Bo Wang & Zhou Songya, 2025. "Chromatin looping-based CRISPR screen identifies TLK2 as chromatin loop formation regulator in cancer stemness plasticity," Nature Communications, Nature, vol. 16(1), pages 1-17, December.
    4. Chotaro Onaga & Shoma Tamori & Izumi Matsuoka & Ayaka Ozaki & Hitomi Motomura & Yuka Nagashima & Tsugumichi Sato & Keiko Sato & Yuyun Xiong & Kazunori Sasaki & Shigeo Ohno & Kazunori Akimoto, 2022. "High expression of SLC20A1 is less effective for endocrine therapy and predicts late recurrence in ER-positive breast cancer," PLOS ONE, Public Library of Science, vol. 17(5), pages 1-22, May.
    5. Yang, Xi & Hoadley, Katherine A. & Hannig, Jan & Marron, J.S., 2023. "Jackstraw inference for AJIVE data integration," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
    6. Erhan Bilal & Janusz Dutkowski & Justin Guinney & In Sock Jang & Benjamin A Logsdon & Gaurav Pandey & Benjamin A Sauerwine & Yishai Shimoni & Hans Kristian Moen Vollan & Brigham H Mecham & Oscar M Rue, 2013. "Improving Breast Cancer Survival Analysis through Competition-Based Multidimensional Modeling," PLOS Computational Biology, Public Library of Science, vol. 9(5), pages 1-16, May.
    7. M. G. Filippone & D. Gaglio & R. Bonfanti & F. A. Tucci & E. Ceccacci & R. Pennisi & M. Bonanomi & G. Jodice & M. Tillhon & F. Montani & G. Bertalot & S. Freddi & M. Vecchi & A. Taglialatela & M. Roma, 2022. "CDK12 promotes tumorigenesis but induces vulnerability to therapies inhibiting folate one-carbon metabolism in breast cancer," Nature Communications, Nature, vol. 13(1), pages 1-19, December.
    8. Giorgio Jansen & Tanda Qi & Vito Latora & Grigoris D. Amoutzias & Daniela Delneri & Stephen G. Oliver & Giuseppe Nicosia, 2024. "Minimisation of metabolic networks defines a new functional class of genes," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    9. Piaopiao Chen & Agnès H. Michel & Jianzhi Zhang, 2022. "Transposon insertional mutagenesis of diverse yeast strains suggests coordinated gene essentiality polymorphisms," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    10. Manish G & Anil Kumar Badana & Rama Rao Malla, 2017. "Emerging Diagnostic and Prognostic Biomarkers of Triple Negative Breast Cancer," Biomedical Journal of Scientific & Technical Research, Biomedical Research Network+, LLC, vol. 1(3), pages 561-565, August.
    11. Maurizio Callari & Antonio Lembo & Giampaolo Bianchini & Valeria Musella & Vera Cappelletti & Luca Gianni & Maria Grazia Daidone & Paolo Provero, 2014. "Accurate Data Processing Improves the Reliability of Affymetrix Gene Expression Profiles from FFPE Samples," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-10, January.
    12. Jacob Elnaggar & Fern Tsien & Lucio Miele & Chindo Hicks & Clayton Yates & Melisa Davis, 2019. "An Integrative Genomics Approach for Associating Genetic Susceptibility with the Tumor Immune Microenvironment in Triple Negative Breast Cancer," Biomedical Journal of Scientific & Technical Research, Biomedical Research Network+, LLC, vol. 15(1), pages 1-12, February.
    13. Egashira, Kento & Yata, Kazuyoshi & Aoshima, Makoto, 2024. "Asymptotic properties of hierarchical clustering in high-dimensional settings," Journal of Multivariate Analysis, Elsevier, vol. 199(C).
    14. Patrick J. Cunniff & Nicole Sivetz & Damianos Skopelitis & Olaf Klingbeil & Daniel Toobian & Diogo Maia-Silva & Mikala Egeblad & Christopher R. Vakoc, 2025. "KLF5 enables dichotomous lineage programs in pancreatic cancer via the AAA+ ATPase coactivators RUVBL1 and RUVBL2," Nature Communications, Nature, vol. 16(1), pages 1-20, December.
    15. Kim, Jongkwang & Wilhelm, Thomas, 2008. "What is a complex graph?," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 387(11), pages 2637-2652.
    16. María Elena Martínez & Jonathan T Unkart & Li Tao & Candyce H Kroenke & Richard Schwab & Ian Komenaka & Scarlett Lin Gomez, 2017. "Prognostic significance of marital status in breast cancer survival: A population-based study," PLOS ONE, Public Library of Science, vol. 12(5), pages 1-14, May.
    17. Marie Tuomarila & Kaisa Luostari & Ylermi Soini & Vesa Kataja & Veli-Matti Kosma & Arto Mannermaa, 2014. "Overexpression of MicroRNA-200c Predicts Poor Outcome in Patients with PR-Negative Breast Cancer," PLOS ONE, Public Library of Science, vol. 9(10), pages 1-8, October.
    18. Yishai Shimoni, 2018. "Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification," PLOS Computational Biology, Public Library of Science, vol. 14(2), pages 1-15, February.
    19. repec:plo:pone00:0103514 is not listed on IDEAS
    20. Xian, Yishu & Li, Meizhu & Zhang, Qi, 2025. "A k-shell decomposition structural entropy of complex networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 676(C).
    21. Nazimah Ab Mumin & Marlina Tanty Ramli Hamid & Jeannie Hsiu Ding Wong & Seow-Fan Chiew & Kartini Rahmat & Kwan Hoong Ng, 2024. "Investigation of breast cancer molecular subtype in a multi-ethnic population using MRI," PLOS ONE, Public Library of Science, vol. 19(8), pages 1-14, August.
