A data-driven approach for constructing mutation categories for mutational signature analysis

Authors

Listed:
  • Gal Gilad
  • Mark D M Leiserson
  • Roded Sharan

Abstract

Mutational processes shape the genomes of cancer patients, and understanding them has important applications in diagnosis and treatment. Current modeling of mutational processes by identifying their characteristic signatures views each base substitution in a limited context of a single flanking base on each side. This context definition gives rise to 96 categories of mutations that have become the standard in the field, even though wider contexts have been shown to be informative in specific cases. Here we propose a data-driven approach for constructing a mutation categorization for mutational signature analysis. Our approach is based on the assumption that tumor cells that are exposed to similar mutational processes show similar expression levels of the DNA damage repair genes involved in these processes. We attempt to find a categorization that maximizes the agreement between mutation and gene expression data, and show that it outperforms the standard categorization over multiple quality measures. Moreover, we show that the categorization we identify generalizes to unseen data from different cancer types, suggesting that mutation context patterns extend beyond the immediate flanking bases.

Author summary

Cancer is a group of genetic diseases that occur as a result of an accumulation of somatic mutations in genes that regulate cellular growth and differentiation. These mutations arise from mutagenic processes such as exposure to environmental mutagens and defective DNA damage repair pathways. Each of these processes results in a characteristic pattern of mutations, referred to as a mutational signature. These signatures reveal the mutagenic mechanisms that have influenced the development of a specific tumor, and thus provide new insights into its causes and potential treatments. Originally, mutational signatures were defined using 96 mutation categories that take into account only the mutated base and its immediate flanking bases. Here, we aim to challenge this arbitrary categorization, which is widely used in mutational signature analysis. We have developed a novel framework for the construction of mutation categories that is based on the assumption that the activities of DNA damage repair genes are correlated with the mutational processes that are active in a given tumor. We show that with this approach we are able to identify an alternative mutation categorization that outperforms the standard categorization with respect to multiple metrics. This categorization includes categories that account for bases beyond the immediate flanking bases, suggesting that mutational signatures should be studied in broader sequence contexts.
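
To make the count of 96 concrete, here is a minimal sketch that enumerates the standard categories: one flanking base on each side of the mutated base. It assumes the common convention of reporting each substitution from the pyrimidine (C or T) of the mutated base pair, which gives 6 substitution types times 4 x 4 flanking contexts. The function name `mutation_categories`, the `flank` parameter, and the `A[C>A]A` label format are illustrative choices, not taken from the paper. The sketch does not reproduce the authors' data-driven categorization; it only shows the baseline scheme and how quickly the category count grows when the context is widened.

```python
from itertools import product

BASES = "ACGT"

# Assumption: substitutions are reported from the pyrimidine (C or T) of the
# mutated base pair, giving 6 substitution types (C>A, C>G, C>T, T>A, T>C, T>G).
SUBSTITUTIONS = [(ref, alt) for ref in "CT" for alt in BASES if alt != ref]


def mutation_categories(flank=1):
    """Enumerate categories with `flank` bases on each side of the mutated base.

    The name and label format are hypothetical, chosen for illustration only.
    """
    categories = []
    for ref, alt in SUBSTITUTIONS:
        for left in product(BASES, repeat=flank):
            for right in product(BASES, repeat=flank):
                label = f"{''.join(left)}[{ref}>{alt}]{''.join(right)}"
                categories.append(label)
    return categories


standard = mutation_categories(flank=1)  # single flanking base on each side
wider = mutation_categories(flank=2)     # two flanking bases on each side

print(len(standard), standard[0])  # 96 A[C>A]A
print(len(wider))                  # 1536
```

Widening the context by one base on each side already yields 1536 categories, which illustrates why a categorization that covers wider contexts has to be constructed selectively rather than by exhaustive enumeration.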

Suggested Citation

  • Gal Gilad & Mark D M Leiserson & Roded Sharan, 2021. "A data-driven approach for constructing mutation categories for mutational signature analysis," PLOS Computational Biology, Public Library of Science, vol. 17(10), pages 1-15, October.
  • Handle: RePEc:plo:pcbi00:1009542
    DOI: 10.1371/journal.pcbi.1009542

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009542
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1009542&type=printable
    Download Restriction: no

    References listed on IDEAS

    1. Jennifer Ma & Jeremy Setton & Nancy Y. Lee & Nadeem Riaz & Simon N. Powell, 2018. "The therapeutic significance of mutational signatures from DNA repair deficiency in cancer," Nature Communications, Nature, vol. 9(1), pages 1-12, December.
    2. N. J. Haradhvala & J. Kim & Y. E. Maruvka & P. Polak & D. Rosebrock & D. Livitz & J. M. Hess & I. Leshchiner & A. Kamburov & K. W. Mouw & M. S. Lawrence & G. Getz, 2018. "Distinct mutational signatures characterize concurrent loss of polymerase proofreading and mismatch repair," Nature Communications, Nature, vol. 9(1), pages 1-9, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rotem Katzir & Noam Rudberg & Keren Yizhak, 2022. "Estimating tumor mutational burden from RNA-sequencing without a matched-normal sample," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    2. Qingli Guo & Eszter Lakatos & Ibrahim Al Bakir & Kit Curtius & Trevor A. Graham & Ville Mustonen, 2022. "The mutational signatures of formalin fixation on the human genome," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    3. Snaedis Kristmundsdottir & Hakon Jonsson & Marteinn T. Hardarson & Gunnar Palsson & Doruk Beyter & Hannes P. Eggertsson & Arnaldur Gylfason & Gardar Sveinbjornsson & Guillaume Holley & Olafur A. Stefa, 2023. "Sequence variants affecting the genome-wide rate of germline microsatellite mutations," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
