Confidence-based Somatic Mutation Evaluation and Prioritization

Author

Listed:
  • Martin Löwer
  • Bernhard Y Renard
  • Jos de Graaf
  • Meike Wagner
  • Claudia Paret
  • Christoph Kneip
  • Özlem Türeci
  • Mustafa Diken
  • Cedrik Britten
  • Sebastian Kreiter
  • Michael Koslowski
  • John C Castle
  • Ugur Sahin

Abstract

Next generation sequencing (NGS) has enabled high throughput discovery of somatic mutations. Detection depends on experimental design, lab platforms, parameters and analysis algorithms. However, NGS-based somatic mutation detection is prone to erroneous calls, with reported validation rates near 54% and congruence between algorithms below 50%. Here, we developed an algorithm that assigns a single statistic, a false discovery rate (FDR), to each somatic mutation identified by NGS. This FDR confidence value accurately discriminates true mutations from erroneous calls. Using sequencing data generated from triplicate exome profiling of C57BL/6 mice and B16-F10 melanoma cells, we used the existing algorithms GATK, SAMtools and SomaticSNiPer to identify somatic mutations. For each identified mutation, our algorithm assigned an FDR. We selected 139 mutations for validation, including 50 somatic mutations assigned a low FDR (high confidence) and 44 mutations assigned a high FDR (low confidence). All of the high confidence somatic mutations validated (50 of 50), none of the 44 low confidence somatic mutations validated, and 15 of 45 mutations with an intermediate FDR validated. Furthermore, assigning a single FDR to each mutation enables statistical comparisons of lab and computational methodologies, including ROC curves and AUC metrics. Using the HiSeq 2000, single end 50 nt reads from replicates generate the highest confidence somatic mutation call set.

Author Summary: Next generation sequencing (NGS) has enabled unbiased, high throughput discovery of genetic variations and somatic mutations. However, the NGS platform is still prone to errors resulting in inaccurate mutation calls. A statistical measure of the confidence of putative mutation calls would enable researchers to prioritize and select mutations in a robust manner. Here we present our development of a confidence score for mutation calls and apply the method to the identification of somatic mutations in B16 melanoma. We use NGS exome resequencing to profile triplicates of both the reference C57BL/6 mice and the B16-F10 melanoma cells. These replicate data allow us to formulate the false discovery rate of somatic mutations as a statistical quantity. Using this method, we show that 50 of 50 high confidence mutation calls are correct while 0 of 44 low confidence mutations are correct, demonstrating that the method is able to correctly rank mutation calls.
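
The abstract does not spell out how the FDR is computed from the replicates. The sketch below shows one plausible reading and should not be taken as the authors' implementation. It assumes that every call from a same-vs-same comparison (for example, one C57BL/6 replicate against another) is a false positive, and that the number of such comparisons equals the number of tumor-vs-normal comparisons. The function names `empirical_fdr` and `auc`, the use of a single quality score per call, and all numbers in the usage lines are hypothetical.

```python
from bisect import bisect_left


def empirical_fdr(somatic_scores, null_scores):
    """Estimate an FDR for each tumor-vs-normal call (illustrative only).

    somatic_scores: quality scores of calls from tumor-vs-normal comparisons.
    null_scores: quality scores of calls from same-vs-same comparisons,
        assumed here to be false positives by construction.
    For a call with score s, the estimate is
        (# null calls with score >= s) / (# somatic calls with score >= s),
    capped at 1.0.
    """
    som = sorted(somatic_scores)
    null = sorted(null_scores)
    fdrs = []
    for s in somatic_scores:
        n_null = len(null) - bisect_left(null, s)  # null calls scoring >= s
        n_som = len(som) - bisect_left(som, s)     # somatic calls scoring >= s (at least 1)
        fdrs.append(min(1.0, n_null / n_som))
    return fdrs


def auc(fdrs, validated):
    """Area under the ROC curve when calls are ranked by ascending FDR.

    Computed as the fraction of (validated, non-validated) pairs in which
    the validated call has the lower FDR; ties count as one half.
    """
    pos = [f for f, v in zip(fdrs, validated) if v]
    neg = [f for f, v in zip(fdrs, validated) if not v]
    wins = sum((p < n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy scores, not data from the paper.
toy_somatic = [90, 80, 70, 30, 20, 10]
toy_null = [35, 25, 15, 5]
toy_fdrs = empirical_fdr(toy_somatic, toy_null)
toy_validated = [True, True, True, False, False, False]
toy_auc = auc(toy_fdrs, toy_validated)
```

With a per-call FDR in hand, two lab protocols or two callers can be compared by computing `auc` on the same validated set, which is the kind of ROC and AUC comparison the abstract mentions. The raw ratio is not guaranteed to be monotone in the score; a fuller treatment would likely enforce monotonicity, as q-value procedures do.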

Suggested Citation

  • Martin Löwer & Bernhard Y Renard & Jos de Graaf & Meike Wagner & Claudia Paret & Christoph Kneip & Özlem Türeci & Mustafa Diken & Cedrik Britten & Sebastian Kreiter & Michael Koslowski & John C Castle & Ugur Sahin, 2012. "Confidence-based Somatic Mutation Evaluation and Prioritization," PLOS Computational Biology, Public Library of Science, vol. 8(9), pages 1-11, September.
  • Handle: RePEc:plo:pcbi00:1002714
    DOI: 10.1371/journal.pcbi.1002714

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002714
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1002714&type=printable
    Download Restriction: no

