
Factor models for cancer signatures

Author

Listed:
  • Kakushadze, Zura
  • Yu, Willie

Abstract

We present a novel method for extracting cancer signatures by applying statistical risk models (http://ssrn.com/abstract=2732453) from quantitative finance to cancer genome data. Using 1389 whole genome sequenced samples from 14 cancers, we identify an “overall” mode of somatic mutational noise. We give a prescription for factoring out this noise and source code for fixing the number of signatures. We apply nonnegative matrix factorization (NMF) to genome data aggregated by cancer subtype and filtered using our method. The resultant signatures have substantially lower variability than those from unfiltered data. Also, the computational cost of signature extraction is cut by about a factor of 10. We find 3 novel cancer signatures, including a liver cancer dominant signature (96% contribution) and a renal cell carcinoma signature (70% contribution). Our method accelerates finding new cancer signatures and improves their overall stability. Reciprocally, the methods for extracting cancer signatures could have interesting applications in quantitative finance.
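
The pipeline the abstract outlines (filter out an "overall" mode, then run NMF on the filtered matrix) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' source code. The abstract does not say how the noise is factored out; the sketch assumes it is done by demeaning log-counts across mutation categories within each column. It also assumes 96 mutation categories, uses random toy counts instead of real genome data, and picks 3 signatures arbitrarily. The function names remove_overall_mode and nmf are hypothetical. The NMF step uses the multiplicative updates of Lee & Seung (1999).

```python
import numpy as np

def remove_overall_mode(counts):
    """Assumed filter: demean log-counts across categories within each column."""
    log_counts = np.log1p(counts)  # ln(1 + G) keeps zero counts finite
    demeaned = log_counts - log_counts.mean(axis=0, keepdims=True)
    return np.exp(demeaned)        # strictly positive, so NMF still applies

def nmf(matrix, n_signatures, n_iter=500, seed=0, eps=1e-12):
    """Lee-Seung multiplicative updates for the Frobenius objective ||V - WH||."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = matrix.shape
    W = rng.random((n_rows, n_signatures)) + eps  # signatures (categories x K)
    H = rng.random((n_signatures, n_cols)) + eps  # exposures (K x cancer types)
    for _ in range(n_iter):
        H *= (W.T @ matrix) / (W.T @ W @ H + eps)
        W *= (matrix @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy stand-in for counts aggregated by cancer type: 96 categories (assumed)
# by 14 cancer types (the number of cancers quoted in the abstract).
toy_rng = np.random.default_rng(1)
counts = toy_rng.poisson(lam=20.0, size=(96, 14)).astype(float)

filtered = remove_overall_mode(counts)
W, H = nmf(filtered, n_signatures=3)  # 3 is an arbitrary illustrative choice
print(W.shape, H.shape)
```

After filtering, the log of every column averages to zero, so a uniform rescaling of a column's counts no longer dominates the factorization. That is the intuition behind removing an "overall" mode before extracting signatures.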

Suggested Citation

  • Kakushadze, Zura & Yu, Willie, 2016. "Factor models for cancer signatures," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 462(C), pages 527-559.
  • Handle: RePEc:eee:phsmap:v:462:y:2016:i:c:p:527-559
    DOI: 10.1016/j.physa.2016.06.089

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0378437116303648
    Download Restriction: Full text is available to ScienceDirect subscribers only. The journal offers the option of making the article freely available online on ScienceDirect for a fee of $3,000.

    File URL: https://libkey.io/10.1016/j.physa.2016.06.089?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Xose S. Puente & Magda Pinyol & Víctor Quesada & Laura Conde & Gonzalo R. Ordóñez & Neus Villamor & Georgia Escaramis & Pedro Jares & Sílvia Beà & Marcos González-Díaz & Laia Bassaganyas & Tycho Bauma, 2011. "Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia," Nature, Nature, vol. 475(7354), pages 101-105, July.
    2. Ghislaine Scelo & Yasser Riazalhosseini & Liliana Greger & Louis Letourneau & Mar Gonzàlez-Porta & Magdalena B. Wozniak & Mathieu Bourgey & Patricia Harnden & Lars Egevad & Sharon M. Jackson & Mehran , 2014. "Variation in genomic landscape of clear cell renal cell carcinoma across Europe," Nature Communications, Nature, vol. 5(1), pages 1-13, December.
    3. Daniel D. Lee & H. Sebastian Seung, 1999. "Learning the parts of objects by non-negative matrix factorization," Nature, Nature, vol. 401(6755), pages 788-791, October.
    4. Zura Kakushadze, 2015. "Heterotic Risk Models," Papers 1508.04883, arXiv.org, revised Jan 2016.
    5. Harry Markowitz, 1952. "Portfolio Selection," Journal of Finance, American Finance Association, vol. 7(1), pages 77-91, March.
    6. Xose S. Puente & Silvia Beà & Rafael Valdés-Mas & Neus Villamor & Jesús Gutiérrez-Abril & José I. Martín-Subero & Marta Munar & Carlota Rubio-Pérez & Pedro Jares & Marta Aymerich & Tycho Baumann & Ren, 2015. "Non-coding recurrent mutations in chronic lymphocytic leukaemia," Nature, Nature, vol. 526(7574), pages 519-524, October.
    7. Ludmil B. Alexandrov & Serena Nik-Zainal & David C. Wedge & Samuel A. J. R. Aparicio & Sam Behjati & Andrew V. Biankin & Graham R. Bignell & Niccolò Bolli & Ake Borg & Anne-Lise Børresen-Dale & Sandri, 2013. "Signatures of mutational processes in human cancer," Nature, Nature, vol. 500(7463), pages 415-421, August.
    8. Zura Kakushadze & Willie Yu, 2016. "Statistical Risk Models," Papers 1602.08070, arXiv.org, revised Jan 2017.
    9. Nicola Waddell & Marina Pajic & Ann-Marie Patch & David K. Chang & Karin S. Kassahn & Peter Bailey & Amber L. Johns & David Miller & Katia Nones & Kelly Quek & Michael C. J. Quinn & Alan J. Robertson , 2015. "Whole genomes redefine the mutational landscape of pancreatic cancer," Nature, Nature, vol. 518(7540), pages 495-501, February.
    10. David T. W. Jones & Natalie Jäger & Marcel Kool & Thomas Zichner & Barbara Hutter & Marc Sultan & Yoon-Jae Cho & Trevor J. Pugh & Volker Hovestadt & Adrian M. Stütz & Tobias Rausch & Hans-Jörg Warnatz, 2012. "Dissecting the genomic complexity underlying medulloblastoma," Nature, Nature, vol. 488(7409), pages 100-105, August.
    11. Gunes Gundem & Peter Van Loo & Barbara Kremeyer & Ludmil B. Alexandrov & Jose M. C. Tubio & Elli Papaemmanuil & Daniel S. Brewer & Heini M. L. Kallio & Gunilla Högnäs & Matti Annala & Kati Kivinummi &, 2015. "The evolutionary history of lethal metastatic prostate cancer," Nature, Nature, vol. 520(7547), pages 353-357, April.
    12. Zura Kakushadze & Willie Yu, 2016. "Multifactor Risk Models and Heterotic CAPM," Papers 1602.04902, arXiv.org, revised Mar 2016.

    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Zura Kakushadze & Willie Yu, 2017. "Mutation Clusters from Cancer Exome," Papers 1707.08504, arXiv.org.
    2. Chen, Shunjie & Yang, Sijia & Wang, Pei & Xue, Liugen, 2023. "Two-stage penalized algorithms via integrating prior information improve gene selection from omics data," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 628(C).
    3. Zura Kakushadze & Willie Yu, 2017. "*K-means and Cluster Models for Cancer Signatures," Papers 1703.00703, arXiv.org, revised Jul 2017.
    4. Zura Kakushadze & Willie Yu, 2020. "Machine Learning Treasury Yields," Bulletin of Applied Economics, Risk Market Journals, vol. 7(1), pages 1-65.
    5. Zura Kakushadze & Willie Yu, 2020. "Machine Learning Treasury Yields," Papers 2003.05095, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zura Kakushadze & Willie Yu, 2016. "Factor Models for Cancer Signatures," Papers 1604.08743, arXiv.org, revised Jan 2017.
    2. Zura Kakushadze & Willie Yu, 2017. "Decoding Stock Market with Quant Alphas," Papers 1708.02984, arXiv.org.
    3. Zura Kakushadze & Willie Yu, 2017. "Dead Alphas as Risk Factors," Papers 1709.06641, arXiv.org.
    4. Zura Kakushadze & Willie Yu, 2018. "Decoding stock market with quant alphas," Journal of Asset Management, Palgrave Macmillan, vol. 19(1), pages 38-48, January.
    5. Zura Kakushadze & Willie Yu, 2017. "*K-means and Cluster Models for Cancer Signatures," Papers 1703.00703, arXiv.org, revised Jul 2017.
    6. Zura Kakushadze & Willie Yu, 2017. "Notes on Fano Ratio and Portfolio Optimization," Papers 1711.10640, arXiv.org, revised Apr 2018.
    7. Zura Kakushadze & Willie Yu, 2018. "Dead alphas as risk factors," Journal of Asset Management, Palgrave Macmillan, vol. 19(2), pages 110-115, March.
    8. Zura Kakushadze & Willie Yu, 2016. "Statistical Risk Models," Papers 1602.08070, arXiv.org, revised Jan 2017.
    9. Anna Luiza Silva Almeida Vicente & Alexei Novoloaca & Vincent Cahais & Zainab Awada & Cyrille Cuenin & Natália Spitz & André Lopes Carvalho & Adriane Feijó Evangelista & Camila Souza Crovador & Rui Ma, 2022. "Cutaneous and acral melanoma cross-OMICs reveals prognostic cancer drivers associated with pathobiology and ultraviolet exposure," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    10. Zura Kakushadze & Willie Yu, 2018. "Betas, Benchmarks and Beating the Market," Papers 1807.09919, arXiv.org.
    11. Zura Kakushadze & Willie Yu, 2016. "Statistical Industry Classification," Papers 1607.04883, arXiv.org, revised Dec 2018.
    12. Zura Kakushadze & Willie Yu, 2017. "Mutation Clusters from Cancer Exome," Papers 1707.08504, arXiv.org.
    13. Zura Kakushadze & Willie Yu, 2019. "Machine Learning Risk Models," Papers 1903.06334, arXiv.org, revised Apr 2019.
    14. Wanke, Peter & Chen, Zhongfei & Dong, Qichen & Antunes, Jorge, 2021. "Transportation Sustainability, Macroeconomics, and Endogeneity in China: A Hybrid Neural-Markowitz-Variable Reduction Approach," Technological Forecasting and Social Change, Elsevier, vol. 170(C).
    15. Zura Kakushadze & Willie Yu, 2017. "Open Source Fundamental Industry Classification," Data, MDPI, vol. 2(2), pages 1-77, June.
    16. Zura Kakushadze & Willie Yu, 2021. "ETF Risk Models," Papers 2110.07138, arXiv.org.
    17. Zura Kakushadze & Willie Yu, 2020. "Machine Learning Treasury Yields," Papers 2003.05095, arXiv.org.
    18. Qingli Guo & Eszter Lakatos & Ibrahim Al Bakir & Kit Curtius & Trevor A. Graham & Ville Mustonen, 2022. "The mutational signatures of formalin fixation on the human genome," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    19. Zura Kakushadze & Willie Yu, 2020. "Machine Learning Treasury Yields," Bulletin of Applied Economics, Risk Market Journals, vol. 7(1), pages 1-65.
    20. Hailiang Zhang & Lin Bai & Xin-Qiang Wu & Xi Tian & Jinwen Feng & Xiaohui Wu & Guo-Hai Shi & Xiaoru Pei & Jiacheng Lyu & Guojian Yang & Yang Liu & Wenhao Xu & Aihetaimujiang Anwaier & Yu Zhu & Da-Long, 2023. "Proteogenomics of clear cell renal cell carcinoma response to tyrosine kinase inhibitor," Nature Communications, Nature, vol. 14(1), pages 1-21, December.
