Printed from https://ideas.repec.org/a/bpj/sagmbi/v18y2019i4p14n1.html

A penalized regression approach for DNA copy number study using the sequencing data

Author

Listed:
  • Lee Jaeeun
  • Chen Jie

    (Division of Biostatistics and Data Science, Department of Population Health Sciences, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA)

Abstract

Modeling high-throughput next-generation sequencing (NGS) data from experiments that profile tumor and control samples to study DNA copy number variants (CNVs) remains a challenge in various ways. In this application work, we provide an efficient method for detecting multiple CNVs using NGS reads ratio data. The method is based on a multiple change-point model fitted with a penalized regression approach, the 1d fused LASSO, which is designed for ordered data in a one-dimensional structure. In addition, because the path algorithm traces the solution as a function of a tuning parameter, the number and locations of potential CNV region boundaries can be estimated simultaneously and efficiently. For tuning parameter selection, we propose a new modified Bayesian information criterion, called JMIC, and compare it with three other Bayesian information criteria used in the literature. Simulation results show that JMIC performs better for tuning parameter selection than the other three criteria. We applied our approach to sequencing reads ratio data between the breast tumor cell line HCC1954 and its matched normal cell line BL 1954, and the results are in line with those reported in the literature.
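To make the idea in the abstract concrete, the following is a minimal sketch of 1d fused LASSO change-point detection followed by tuning parameter selection. It is not the authors' implementation: the solver is a plain projected-gradient iteration on the dual problem rather than the path algorithm, and the selection rule is an ordinary Schwarz-type BIC used as a stand-in because the abstract does not give the JMIC formula. All function names (`fused_lasso_1d`, `change_points`, `bic`, `select_lambda`) and the simulated log-ratio-like signal are hypothetical.

```python
import numpy as np


def fused_lasso_1d(y, lam, n_iter=5000):
    """Approximately minimize 0.5*sum((y-b)^2) + lam*sum(|b[i+1]-b[i]|).

    Illustrative solver: projected gradient on the dual problem, whose
    variables u (one per adjacent pair) are constrained to |u| <= lam.
    """
    y = np.asarray(y, dtype=float)
    u = np.zeros(y.size - 1)
    for _ in range(n_iter):
        # Primal fit from the dual variables: beta = y - D^T u.
        beta = y + np.diff(np.concatenate(([0.0], u, [0.0])))
        # Gradient step (step size 1/4 <= 1/||D D^T||), then project onto the box.
        u = np.clip(u + 0.25 * np.diff(beta), -lam, lam)
    return y + np.diff(np.concatenate(([0.0], u, [0.0])))


def change_points(beta, tol=1e-4):
    """0-based indices at which a new constant segment starts."""
    return [int(i) + 1 for i in np.flatnonzero(np.abs(np.diff(beta)) > tol)]


def bic(y, beta, tol=1e-4):
    """Schwarz-type BIC (assumed stand-in, NOT the paper's JMIC)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    rss = max(float(np.sum((y - beta) ** 2)), 1e-12)  # guard against log(0)
    n_segments = len(change_points(beta, tol)) + 1    # degrees of freedom of the fit
    return n * np.log(rss / n) + n_segments * np.log(n)


def select_lambda(y, lambdas):
    """Fit on a grid of tuning parameters and keep the fit with the smallest BIC."""
    best = None
    for lam in lambdas:
        beta = fused_lasso_1d(y, lam)
        score = bic(y, beta)
        if best is None or score < best[0]:
            best = (score, lam, change_points(beta))
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical reads-ratio-like signal: baseline, one gained region, baseline.
    rng = np.random.default_rng(0)
    truth = np.repeat([0.0, 1.0, 0.0], 20)
    y = truth + 0.2 * rng.standard_normal(truth.size)
    lam, cps = select_lambda(y, [0.5, 1.0, 2.0, 4.0])
    print("selected lambda:", lam, "estimated boundaries:", cps)
```

Each value of the tuning parameter yields both the number of segments and their boundaries in a single fit, which is the property the abstract relies on. A path algorithm would trace all such fits exactly instead of refitting on a grid as this sketch does.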

Suggested Citation

  • Lee Jaeeun & Chen Jie, 2019. "A penalized regression approach for DNA copy number study using the sequencing data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 18(4), pages 1-14, August.
  • Handle: RePEc:bpj:sagmbi:v:18:y:2019:i:4:p:14:n:1
    DOI: 10.1515/sagmb-2018-0001

    Download full text from publisher

    File URL: https://doi.org/10.1515/sagmb-2018-0001
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.


    References listed on IDEAS

    1. Robert Tibshirani & Michael Saunders & Saharon Rosset & Ji Zhu & Keith Knight, 2005. "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(1), pages 91-108, February.
    2. Qian, Junhui & Su, Liangjun, 2016. "Shrinkage Estimation of Regression Models with Multiple Structural Changes," Econometric Theory, Cambridge University Press, vol. 32(6), pages 1376-1433, December.
    3. Pan, Jianmin & Chen, Jiahua, 2006. "Application of modified information criterion to multiple change point problems," Journal of Multivariate Analysis, Elsevier, vol. 97(10), pages 2221-2241, November.
    4. Nancy R. Zhang & David O. Siegmund, 2007. "A Modified Bayes Information Criterion with Applications to the Analysis of Comparative Genomic Hybridization Data," Biometrics, The International Biometric Society, vol. 63(1), pages 22-32, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fryzlewicz, Piotr, 2020. "Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection," LSE Research Online Documents on Economics 103430, London School of Economics and Political Science, LSE Library.
    2. Ma, Chenchen & Tu, Yundong, 2023. "Shrinkage estimation of multiple threshold factor models," Journal of Econometrics, Elsevier, vol. 235(2), pages 1876-1892.
    3. Sean Jewell & Paul Fearnhead & Daniela Witten, 2022. "Testing for a change in mean after changepoint detection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(4), pages 1082-1104, September.
    4. Qian, Junhui & Su, Liangjun, 2016. "Shrinkage estimation of common breaks in panel data models via adaptive group fused Lasso," Journal of Econometrics, Elsevier, vol. 191(1), pages 86-109.
    5. Ma, Shujie & Su, Liangjun, 2018. "Estimation of large dimensional factor models with an unknown number of breaks," Journal of Econometrics, Elsevier, vol. 207(1), pages 1-29.
    6. Ma, Chenchen & Tu, Yundong, 2023. "Group fused Lasso for large factor models with multiple structural breaks," Journal of Econometrics, Elsevier, vol. 233(1), pages 132-154.
    7. Weijie Cui & Yong Li, 2023. "Bicluster Analysis of Heterogeneous Panel Data via M-Estimation," Mathematics, MDPI, vol. 11(10), pages 1-19, May.
    8. Gordon J. Ross, 2020. "Tracking the evolution of literary style via Dirichlet–multinomial change point regression," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(1), pages 149-167, January.
    9. Hosik Choi & Eunjung Song & Seung-sik Hwang & Woojoo Lee, 2018. "A modified generalized lasso algorithm to detect local spatial clusters for count data," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 102(4), pages 537-563, October.
    10. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    11. Schroeder, Anna Louise & Fryzlewicz, Piotr, 2013. "Adaptive trend estimation in financial time series via multiscale change-point-induced basis recovery," LSE Research Online Documents on Economics 54934, London School of Economics and Political Science, LSE Library.
    12. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
    13. Yize Zhao & Matthias Chung & Brent A. Johnson & Carlos S. Moreno & Qi Long, 2016. "Hierarchical Feature Selection Incorporating Known and Novel Biological Information: Identifying Genomic Features Related to Prostate Cancer Recurrence," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1427-1439, October.
    14. Francis X. Diebold & Kamil Yilmaz, 2016. "Trans-Atlantic Equity Volatility Connectedness: U.S. and European Financial Institutions, 2004–2014," Journal of Financial Econometrics, Oxford University Press, vol. 14(1), pages 81-127.
    15. Jian Guo & Elizaveta Levina & George Michailidis & Ji Zhu, 2010. "Pairwise Variable Selection for High-Dimensional Model-Based Clustering," Biometrics, The International Biometric Society, vol. 66(3), pages 793-804, September.
    16. Franck Rapaport & Christina Leslie, 2010. "Determining Frequent Patterns of Copy Number Alterations in Cancer," PLOS ONE, Public Library of Science, vol. 5(8), pages 1-10, August.
    17. Lu Tang & Ling Zhou & Peter X. K. Song, 2019. "Fusion learning algorithm to combine partially heterogeneous Cox models," Computational Statistics, Springer, vol. 34(1), pages 395-414, March.
    18. Young‐Geun Choi & Lawrence P. Hanrahan & Derek Norton & Ying‐Qi Zhao, 2022. "Simultaneous spatial smoothing and outlier detection using penalized regression, with application to childhood obesity surveillance from electronic health records," Biometrics, The International Biometric Society, vol. 78(1), pages 324-336, March.
    19. Molly C. Klanderman & Kathryn B. Newhart & Tzahi Y. Cath & Amanda S. Hering, 2020. "Fault isolation for a complex decentralized waste water treatment facility," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 69(4), pages 931-951, August.
    20. Xu Cheng & Zhipeng Liao & Frank Schorfheide, 2016. "Shrinkage Estimation of High-Dimensional Factor Models with Structural Instabilities," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 83(4), pages 1511-1543.
