Printed from https://ideas.repec.org/a/bpj/sagmbi/v14y2015i4p333-345n1.html

Outlier reset CUSUM for the exploration of copy number alteration data

Author

Listed:
  • Lai Yinglei

    (Department of Statistics, The George Washington University, Rome Hall, Room 553, 801 22nd St. NW, Washington, DC 20052, USA)

  • Gastwirth Joseph L.

    (Department of Statistics, The George Washington University, Rome Hall, Room 553, 801 22nd St. NW, Washington, DC 20052, USA)

Abstract

Copy number alteration (CNA) data have been collected to study disease-related chromosomal amplifications and deletions. The CUSUM procedure and related plots have been used to explore CNA data. In practice, outliers may be observed, and modifications of the CUSUM procedure may then be required. An outlier reset modification of the CUSUM (ORCUSUM) procedure is developed in this paper. The threshold value for detecting outliers or significant CUSUMs can be derived using results for sums of independent truncated normal random variables. Bartels' non-parametric test for autocorrelation is also introduced to the analysis of copy number variation data. Our simulation results indicate that the ORCUSUM procedure can still be used when the degree of autocorrelation is low. Furthermore, the results show the impact of outliers on the traditional CUSUM's performance and illustrate the advantage of the ORCUSUM's outlier reset feature. Additionally, we use a simulated data set to discuss how the ORCUSUM can be applied to examine CNA data. To illustrate the procedure, recently collected single nucleotide polymorphism (SNP) based CNA data from The Cancer Genome Atlas (TCGA) Research Network are analyzed. The method is applied to a data set collected in an ovarian cancer study. Three cytogenetic bands (cytobands) are considered. The cytobands 11q13 and 9p21, which have been shown to be related to ovarian cancer, are presented as positive examples. The cytoband 3q22, which is less likely to be disease related, is presented as a negative example. These results illustrate the usefulness of the ORCUSUM procedure as an exploratory tool for the analysis of SNP-based CNA data.
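
For readers unfamiliar with the autocorrelation test named in the abstract, the following is a minimal sketch of the rank version of von Neumann's ratio (RVN) on which the Bartels test is commonly based. It is not the authors' code. The function names `average_ranks` and `bartels_rvn` are hypothetical, and the normal approximation with mean 2 and variance 4/n is a rough large-sample assumption; small samples call for exact or refined critical values.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def bartels_rvn(values):
    """Rank von Neumann ratio and an approximate z-score (assumed N(2, 4/n)).

    Values of RVN well below 2 suggest positive autocorrelation;
    values well above 2 suggest negative autocorrelation.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    ranks = average_ranks(values)
    mean_rank = (n + 1) / 2.0
    numerator = sum((ranks[i] - ranks[i + 1]) ** 2 for i in range(n - 1))
    denominator = sum((r - mean_rank) ** 2 for r in ranks)
    if denominator == 0:
        raise ValueError("all observations are tied")
    rvn = numerator / denominator
    # Rough large-sample approximation (an assumption of this sketch).
    z = (rvn - 2.0) / math.sqrt(4.0 / n)
    return rvn, z
```

A strictly increasing sequence such as `[1, 2, 3, 4, 5]` gives an RVN well below 2, while an alternating sequence such as `[1, 5, 2, 4, 3]` gives an RVN above 2. In a CNA setting, the input would be the ordered probe-level measurements along a chromosomal region.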

Suggested Citation

  • Lai Yinglei & Gastwirth Joseph L., 2015. "Outlier reset CUSUM for the exploration of copy number alteration data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 14(4), pages 333-345, August.
  • Handle: RePEc:bpj:sagmbi:v:14:y:2015:i:4:p:333-345:n:1
    DOI: 10.1515/sagmb-2014-0027

    Download full text from publisher

    File URL: https://doi.org/10.1515/sagmb-2014-0027
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Punzo, Antonio & Bagnato, Luca, 2022. "Dimension-wise scaled normal mixtures with application to finance and biometry," Journal of Multivariate Analysis, Elsevier, vol. 191(C).
    2. Aiko Sekita & Hiroshi Kawasaki & Ayano Fukushima-Nomura & Kiyoshi Yashiro & Keiji Tanese & Susumu Toshima & Koichi Ashizaki & Tomohiro Miyai & Junshi Yazaki & Atsuo Kobayashi & Shinichi Namba & Tatsuh, 2023. "Multifaceted analysis of cross-tissue transcriptomes reveals phenotype–endotype associations in atopic dermatitis," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    3. Rui Xia & Selina Vattathil & Paul Scheet, 2014. "Identification of Allelic Imbalance with a Statistical Model for Subtle Genomic Mosaicism," PLOS Computational Biology, Public Library of Science, vol. 10(8), pages 1-11, August.
    4. I. Parra-Frutos, 2013. "Testing homogeneity of variances with unequal sample sizes," Computational Statistics, Springer, vol. 28(3), pages 1269-1297, June.
    5. Lyubchich, Vyacheslav & Wang, Xingyu & Heyes, Andrew & Gel, Yulia R., 2016. "A distribution-free m-out-of-n bootstrap approach to testing symmetry about an unknown median," Computational Statistics & Data Analysis, Elsevier, vol. 104(C), pages 1-9.
    6. Do, Linh Phuong Catherine & Lyócsa, Štefan & Molnár, Peter, 2021. "Residual electricity demand: An empirical investigation," Applied Energy, Elsevier, vol. 283(C).
    7. Anat Reiner-Benaim, 2016. "Scan Statistic Tail Probability Assessment Based on Process Covariance and Window Size," Methodology and Computing in Applied Probability, Springer, vol. 18(3), pages 717-745, September.
    8. Guglielmo Lione & Francesca Brescia & Luana Giordano & Paolo Gonthier, 2022. "Effects of Seasonality and Climate on the Propagule Deposition Patterns of the Chestnut Blight Pathogen Cryphonectria parasitica in Orchards of the Alpine District of North Western Italy," Agriculture, MDPI, vol. 12(5), pages 1-24, April.
    9. Víctor Leiva & Jimmy Corzo & Myrian E. Vergara & Raydonal Ospina & Cecilia Castro, 2024. "A Statistical Methodology for Evaluating Asymmetry after Normalization with Application to Genomic Data," Stats, MDPI, vol. 7(3), pages 1-17, September.
    10. Artur Tiago Silva & Maria Manuela Portela, 2018. "Using Climate-Flood Links and CMIP5 Projections to Assess Flood Design Levels Under Climate Change Scenarios: A Case Study in Southern Brazil," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 32(15), pages 4879-4893, December.
