
Systematic clustering algorithm for chromatin accessibility data and its application to hematopoietic cells

Author

Listed:
  • Azusa Tanaka
  • Yasuhiro Ishitsuka
  • Hiroki Ohta
  • Akihiro Fujimoto
  • Jun-ichirou Yasunaga
  • Masao Matsuoka

Abstract

The huge amount of data acquired by high-throughput sequencing requires data reduction for effective analysis. Here we present a clustering algorithm for genome-wide open chromatin data that uses a new data reduction method. The method regards the genome as a string of 1s and 0s based on a set of peaks and calculates the Hamming distances between the strings. With a systematically optimized set of peaks, the algorithm enables us to quantitatively evaluate differences between samples of hematopoietic cells and classify cell types, potentially leading to a better understanding of leukemia pathogenesis.

Author summary: High-throughput sequencing provides us with huge amounts of data about gene regulation. To extract useful information from the data, data reduction is needed. Although RNA-seq data analysis, where the focus is mainly on genetic loci, has been extensively studied, tools for epigenetic sequencing data, such as ATAC-seq data, which represent chromatin accessibility, are comparatively lacking. Since the binding of transcription factors mainly occurs in open chromatin regions, it is presumably important to understand how the chromatin accessibility landscape affects cell phenotype. In this context, we developed a systematic algorithm to select a set of peaks representing the open state of chromatin for a given sample of ATAC-seq data. The algorithm quantifies the difference between samples by regarding the genome as a string of 1s and 0s and computing Hamming distances between the strings, and then performs hierarchical clustering. Compared to a previous method, the algorithm has a lower computational cost and gives a reasonable cell type classification. In this work, as an application of the algorithm, we present a comparative analysis of leukemia samples with healthy hematopoietic cells and provide new insights into the relationship between chromatin structures, cell surface proteins, and symptoms in leukemia.
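
The core idea in the abstract (encode each sample as a string of 1s and 0s over a set of peaks, compare samples by Hamming distance, then cluster hierarchically) can be illustrated with a minimal sketch. This is not the authors' implementation: the peak coordinates, the sample names, the overlap rule in `binarize`, and the choice of average linkage are all assumptions made for illustration, and the systematic optimization of the peak set described in the paper is not reproduced here.

```python
from itertools import combinations

def binarize(sample_peaks, reference_peaks):
    """Encode a sample as 0/1 values, one per reference peak.
    A position is 1 if any sample peak overlaps that reference peak
    (half-open intervals; this overlap rule is an assumption)."""
    bits = []
    for chrom, start, end in reference_peaks:
        hit = any(c == chrom and s < end and e > start
                  for c, s, e in sample_peaks)
        bits.append(1 if hit else 0)
    return bits

def hamming(a, b):
    """Number of positions at which two equal-length 0/1 vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))

def average_linkage(vectors):
    """Agglomerative clustering on pairwise Hamming distances.
    Returns the merges in order as (cluster_a, cluster_b, distance).
    Average linkage is an illustrative choice, not taken from the paper."""
    names = sorted(vectors)
    dist = {frozenset((p, q)): hamming(vectors[p], vectors[q])
            for p, q in combinations(names, 2)}

    def cluster_distance(ca, cb):
        total = sum(dist[frozenset((p, q))] for p in ca for q in cb)
        return total / (len(ca) * len(cb))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((a, b, cluster_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical toy data: (chromosome, start, end) intervals.
reference_peaks = [("chr1", 100, 200), ("chr1", 500, 600),
                   ("chr2", 50, 150), ("chr2", 900, 1000)]
samples = {
    "sample_A": [("chr1", 120, 180), ("chr2", 60, 140)],
    "sample_B": [("chr1", 110, 190), ("chr2", 70, 130), ("chr2", 950, 990)],
    "sample_C": [("chr1", 520, 580), ("chr2", 910, 1000)],
}

vectors = {name: binarize(peaks, reference_peaks)
           for name, peaks in samples.items()}
merges = average_linkage(vectors)
for a, b, d in merges:
    print(a, b, d)
```

In this toy input, `sample_A` and `sample_B` differ at a single reference peak and are merged first, and `sample_C` joins last. On real data, a library routine for hierarchical clustering would normally replace the quadratic loop above.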

Suggested Citation

  • Azusa Tanaka & Yasuhiro Ishitsuka & Hiroki Ohta & Akihiro Fujimoto & Jun-ichirou Yasunaga & Masao Matsuoka, 2020. "Systematic clustering algorithm for chromatin accessibility data and its application to hematopoietic cells," PLOS Computational Biology, Public Library of Science, vol. 16(11), pages 1-27, November.
  • Handle: RePEc:plo:pcbi00:1008422
    DOI: 10.1371/journal.pcbi.1008422

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008422
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1008422&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1008422?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
