Printed from https://ideas.repec.org/a/plo/pone00/0224806.html

LFastqC: A lossless non-reference-based FASTQ compressor

Author

Listed:
  • Sultan Al Yami
  • Chun-Hsi Huang

Abstract

The cost-effectiveness of next-generation sequencing (NGS) has advanced genomic research and regularly generates large amounts of raw data, which often require dedicated infrastructure such as data centers for storage and transmission. NGS data are highly redundant and must be compressed efficiently to reduce the cost of storage space and transmission bandwidth. To address these issues, we present LFastqC, a lossless, non-reference-based FASTQ compression algorithm that improves on the LFQC tool. We compare LFastqC with several state-of-the-art compressors, and the results indicate that it achieves better compression ratios on important datasets such as LS454, PacBio, and MinION. Moreover, LFastqC compresses and decompresses faster than LFQC, previously the top-performing compression algorithm for the LS454 dataset. LFastqC is freely available at https://github.uconn.edu/sya12005/LFastqC.
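
The two properties the abstract relies on, losslessness and compression ratio, can be illustrated with a minimal sketch. It does not use LFastqC or reproduce its method: Python's general-purpose zlib stands in as the compressor, the FASTQ record is made up, and the ratio is assumed to mean original size divided by compressed size (some papers report the inverse).

```python
import zlib


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Original size divided by compressed size (assumed definition); higher is better."""
    if not compressed:
        raise ValueError("compressed data must not be empty")
    return len(original) / len(compressed)


# Hypothetical, highly redundant FASTQ data: one made-up record repeated.
# Each record has four lines: identifier, bases, separator, quality scores.
record = (
    "@read_1\n"
    "ACGTACGTACGTACGT\n"
    "+\n"
    "IIIIIIIIIIIIIIII\n"
)
fastq_bytes = (record * 1000).encode("ascii")

# zlib is a stand-in here, NOT the compressor described in the abstract.
compressed = zlib.compress(fastq_bytes, 9)
restored = zlib.decompress(compressed)

# Lossless means the round trip reproduces the input byte for byte.
is_lossless = restored == fastq_bytes
ratio = compression_ratio(fastq_bytes, compressed)
print(f"lossless: {is_lossless}, ratio: {ratio:.1f}")
```

Real sequencing data is far less repetitive than this toy input, so the ratio printed here says nothing about what any compressor achieves on the LS454, PacBio, or MinION datasets.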

Suggested Citation

  • Sultan Al Yami & Chun-Hsi Huang, 2019. "LFastqC: A lossless non-reference-based FASTQ compressor," PLOS ONE, Public Library of Science, vol. 14(11), pages 1-10, November.
  • Handle: RePEc:plo:pone00:0224806
    DOI: 10.1371/journal.pone.0224806

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0224806
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0224806&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0224806?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


