Beam search decoder for enhancing sequence decoding speed in single-molecule peptide sequencing data

Author

Listed:
  • Javier Kipen
  • Joakim Jaldén

Abstract

Next-generation single-molecule protein sequencing technologies have the potential to significantly accelerate biomedical research. These technologies offer sensitivity and scalability for proteomic analysis. One promising method is fluorosequencing, which involves cutting naturalized proteins into peptides, attaching fluorophores to specific amino acids, and observing variations in light intensity as one amino acid is removed at a time. The original peptide is classified from the sequence of light-intensity reads, and proteins can subsequently be recognized with this information. The stepwise removal of amino acids is achieved by attaching the peptides to a wall at the C-terminus and using a process called Edman degradation to remove one amino acid at a time from the N-terminus. Even though a framework (Whatprot) has been proposed for the peptide classification task, processing times remain restrictive due to the massively parallel data acquisition system. In this paper, we propose a new beam search decoder with a novel state formulation that obtains considerably lower processing times at the expense of only a slight accuracy drop compared to Whatprot. Furthermore, we explore how our novel state formulation may lead to even faster decoders in the future.

Author summary: Proteomic analyses frequently rely on mass spectrometry, a method characterized by its limited dynamic range, which can cause low-abundance proteins to be overlooked. Single-molecule protein sequencing methods address this limitation. Fluorosequencing is a cutting-edge single-molecule protein sequencing method that can distinguish peptides or protein molecules in a massively parallel fashion. This method has attracted interest from investors, as evidenced by the recent funding of Erisyon, a company developing this technology. The technique involves a challenging classification task: determining the original peptide sequence from light-intensity observations obtained after several Edman cycles. A classifier based on a combination of k-nearest neighbors (kNN) and hidden Markov models (HMM) has been shown to achieve close-to-optimal accuracy with tractable complexity. In this paper, we propose a new algorithm that reduces computation time significantly at the expense of a slight reduction in accuracy compared to the state-of-the-art method.
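
To make the idea of beam search decoding over a hidden Markov model concrete, the following is a minimal textbook-style sketch in Python. It is not the authors' decoder and does not use the paper's state formulation. The toy model is entirely hypothetical: the hidden state is the number of fluorophores still attached to a peptide (2, 1, or 0), the observations are coarse intensity levels, and all probabilities are made-up illustrative values. The function `beam_viterbi` and every variable name are assumptions introduced for this example.

```python
import math

def lg(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf

def beam_viterbi(obs, states, log_start, log_trans, log_emit, beam_width):
    """Approximate the most likely HMM state path, keeping only the
    `beam_width` best-scoring partial paths after each observation."""
    # Each beam entry is (log_score, path); begin with one entry per state.
    beam = [(log_start[s] + log_emit[s][obs[0]], [s]) for s in states]
    beam = sorted(beam, key=lambda e: e[0], reverse=True)[:beam_width]
    for o in obs[1:]:
        best = {}  # best extension found so far that ends in each state
        for score, path in beam:
            for s in states:
                cand = score + log_trans[path[-1]][s] + log_emit[s][o]
                if s not in best or cand > best[s][0]:
                    best[s] = (cand, path + [s])
        # Prune: only the top `beam_width` hypotheses survive this step.
        beam = sorted(best.values(), key=lambda e: e[0], reverse=True)[:beam_width]
    return beam[0]

# Hypothetical toy model: state = remaining dye count (illustrative values).
states = [2, 1, 0]
start = {2: 1.0, 1: 0.0, 0: 0.0}
trans = {
    2: {2: 0.6, 1: 0.4, 0: 0.0},   # a cycle may remove one labeled residue
    1: {2: 0.0, 1: 0.6, 0: 0.4},
    0: {2: 0.0, 1: 0.0, 0: 1.0},
}
emit = {
    2: {"high": 0.8, "mid": 0.2, "low": 0.0},
    1: {"high": 0.15, "mid": 0.7, "low": 0.15},
    0: {"high": 0.0, "mid": 0.2, "low": 0.8},
}
log_start = {s: lg(p) for s, p in start.items()}
log_trans = {a: {b: lg(p) for b, p in row.items()} for a, row in trans.items()}
log_emit = {s: {o: lg(p) for o, p in row.items()} for s, row in emit.items()}

obs = ["high", "high", "mid", "low", "low"]
score, path = beam_viterbi(obs, states, log_start, log_trans, log_emit, beam_width=3)
print(path, math.exp(score))
```

In this sketch, a beam width equal to the number of states keeps every hypothesis and therefore reproduces exact Viterbi decoding. A narrower beam does less work per observation but may discard the path that would have turned out best, which is the general speed-versus-accuracy trade-off the abstract describes.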

Suggested Citation

  • Javier Kipen & Joakim Jaldén, 2023. "Beam search decoder for enhancing sequence decoding speed in single-molecule peptide sequencing data," PLOS Computational Biology, Public Library of Science, vol. 19(11), pages 1-21, November.
  • Handle: RePEc:plo:pcbi00:1011345
    DOI: 10.1371/journal.pcbi.1011345

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011345
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011345&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wenjun Zhang & Feifei Xu & Jiang Yao & Changfei Mao & Mingchen Zhu & Moting Qian & Jun Hu & Huilin Zhong & Junsheng Zhou & Xiaoyu Shi & Yun Chen, 2023. "Single-cell metabolic fingerprints discover a cluster of circulating tumor cells with distinct metastatic potential," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    2. Xia Li & Feng Qiao & Jiansheng Guo & Ting Jiang & Huifang Lou & Huixia Li & Gangcai Xie & Hangjun Wu & Weizhen Wang & Ruoyu Pei & Sha Liu & Mei Ye & Jin Li & Shiqin Huang & Mengya Zhang & Chaoye Ma & , 2025. "In situ architecture of the intercellular organelle reservoir between epididymal epithelial cells by volume electron microscopy," Nature Communications, Nature, vol. 16(1), pages 1-15, December.
    3. Jinyong Chen & Tanchen Ren & Lan Xie & Haochang Hu & Xu Li & Miribani Maitusong & Xuhao Zhou & Wangxing Hu & Dilin Xu & Yi Qian & Si Cheng & Kaixiang Yu & Jian'an Wang & Xianbao Liu, 2024. "Enhancing aortic valve drug delivery with PAR2-targeting magnetic nano-cargoes for calcification alleviation," Nature Communications, Nature, vol. 15(1), pages 1-20, December.
