
STREAK: A supervised cell surface receptor abundance estimation strategy for single cell RNA-sequencing data using feature selection and thresholded gene set scoring

Authors
  • Azka Javaid
  • Hildreth Robert Frost

Abstract

The accurate estimation of cell surface receptor abundance for single cell transcriptomics data is important for the tasks of cell type and phenotype categorization and cell-cell interaction quantification. We previously developed an unsupervised receptor abundance estimation technique named SPECK (Surface Protein abundance Estimation using CKmeans-based clustered thresholding) to address the challenges associated with accurate abundance estimation. In that paper, we concluded that SPECK results in improved concordance with Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq) data relative to comparative unsupervised abundance estimation techniques using only single-cell RNA-sequencing (scRNA-seq) data. In this paper, we outline a new supervised receptor abundance estimation method called STREAK (gene Set Testing-based Receptor abundance Estimation using Adjusted distances and cKmeans thresholding) that leverages associations learned from joint scRNA-seq/CITE-seq training data and a thresholded gene set scoring mechanism to estimate receptor abundance for scRNA-seq target data. We evaluate STREAK relative to both unsupervised and supervised receptor abundance estimation techniques using two evaluation approaches on six joint scRNA-seq/CITE-seq datasets that represent four human and mouse tissue types. We conclude that STREAK outperforms other abundance estimation strategies and provides a more biologically interpretable and transparent statistical model.

Author summary: Herein, we present an overview of our recently developed supervised receptor abundance estimation technique, STREAK (gene Set Testing-based Receptor abundance Estimation using Adjusted distances and cKmeans thresholding), which leverages co-expression associations learned from joint scRNA-seq/CITE-seq data to perform approximate abundance estimation. More specifically, STREAK functions by utilizing these expression associations to develop weighted membership gene sets, which are next thresholded following a gene set scoring procedure. These thresholded scores are set to the estimated abundance profiles.
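
The three steps named in the author summary (learn co-expression associations, score a weighted gene set, threshold the scores) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the association measure, the scoring method, or the clustering details, so the sketch assumes a rank correlation for gene selection, a plain weighted mean for scoring, and a simple one-dimensional two-cluster k-means as a stand-in for cKmeans thresholding. The function names, the gene set size, and the synthetic data are all hypothetical.

```python
import numpy as np

def _ranks(a):
    """Column-wise ordinal ranks (ties broken by position; a simplification)."""
    return np.argsort(np.argsort(a, axis=0), axis=0).astype(float)

def build_gene_set(train_rna, train_adt, size=10):
    """Assumed step 1: keep the genes most rank-correlated with the measured receptor."""
    r = _ranks(train_rna)
    a = _ranks(train_adt[:, None])
    r -= r.mean(axis=0)
    a -= a.mean(axis=0)
    corr = (r * a).sum(axis=0) / np.sqrt((r ** 2).sum(axis=0) * (a ** 2).sum())
    top = np.argsort(corr)[::-1][:size]
    weights = np.clip(corr[top], 0.0, None)  # correlations used as membership weights
    return top, weights

def score_cells(target_rna, genes, weights):
    """Assumed step 2: weighted-mean gene set score per target cell."""
    return target_rna[:, genes] @ weights / max(weights.sum(), 1e-12)

def threshold_scores(scores, n_iter=50):
    """Assumed step 3: 1D two-cluster k-means; the low cluster is set to zero."""
    lo, hi = scores.min(), scores.max()
    for _ in range(n_iter):
        in_hi = np.abs(scores - hi) < np.abs(scores - lo)
        if in_hi.all() or not in_hi.any():
            break
        lo, hi = scores[~in_hi].mean(), scores[in_hi].mean()
    return np.where(in_hi, scores, 0.0)

# Synthetic stand-in for joint training data: genes 0-4 track the receptor.
rng = np.random.default_rng(0)
n_cells, n_genes = 200, 30
signal = rng.integers(0, 2, size=n_cells).astype(float)
train_rna = rng.normal(size=(n_cells, n_genes))
train_rna[:, :5] += 3.0 * signal[:, None]
train_adt = 5.0 * signal + rng.normal(scale=0.5, size=n_cells)

# Synthetic target data with expression only (no protein measurements).
target_signal = rng.integers(0, 2, size=n_cells).astype(float)
target_rna = rng.normal(size=(n_cells, n_genes))
target_rna[:, :5] += 3.0 * target_signal[:, None]

genes, weights = build_gene_set(train_rna, train_adt, size=5)
estimate = threshold_scores(score_cells(target_rna, genes, weights))
```

Here `estimate` holds one value per target cell, with cells in the low-scoring cluster set to zero. The split into three small functions mirrors the order of steps in the summary and makes it easy to swap in a different association measure or scoring method.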

Suggested Citation

  • Azka Javaid & Hildreth Robert Frost, 2023. "STREAK: A supervised cell surface receptor abundance estimation strategy for single cell RNA-sequencing data using feature selection and thresholded gene set scoring," PLOS Computational Biology, Public Library of Science, vol. 19(8), pages 1-24, August.
  • Handle: RePEc:plo:pcbi00:1011413
    DOI: 10.1371/journal.pcbi.1011413

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011413
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011413&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yu Hu & Kai Wang & Mingyao Li, 2020. "Detecting differential alternative splicing events in scRNA-seq with or without Unique Molecular Identifiers," PLOS Computational Biology, Public Library of Science, vol. 16(6), pages 1-19, June.
    2. Snehalika Lall & Sumanta Ray & Sanghamitra Bandyopadhyay, 2022. "A copula based topology preserving graph convolution network for clustering of single-cell RNA-seq data," PLOS Computational Biology, Public Library of Science, vol. 18(3), pages 1-16, March.
    3. Qunlun Shen & Shihua Zhang, 2021. "Approximate distance correlation for selecting highly interrelated genes across datasets," PLOS Computational Biology, Public Library of Science, vol. 17(11), pages 1-18, November.
    4. Jinzhou Li & Marloes H. Maathuis, 2021. "GGM knockoff filter: False discovery rate control for Gaussian graphical models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(3), pages 534-558, July.
    5. Lin Lin & Wei Shi & Jianbo Ye & Jia Li, 2023. "Multisource single‐cell data integration by MAW barycenter for Gaussian mixture models," Biometrics, The International Biometric Society, vol. 79(2), pages 866-877, June.
