Optimal Scaling of Digital Transcriptomes

Author

Listed:
  • Gustavo Glusman
  • Juan Caballero
  • Max Robinson
  • Burak Kutlu
  • Leroy Hood

Abstract

Deep sequencing of transcriptomes has become an indispensable tool for biology, enabling expression levels for thousands of genes to be compared across multiple samples. Since transcript counts scale with sequencing depth, counts from different samples must be normalized to a common scale prior to comparison. We analyzed fifteen existing and novel algorithms for normalizing transcript counts, and evaluated the effectiveness of the resulting normalizations. For this purpose we defined two novel and mutually independent metrics: (1) the number of “uniform” genes (genes whose normalized expression levels have a sufficiently low coefficient of variation), and (2) low Spearman correlation between normalized expression profiles of gene pairs. We also defined four novel algorithms, one of which explicitly maximizes the number of uniform genes, and compared the performance of all fifteen algorithms. The two most commonly used methods (scaling to a fixed total value, or equalizing the expression of certain ‘housekeeping’ genes) yielded particularly poor results, surpassed even by normalization based on randomly selected gene sets. Conversely, seven of the algorithms approached what appears to be optimal normalization. Three of these algorithms rely on the identification of “ubiquitous” genes: genes expressed in all the samples studied, but never at very high or very low levels. We demonstrate that these include a “core” of genes expressed in many tissues in a mutually consistent pattern, which is suitable for use as an internal normalization guide. The new methods yield robustly normalized expression values, which is a prerequisite for the identification of differentially expressed and tissue-specific genes as potential biomarkers.
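
The first metric in the abstract, the number of "uniform" genes, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy counts, the target total of 1,000,000 and the coefficient-of-variation cutoff of 0.25 are all assumptions made for illustration; the abstract does not state the threshold the paper uses. Fixed-total scaling appears here only because it is the simplest normalization to write down, and the abstract reports that it performs poorly on real data.

```python
from statistics import mean, pstdev


def scale_to_total(counts, target=1_000_000):
    """Scale one sample's counts so they sum to `target` (assumed value)."""
    total = sum(counts)
    return [c * target / total for c in counts]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    return pstdev(values) / m if m > 0 else float("inf")


def count_uniform_genes(samples, cv_cutoff=0.25):
    """Count genes whose CV across samples is at most `cv_cutoff`.

    `samples` is a list of per-sample lists with genes in the same order.
    The cutoff is a hypothetical value, not taken from the paper.
    """
    per_gene = zip(*samples)  # transpose: one tuple of values per gene
    return sum(1 for g in per_gene if coefficient_of_variation(g) <= cv_cutoff)


# Toy data: 3 samples x 4 genes. The first three genes differ between
# samples only by depth; the fourth is strongly elevated in sample 2.
raw = [
    [10, 200, 30, 5],
    [20, 400, 60, 100],
    [30, 600, 90, 15],
]

normalized = [scale_to_total(sample) for sample in raw]
n_raw = count_uniform_genes(raw)
n_norm = count_uniform_genes(normalized)
print(n_raw, n_norm)
```

In this toy case no gene is uniform before scaling and three are uniform afterwards, so the count rises when depth differences are removed. An algorithm that "explicitly maximizes the number of uniform genes" would search over per-sample scaling factors for the highest such count rather than fixing the factors from the totals.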

Suggested Citation

  • Gustavo Glusman & Juan Caballero & Max Robinson & Burak Kutlu & Leroy Hood, 2013. "Optimal Scaling of Digital Transcriptomes," PLOS ONE, Public Library of Science, vol. 8(11), pages 1-12, November.
  • Handle: RePEc:plo:pone00:0077885
    DOI: 10.1371/journal.pone.0077885

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0077885
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0077885&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xiaohong Li & Guy N Brock & Eric C Rouchka & Nigel G F Cooper & Dongfeng Wu & Timothy E O’Toole & Ryan S Gill & Abdallah M Eteleeb & Liz O’Brien & Shesh N Rai, 2017. "A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data," PLOS ONE, Public Library of Science, vol. 12(5), pages 1-22, May.
    2. repec:plo:pone00:0018135 is not listed on IDEAS
    3. Jun Inamo & Akari Suzuki & Mahoko Takahashi Ueda & Kensuke Yamaguchi & Hiroshi Nishida & Katsuya Suzuki & Yuko Kaneko & Tsutomu Takeuchi & Hiroaki Hatano & Kazuyoshi Ishigaki & Yasushi Ishihama & Kazu, 2024. "Long-read sequencing for 29 immune cell subsets reveals disease-linked isoforms," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    4. Yvonne L. Chao & Katherine I. Zhou & Kwame K. Forbes & Alessandro Porrello & Gabrielle M. Gentile & Yinzhou Zhu & Aaron C. Chack & Dixcy J. S. John Mary & Haizhou Liu & Eric Cockman & Lincy Edatt & Gr, 2025. "Snord67 promotes breast cancer metastasis by guiding U6 modification and modulating the splicing landscape," Nature Communications, Nature, vol. 16(1), pages 1-23, December.
    5. Areum Han & Peter Stoilov & Anthony J Linares & Yu Zhou & Xiang-Dong Fu & Douglas L Black, 2014. "De Novo Prediction of PTBP1 Binding and Splicing Targets Reveals Unexpected Features of Its RNA Recognition and Function," PLOS Computational Biology, Public Library of Science, vol. 10(1), pages 1-18, January.
    6. Judith A Potashkin & Jose A Santiago & Bernard M Ravina & Arthur Watts & Alexey A Leontovich, 2012. "Biosignatures for Parkinson’s Disease and Atypical Parkinsonian Disorders Patients," PLOS ONE, Public Library of Science, vol. 7(8), pages 1-13, August.
    7. Wei Hu & Yangjun Wu & Qili Shi & Jingni Wu & Deping Kong & Xiaohua Wu & Xianghuo He & Teng Liu & Shengli Li, 2022. "Systematic characterization of cancer transcriptome at transcript resolution," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    8. Jianfei Hu & Eli Boritz & William Wylie & Daniel C Douek, 2017. "Stochastic principles governing alternative splicing of RNA," PLOS Computational Biology, Public Library of Science, vol. 13(9), pages 1-20, September.
    9. Hillary M. Heiling & Douglas R. Wilson & Naim U. Rashid & Wei Sun & Joseph G. Ibrahim, 2023. "Estimating cell type composition using isoform expression one gene at a time," Biometrics, The International Biometric Society, vol. 79(2), pages 854-865, June.
    10. Mathieu Charles & Nicolas Gaiani & Marie-Pierre Sanchez & Mekki Boussaha & Chris Hozé & Didier Boichard & Dominique Rocha & Arnaud Boulling, 2025. "Functional impact of splicing variants in the elaboration of complex traits in cattle," Nature Communications, Nature, vol. 16(1), pages 1-20, December.
    11. Seungjae Lee & Yen-Chung Chen & Austin E. Gillen & J. Matthew Taliaferro & Bart Deplancke & Hongjie Li & Eric C. Lai, 2022. "Diverse cell-specific patterns of alternative polyadenylation in Drosophila," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    12. Wei Sun & Yufeng Liu & James J. Crowley & Ting-Huei Chen & Hua Zhou & Haitao Chu & Shunping Huang & Pei-Fen Kuan & Yuan Li & Darla Miller & Ginger Shaw & Yichao Wu & Vasyl Zhabotynsky & Leonard McMill, 2015. "IsoDOT Detects Differential RNA-Isoform Expression/Usage With Respect to a Categorical or Continuous Covariate With High Sensitivity and Specificity," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(511), pages 975-986, September.
    13. Justin Bo-Kai Hsu & Neil Arvin Bretaña & Tzong-Yi Lee & Hsien-Da Huang, 2011. "Incorporating Evolutionary Information and Functional Domains for Identifying RNA Splicing Factors in Humans," PLOS ONE, Public Library of Science, vol. 6(11), pages 1-11, November.
    14. Stacey D Wagner & Adam J Struck & Riti Gupta & Dylan R Farnsworth & Amy E Mahady & Katy Eichinger & Charles A Thornton & Eric T Wang & J Andrew Berglund, 2016. "Dose-Dependent Regulation of Alternative Splicing by MBNL Proteins Reveals Biomarkers for Myotonic Dystrophy," PLOS Genetics, Public Library of Science, vol. 12(9), pages 1-24, September.
    15. Christopher G Bell & Sarah Finer & Cecilia M Lindgren & Gareth A Wilson & Vardhman K Rakyan & Andrew E Teschendorff & Pelin Akan & Elia Stupka & Thomas A Down & Inga Prokopenko & Ian M Morison & Jonat, 2010. "Integrated Genetic and Epigenetic Analysis Identifies Haplotype-Specific Methylation in the FTO Type 2 Diabetes and Obesity Susceptibility Locus," PLOS ONE, Public Library of Science, vol. 5(11), pages 1-12, November.
    16. Huihui Liu & Hongchao Liu & Longhao Wang & Lei Song & Guixian Jiang & Qing Lu & Tao Yang & Hu Peng & Ruijie Cai & Xingle Zhao & Ting Zhao & Hao Wu, 2023. "Cochlear transcript diversity and its role in auditory functions implied by an otoferlin short isoform," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    17. Alexis Weinreb & Erdem Varol & Alec Barrett & Rebecca M. McWhirter & Seth R. Taylor & Isabel Courtney & Manasa Basavaraju & Abigail Poff & John A. Tipps & Becca Collings & Smita Krishnaswamy & David M, 2025. "Alternative splicing across the C. elegans nervous system," Nature Communications, Nature, vol. 16(1), pages 1-21, December.
    18. Irika R. Sinha & Parker S. Sandal & Holly Spence & Grace D. Burns & Aswathy Peethambaran Mallika & Fatemeh Abbasinejad & Katherine E. Irwin & Anna Lourdes F. Cruz & Vania Wang & Shaelyn R. Marx & Josu, 2025. "Large-scale RNA-Seq mining reveals ciclopirox olamine induces TDP-43 cryptic exons," Nature Communications, Nature, vol. 16(1), pages 1-15, December.
    19. Saikat Bhattacharya & Suman Wang & Divya Reddy & Siyuan Shen & Ying Zhang & Ning Zhang & Hua Li & Michael P. Washburn & Laurence Florens & Yunyu Shi & Jerry L. Workman & Fudong Li, 2021. "Structural basis of the interaction between SETD2 methyltransferase and hnRNP L paralogs for governing co-transcriptional splicing," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
    20. Craig I. Dent & Stefan Prodic & Aiswarya Balakrishnan & Aaryan Chhabra & James D. G. Georges & Sourav Mukherjee & Jordyn Coutts & Michael Gitonobel & Rucha D. Sarwade & Joseph Rosenbluh & Mauro D’Amat, 2025. "A basic framework to explain splice-site choice in eukaryotes," Nature Communications, Nature, vol. 16(1), pages 1-14, December.
    21. Yaqi Su & Zhejian Yu & Siqian Jin & Zhipeng Ai & Ruihong Yuan & Xinyi Chen & Ziwei Xue & Yixin Guo & Di Chen & Hongqing Liang & Zuozhu Liu & Wanlu Liu, 2024. "Comprehensive assessment of mRNA isoform detection methods for long-read sequencing data," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
