A hybrid gene selection approach to create the S1500+ targeted gene sets for use in high-throughput transcriptomics

Author

Listed:
  • Deepak Mav
  • Ruchir R Shah
  • Brian E Howard
  • Scott S Auerbach
  • Pierre R Bushel
  • Jennifer B Collins
  • David L Gerhold
  • Richard S Judson
  • Agnes L Karmaus
  • Elizabeth A Maull
  • Donna L Mendrick
  • B Alex Merrick
  • Nisha S Sipes
  • Daniel Svoboda
  • Richard S Paules

Abstract

Changes in gene expression can help reveal the mechanisms of disease processes and the mode of action for toxicities and adverse effects on cellular responses induced by exposures to chemicals, drugs and environmental agents. The U.S. Tox21 Federal collaboration, which currently quantifies the biological effects of nearly 10,000 chemicals via quantitative high-throughput screening (qHTS) in in vitro model systems, is now making an effort to incorporate gene expression profiling into the existing battery of assays. Whole transcriptome analyses performed on large numbers of samples using microarrays or RNA-Seq are currently cost-prohibitive. Accordingly, the Tox21 Program is pursuing a high-throughput transcriptomics (HTT) method that focuses on the targeted detection of gene expression for a carefully selected subset of the transcriptome, which can potentially reduce the cost 10-fold and so allow the analysis of larger numbers of samples. To identify the optimal transcriptome subset, genes were sought that are (1) representative of the highly diverse biological space, (2) capable of serving as a proxy for expression changes in unmeasured genes, and (3) sufficient to provide coverage of well-described biological pathways. A hybrid method for gene selection is presented herein that combines data-driven and knowledge-driven concepts into one cohesive method. Our approach is modular, applicable to any species, and facilitates a robust, quantitative evaluation of performance. In particular, we were able to perform gene selection such that the resulting set of “sentinel genes” adequately represents all known canonical pathways from the Molecular Signatures Database (MSigDB v4.0) and can be used to infer expression changes for the remainder of the transcriptome.
The resulting computational model allowed us to choose a purely data-driven subset of 1500 sentinel genes, referred to as the S1500 set, which was then augmented using a knowledge-driven selection of additional genes to create the final S1500+ gene set. Our results indicate that the sentinel genes selected can be used to accurately predict pathway perturbations and biological relationships for samples under study.
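
Criterion (3) above, pathway coverage, can be illustrated with a small sketch. The code below is not the authors' algorithm: it swaps in a plain greedy set-cover heuristic, and the function names (`greedy_pathway_cover`, `coverage`), the parameters, and the toy pathway memberships are all assumptions made for illustration. In practice the pathway definitions would come from a curated resource such as the MSigDB canonical pathways.

```python
from typing import Dict, List, Set


def greedy_pathway_cover(
    pathways: Dict[str, Set[str]],
    min_genes_per_pathway: int = 1,
    max_genes: int = 1500,
) -> List[str]:
    """Greedy set-cover heuristic (illustrative, not the published method).

    Picks genes until every pathway contains at least
    `min_genes_per_pathway` selected members, or the gene budget runs out.
    """
    # How many more selected members each pathway still needs.
    need = {
        name: min(min_genes_per_pathway, len(genes))
        for name, genes in pathways.items()
    }
    # Invert the mapping: gene -> names of pathways that contain it.
    membership: Dict[str, Set[str]] = {}
    for name, genes in pathways.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(name)

    selected: List[str] = []
    candidates = set(membership)
    while candidates and len(selected) < max_genes:
        # Gain = number of still-unsatisfied pathways this gene belongs to.
        def gain(gene: str) -> int:
            return sum(1 for p in membership[gene] if need[p] > 0)

        # Sorting first makes ties break alphabetically by gene symbol.
        best = max(sorted(candidates), key=gain)
        if gain(best) == 0:
            break  # every pathway already meets its requirement
        selected.append(best)
        candidates.remove(best)
        for p in membership[best]:
            if need[p] > 0:
                need[p] -= 1
    return selected


def coverage(pathways: Dict[str, Set[str]], selected: List[str]) -> Dict[str, int]:
    """Count how many selected genes fall in each pathway."""
    chosen = set(selected)
    return {name: len(genes & chosen) for name, genes in pathways.items()}


# Toy input: hypothetical pathway memberships, for illustration only.
toy_pathways = {
    "APOPTOSIS": {"TP53", "BAX", "CASP3"},
    "CELL_CYCLE": {"TP53", "CDK1", "CCNB1"},
    "HYPOXIA": {"HIF1A", "VEGFA"},
}
selected = greedy_pathway_cover(toy_pathways)
print(selected)
print(coverage(toy_pathways, selected))
```

A gene shared by several pathways is picked first because it satisfies more requirements per slot in the gene budget, which is the intuition behind combining a fixed-size sentinel set with pathway-level coverage. Criteria (1) and (2) would need expression data and are not modelled here.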

Suggested Citation

  • Deepak Mav & Ruchir R Shah & Brian E Howard & Scott S Auerbach & Pierre R Bushel & Jennifer B Collins & David L Gerhold & Richard S Judson & Agnes L Karmaus & Elizabeth A Maull & Donna L Mendrick & B , 2018. "A hybrid gene selection approach to create the S1500+ targeted gene sets for use in high-throughput transcriptomics," PLOS ONE, Public Library of Science, vol. 13(2), pages 1-19, February.
  • Handle: RePEc:plo:pone00:0191105
    DOI: 10.1371/journal.pone.0191105

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0191105
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0191105&type=printable
    Download Restriction: no

