Evolutionary Sequence Modeling for Discovery of Peptide Hormones

Author

Listed:
  • Kemal Sonmez
  • Naunihal T Zaveri
  • Ilan A Kerman
  • Sharon Burke
  • Charles R Neal
  • Xinmin Xie
  • Stanley J Watson
  • Lawrence Toll

Abstract

There are currently a large number of “orphan” G-protein-coupled receptors (GPCRs) whose endogenous ligands (peptide hormones) are unknown. Identification of these peptide hormones is a difficult and important problem. We describe a computational framework that models spatial structure along the genomic sequence simultaneously with the temporal evolutionary path structure across species and show how such models can be used to discover new functional molecules, in particular peptide hormones, via cross-genomic sequence comparisons. The computational framework incorporates a priori high-level knowledge of structural and evolutionary constraints into a hierarchical grammar of evolutionary probabilistic models. This computational method was used for identifying novel prohormones and the processed peptide sites by producing sequence alignments across many species at the functional-element level. Experimental results with an initial implementation of the algorithm were used to identify potential prohormones by comparing the human and non-human proteins in the Swiss-Prot database of known annotated proteins. In this proof of concept, we identified 45 out of 54 prohormones with only 44 false positives. The comparison of known and hypothetical human and mouse proteins resulted in the identification of a novel putative prohormone with at least four potential neuropeptides. Finally, in order to validate the computational methodology, we present the basic molecular biological characterization of the novel putative peptide hormone, including its identification and regional localization in the brain. This species-comparison, HMM-based computational approach succeeded in identifying a previously undiscovered neuropeptide from whole genome protein sequences. This novel putative peptide hormone is found in discrete brain regions as well as other organs. The success of this approach will have a great impact on our understanding of GPCRs and associated pathways and help to identify new targets for drug development.

Author Summary

Peptide hormones, or neuropeptides, are made up of a string of amino acids ranging from approximately 3 to 50 residues. These peptides are processed from a larger protein called a prohormone and activate a class of proteins called G-protein-coupled receptors (GPCRs). Neuropeptides signal to neurons and other cells, leading to changes in cellular biochemistry and potentially gene expression. There are a number of “orphan” GPCRs, i.e., receptors that have been discovered either by genomic sequence or by cloning, whose respective peptide hormones are unknown. We have devised a computational method that models patterns in protein sequence simultaneously with evolutionary differences across species in order to identify previously unknown peptide hormones. We have used this computational methodology to identify a previously unknown putative prohormone that contains up to four potential neuropeptides, and we have characterized this prohormone with respect to location in rat brain and various human tissues. This computational technique will be useful for the identification of additional neuropeptides and help to characterize orphan GPCRs. Because roughly half of all pharmaceuticals act through activation or inhibition of GPCRs, this technique should lead to the identification of additional pharmaceutical targets and ultimately clinically used drugs.
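
The proof-of-concept counts reported in the abstract (45 of 54 known prohormones identified, with 44 false positives) can be turned into the usual detection metrics. The short Python sketch below is only an illustration and is not taken from the article: the variable and function names are hypothetical, and the precision figure assumes that the 45 correct hits and the 44 false positives together make up every protein the method flagged.

```python
# Counts as reported in the abstract; names are illustrative, not the authors'.
known_prohormones = 54   # annotated prohormones in the comparison set
true_positives = 45      # known prohormones the method identified
false_positives = 44     # flagged proteins that are not known prohormones


def sensitivity(tp: int, total_positives: int) -> float:
    """Fraction of the known prohormones that were recovered."""
    return tp / total_positives


def precision(tp: int, fp: int) -> float:
    """Fraction of flagged proteins that are known prohormones.

    Assumes tp + fp is the full set of flagged proteins.
    """
    return tp / (tp + fp)


recall = sensitivity(true_positives, known_prohormones)
prec = precision(true_positives, false_positives)
missed = known_prohormones - true_positives

print(f"sensitivity: {recall:.3f}")  # 0.833
print(f"precision:   {prec:.3f}")    # 0.506
print(f"missed:      {missed}")      # 9
```

Under these assumptions the method recovers about 83% of the known prohormones, and roughly half of the flagged proteins are known prohormones. Some proteins counted as false positives could in principle be unannotated prohormones, so the precision is better read as a lower bound.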

Suggested Citation

  • Kemal Sonmez & Naunihal T Zaveri & Ilan A Kerman & Sharon Burke & Charles R Neal & Xinmin Xie & Stanley J Watson & Lawrence Toll, 2009. "Evolutionary Sequence Modeling for Discovery of Peptide Hormones," PLOS Computational Biology, Public Library of Science, vol. 5(1), pages 1-12, January.
  • Handle: RePEc:plo:pcbi00:1000258
    DOI: 10.1371/journal.pcbi.1000258

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000258
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1000258&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1000258?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tao Song & Hong Gu, 2014. "Discriminative Motif Discovery via Simulated Evolution and Random Under-Sampling," PLOS ONE, Public Library of Science, vol. 9(2), pages 1-10, February.
    2. Alexander Kawrykow & Gary Roumanis & Alfred Kam & Daniel Kwak & Clarence Leung & Chu Wu & Eleyine Zarour & Phylo players & Luis Sarmenta & Mathieu Blanchette & Jérôme Waldispühl, 2012. "Phylo: A Citizen Science Approach for Improving Multiple Sequence Alignment," PLOS ONE, Public Library of Science, vol. 7(3), pages 1-9, March.
    3. Alessandro L. V. Coradini & Christopher Ne Ville & Zachary A. Krieger & Joshua Roemer & Cara Hull & Shawn Yang & Daniel T. Lusk & Ian M. Ehrenreich, 2023. "Building synthetic chromosomes from natural DNA," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    4. Valerie Storms & Marleen Claeys & Aminael Sanchez & Bart De Moor & Annemieke Verstuyf & Kathleen Marchal, 2010. "The Effect of Orthology and Coregulation on Detecting Regulatory Motifs," PLOS ONE, Public Library of Science, vol. 5(2), pages 1-11, February.
    5. Robert K Bradley & Adam Roberts & Michael Smoot & Sudeep Juvekar & Jaeyoung Do & Colin Dewey & Ian Holmes & Lior Pachter, 2009. "Fast Statistical Alignment," PLOS Computational Biology, Public Library of Science, vol. 5(5), pages 1-15, May.
    6. Rahul Siddharthan & Eric D Siggia & Erik van Nimwegen, 2005. "PhyloGibbs: A Gibbs Sampling Motif Finder That Incorporates Phylogeny," PLOS Computational Biology, Public Library of Science, vol. 1(7), pages 1-23, December.
    7. Harri Lähdesmäki & Alistair G Rust & Ilya Shmulevich, 2008. "Probabilistic Inference of Transcription Factor Binding from Multiple Data Sources," PLOS ONE, Public Library of Science, vol. 3(3), pages 1-24, March.
    8. Leelavati Narlikar & Raluca Gordân & Alexander J Hartemink, 2007. "A Nucleosome-Guided Map of Transcription Factor Binding Sites in Yeast," PLOS Computational Biology, Public Library of Science, vol. 3(11), pages 1-10, November.
    9. J Roman Arguello & Carolina Sellanes & Yann Ru Lou & Robert A Raguso, 2013. "Can Yeast (S. cerevisiae) Metabolic Volatiles Provide Polymorphic Signaling?," PLOS ONE, Public Library of Science, vol. 8(8), pages 1-12, August.
    10. Fabio Pardi & Nick Goldman, 2005. "Species Choice for Comparative Genomics: Being Greedy Works," PLOS Genetics, Public Library of Science, vol. 1(6), pages 1-1, December.
    11. Krishna B. S. Swamy & Hsin-Yi Lee & Carmina Ladra & Chien-Fu Jeff Liu & Jung-Chi Chao & Yi-Yun Chen & Jun-Yi Leu, 2022. "Proteotoxicity caused by perturbed protein complexes underlies hybrid incompatibility in yeast," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    12. Eilon Sharon & Shai Lubliner & Eran Segal, 2008. "A Feature-Based Approach to Modeling Protein–DNA Interactions," PLOS Computational Biology, Public Library of Science, vol. 4(8), pages 1-17, August.
    13. Siewert Elizabeth A & Kechris Katerina J, 2009. "Prediction of Motifs Based on a Repeated-Measures Model for Integrating Cross-Species Sequence and Expression Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-34, September.
    14. Lit-Hsin Loo & Danai Laksameethanasan & Yi-Ling Tung, 2014. "Quantitative Protein Localization Signatures Reveal an Association between Spatial and Functional Divergences of Proteins," PLOS Computational Biology, Public Library of Science, vol. 10(3), pages 1-17, March.
    15. Christian L Barrett & Bernhard O Palsson, 2006. "Iterative Reconstruction of Transcriptional Regulatory Networks: An Algorithmic Approach," PLOS Computational Biology, Public Library of Science, vol. 2(5), pages 1-10, May.
    16. Kenzie D MacIsaac & Ernest Fraenkel, 2006. "Practical Strategies for Discovering Regulatory DNA Sequence Motifs," PLOS Computational Biology, Public Library of Science, vol. 2(4), pages 1-10, April.
