Interpreting correlations in biosequences

Authors
  • Herzel, H
  • Trifonov, E.N
  • Weiss, O
  • Große, I

Abstract

Understanding the complex organization of genomes, predicting the location of genes, and predicting the possible structure of the gene products are among the most important problems in current molecular biology. Many statistical techniques are used to address these issues, and correlation functions play a central role among them. This paper is based on an analysis of the decay of the entire 4×4 covariance matrix of DNA sequences. We apply this covariance analysis to human chromosomal regions, yeast DNA, and bacterial genomes, and we interpret the three most pronounced statistical features (long-range correlations, a period of 3 basepairs, and a period of 10–11 basepairs) using known biological facts about the structure of genomes. For example, we relate the slowly decaying long-range G+C correlations to dispersed repeats and CpG islands. We show quantitatively that the 3-basepair periodicity is due to the nonuniformity of codon usage in protein-coding segments. Finally, we show that periodicities of 10–11 basepairs in yeast DNA originate from an alternation of hydrophobic and hydrophilic amino acids in protein sequences.
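
The abstract does not spell out the estimator behind the 4×4 covariance matrix, so the following is only a minimal sketch under a common assumption: for a distance k, the entry for bases a and b is the joint frequency of a at position i and b at position i+k, minus the product of the two marginal frequencies. The function name `covariance_matrix` and the toy sequence are illustrative and do not come from the paper.

```python
from collections import Counter

BASES = "ACGT"

def covariance_matrix(seq, k):
    """Return a 4x4 list C with C[a][b] = P_ab(k) - p_a * p_b.

    Assumed definition (not taken from the paper): P_ab(k) is the
    fraction of position pairs (i, i + k) holding bases a and b, and
    p_a, p_b are the marginal frequencies over those same pairs.
    Symbols outside ACGT still count toward the number of pairs.
    """
    seq = seq.upper()
    n_pairs = len(seq) - k
    if k < 1 or n_pairs < 1:
        raise ValueError("k must satisfy 1 <= k < len(seq)")
    # Joint counts of (base at i, base at i + k).
    pair_counts = Counter(zip(seq[:n_pairs], seq[k:]))
    # Marginal counts for the first and second member of each pair.
    first = Counter(seq[:n_pairs])
    second = Counter(seq[k:])
    return [
        [
            pair_counts[(a, b)] / n_pairs
            - (first[a] / n_pairs) * (second[b] / n_pairs)
            for b in BASES
        ]
        for a in BASES
    ]

# Toy sequence with an exact period of 3 (hypothetical data).
toy = "ATG" * 50
c_k3 = covariance_matrix(toy, 3)
c_k1 = covariance_matrix(toy, 1)
print("C_AA(k=3) =", c_k3[0][0])
print("C_AA(k=1) =", c_k1[0][0])
```

On this toy input the A–A entry is positive at k = 3 and negative at k = 1, which is the kind of signature a 3-basepair periodicity leaves in the matrix. Evaluating the function over a range of k values gives the decay that the abstract refers to.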

Suggested Citation

  • Herzel, H & Trifonov, E.N & Weiss, O & Große, I, 1998. "Interpreting correlations in biosequences," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 249(1), pages 449-459.
  • Handle: RePEc:eee:phsmap:v:249:y:1998:i:1:p:449-459
    DOI: 10.1016/S0378-4371(97)00505-0

    Citations

    Cited by:

    1. Zhang, Linxi & Sun, Tingting, 2005. "Statistical properties of nucleotides in human chromosomes 21 and 22," Chaos, Solitons & Fractals, Elsevier, vol. 23(3), pages 1077-1085.
    2. Buldyrev, Sergey V. & Dokholyan, Nikolay V. & Havlin, Shlomo & Stanley, H. Eugene & Stanley, Rachel H.R., 1999. "Expansion of tandem repeats and oligomer clustering in coding and noncoding DNA sequences," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 273(1), pages 19-32.
    3. Cheng, Jun & Zhang, Linxi, 2005. "Scaling behaviors of CG clusters for chromosomes," Chaos, Solitons & Fractals, Elsevier, vol. 25(2), pages 339-346.
    4. Licinio, P & Caligiorne, R.B, 2004. "Inference of phylogenetic distances from DNA-walk divergences," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 341(C), pages 471-481.
