Printed from https://ideas.repec.org/a/ibn/jmrjnl/v11y2019i2p171.html

Recursive Formula for the Random String Word Detection Probability, Overlaps and Probability Extremes

Author

Listed:
  • V. I. Ilyevsky

Abstract

This paper investigates, for the first time, the properties of the probability of detecting a word in a random string. Previously known methods led only to numerical evaluation of these probabilities. The present work derives the simplest algorithm for calculating the probability that a word is detected at least once in a random string. A recursive formula that accounts for the word's ability to overlap with itself is deduced for this probability. The formula is then used to establish a proposition comparing the detection probabilities of words with different periods. The result makes it possible to determine the structure of the words that have maximum and minimum detection probabilities. In particular, words having an equal number of alphabetic characters are studied. For these words, the detection probability is minimal for ideally symmetrical words that have an irreducible period, and maximal for words that have no overlaps. These results will be useful in molecular genetics, as well as for students of discrete mathematics, probability theory and molecular biology.
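
The abstract does not reproduce the recursive formula itself. As a minimal sketch, assuming independent and uniformly distributed letters over an alphabet of q symbols, the overlap-aware calculation can be illustrated with a standard first-occurrence recursion. This is a textbook-style construction and may differ in form from the formula derived in the paper; all function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def overlap_periods(word):
    """Shifts d (0 < d < len(word)) at which the word overlaps itself."""
    m = len(word)
    return [d for d in range(1, m) if word[d:] == word[:m - d]]


def detection_probability(word, n, q):
    """Probability that `word` occurs at least once in a random string of
    length n, assuming i.i.d. uniform letters over q symbols (assumption)."""
    m = len(word)
    if n < m:
        return Fraction(0)
    periods = overlap_periods(word)
    p_word = Fraction(1, q ** m)
    # first[k]: probability that the first occurrence ends at position k.
    # cumulative[k]: probability of at least one occurrence within k letters.
    first = [Fraction(0)] * (n + 1)
    cumulative = [Fraction(0)] * (n + 1)
    for k in range(m, n + 1):
        # The word ends at k with probability q**-m; subtract the cases
        # where an earlier first occurrence already ended before k.
        value = p_word * (1 - cumulative[k - m])  # non-overlapping earlier hits
        for d in periods:                         # overlapping earlier hits
            value -= first[k - d] * Fraction(1, q ** d)
        first[k] = value
        cumulative[k] = cumulative[k - 1] + value
    return cumulative[n]


def brute_force_probability(word, n, alphabet):
    """Cross-check by enumerating every string of length n (small n only)."""
    hits = sum(1 for letters in product(alphabet, repeat=n)
               if word in "".join(letters))
    return Fraction(hits, len(alphabet) ** n)


if __name__ == "__main__":
    for w in ("aa", "ab"):
        print(w, detection_probability(w, 4, 2),
              brute_force_probability(w, 4, "ab"))
```

Over a binary alphabet with strings of length 4, the self-overlapping word "aa" is detected with probability 1/2, while the non-overlapping word "ab" is detected with probability 11/16, which is consistent with the abstract's statement that words without overlaps have the larger detection probability.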

Suggested Citation

  • V. I. Ilyevsky, 2019. "Recursive Formula for the Random String Word Detection Probability, Overlaps and Probability Extremes," Journal of Mathematics Research, Canadian Center of Science and Education, vol. 11(2), pages 171-180, April.
  • Handle: RePEc:ibn:jmrjnl:v:11:y:2019:i:2:p:171

    Download full text from publisher

    File URL: http://www.ccsenet.org/journal/index.php/jmr/article/download/0/0/38908/39628
    Download Restriction: no

    File URL: http://www.ccsenet.org/journal/index.php/jmr/article/view/0/38908
    Download Restriction: no

    References listed on IDEAS

    1. S. Robin & J.-J. Daudin, 2001. "Exact Distribution of the Distances between Any Occurrences of a Set of Words," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 53(4), pages 895-905, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ana Helena Tavares & Jakob Raymaekers & Peter J. Rousseeuw & Paula Brito & Vera Afreixo, 2020. "Clustering genomic words in human DNA using peaks and trends of distributions," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(1), pages 57-76, March.
    2. Vladimir Pozdnyakov & Joseph Glaz & Martin Kulldorff & J. Steele, 2005. "A martingale approach to scan statistics," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 57(1), pages 21-37, March.
    3. Lian Chen & Xiujun Zhang, 2019. "Fault Tolerance and 2-Domination in Certain Interconnection Networks," Journal of Mathematics Research, Canadian Center of Science and Education, vol. 11(2), pages 181-189, April.
    4. Kiyoshi Inoue & Sigeo Aki, 2009. "On waiting time distributions associated with compound patterns in a sequence of multi-state trials," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 61(2), pages 499-516, June.
    5. Kiyoshi Inoue & Sigeo Aki, 2014. "On sooner and later waiting time distributions associated with simple patterns in a sequence of bivariate trials," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 77(7), pages 895-920, October.
    6. Yonil Park & John L. Spouge, 2004. "Searching for Multiple Words in a Markov Sequence," INFORMS Journal on Computing, INFORMS, vol. 16(4), pages 341-347, November.
    7. Han, Qing & Hirano, Katuomi, 2003. "Waiting time problem for an almost perfect match," Statistics & Probability Letters, Elsevier, vol. 65(1), pages 39-49, October.

    More about this item

    Keywords

    word occurrence; probability; combinatorics; overlaps; probability extremes;

    JEL classification:

    • R00 - Urban, Rural, Regional, Real Estate, and Transportation Economics - - General - - - General
    • Z0 - Other Special Topics - - General
