Authors
Listed:
- Henri Christian Junior Tsoungui Obama
- Kristan Alexander Schneider
Abstract
Background: Molecular/genetic methods are becoming increasingly important for the surveillance of diseases like malaria. Such methods allow monitoring routes of disease transmission, or the origin and spread of variants associated with drug resistance. A confounding factor in molecular disease surveillance is the presence of multiple distinct variants in the same infection (multiplicity of infection, MOI), which makes it ambiguous which pathogenic variants are present in an infection. Heuristic approaches often ignore ambiguous infections, which leads to biased results.
Methods: To avoid bias, we introduce a statistical framework to estimate haplotype frequencies alongside MOI from a pair of multi-allelic molecular markers. Estimates are based on maximum likelihood using the expectation-maximization (EM) algorithm. The estimates can be used as plug-ins to construct pairwise linkage disequilibrium (LD) maps. The finite-sample properties of the proposed method are studied by systematic numerical simulations. These reveal that the EM algorithm is numerically stable in our case and that the proposed method is accurate (little bias) and precise (small variance) for a reasonable sample size. In fact, the results suggest that the estimator is asymptotically unbiased. Furthermore, the method is appropriate for estimating LD (by D′, r², Q*, or conditional asymmetric LD). As an illustration, we apply the new method to a previously published dataset from Cameroon concerning sulfadoxine-pyrimethamine (SP) resistance. The results are in accordance with the SP drug pressure at the time and the observed spread of resistance in the country, yielding further evidence for the adequacy of the proposed method.
Conclusion: The proposed method can be readily applied in practice for malaria disease surveillance as a replacement for heuristic methods. Its first benefit is the ability to estimate MOI, which scales with transmission intensity and, in a temporal context, can be used to evaluate the effectiveness of disease control measures. MOI is best estimated from molecular markers that are not under selection (neutral markers) and exhibit sufficient genetic variation. The second advantage is that the method can estimate pairwise LD without deflating the sample size, as heuristic methods do, thereby limiting uncertainty in the estimates. This is particularly useful when deriving LD maps from data with many ambiguous observations due to MOI. Importantly, the method per se is not restricted to malaria, but is applicable to any disease with a similar transmission pattern. The method and several extensions are implemented in an easy-to-use R script.
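The abstract states that estimated haplotype frequencies can be plugged into LD measures such as D′ and r². As a minimal sketch of that plug-in step, the Python code below computes two textbook quantities from a table of two-locus haplotype frequencies. It is not the authors' R script, and it may not match the exact definitions used in the paper. The function names and the example frequency table are hypothetical. `d_prime` follows the commonly used frequency-weighted multi-allelic D′, and `r_squared` is one chi-square-based generalisation that reduces to the usual r² when both markers are biallelic; both choices are assumptions made here for illustration.

```python
def marginals(freqs):
    """Allele frequencies at locus 1 (rows) and locus 2 (columns)."""
    p = [sum(row) for row in freqs]
    q = [sum(col) for col in zip(*freqs)]
    return p, q


def d_prime(freqs):
    """Multi-allelic D': sum over allele pairs of p_i * q_j * |D_ij| / D_max."""
    p, q = marginals(freqs)
    total = 0.0
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            d = freqs[i][j] - pi * qj  # pairwise disequilibrium coefficient D_ij
            if d < 0:
                d_max = min(pi * qj, (1 - pi) * (1 - qj))
            else:
                d_max = min(pi * (1 - qj), (1 - pi) * qj)
            if d_max > 0:  # skip alleles that are absent or fixed
                total += pi * qj * abs(d) / d_max
    return total


def r_squared(freqs):
    """Sum of D_ij^2 / (p_i * q_j); equals the usual r^2 for two biallelic loci."""
    p, q = marginals(freqs)
    total = 0.0
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if pi > 0 and qj > 0:  # skip alleles with zero frequency
                d = freqs[i][j] - pi * qj
                total += d * d / (pi * qj)
    return total


# Hypothetical haplotype frequency estimates for two biallelic markers
# (rows: alleles at marker 1, columns: alleles at marker 2).
example = [[0.4, 0.1],
           [0.1, 0.4]]
print(round(d_prime(example), 3), round(r_squared(example), 3))
```

In practice, the input table would hold the frequency estimates produced by the EM step, so that samples with ambiguous infections still contribute to the LD estimate instead of being discarded.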
Suggested Citation
Henri Christian Junior Tsoungui Obama & Kristan Alexander Schneider, 2025.
"Estimating multiplicity of infection, haplotype frequencies, and linkage disequilibria from multi-allelic markers for molecular disease surveillance,"
PLOS ONE, Public Library of Science, vol. 20(5), pages 1-30, May.
Handle:
RePEc:plo:pone00:0321723
DOI: 10.1371/journal.pone.0321723