
An Exhaustive, Non-Euclidean, Non-Parametric Data Mining Tool for Unraveling the Complexity of Biological Systems – Novel Insights into Malaria

Author

Listed:
  • Cheikh Loucoubar
  • Richard Paul
  • Avner Bar-Hen
  • Augustin Huret
  • Adama Tall
  • Cheikh Sokhna
  • Jean-François Trape
  • Alioune Badara Ly
  • Joseph Faye
  • Abdoulaye Badiane
  • Gaoussou Diakhaby
  • Fatoumata Diène Sarr
  • Aliou Diop
  • Anavaj Sakuntabhai
  • Jean-François Bureau

Abstract

Complex, high-dimensional data sets pose significant analytical challenges in the post-genomic era. Such data sets are not exclusive to genetic analyses and are also pertinent to epidemiology. There has been considerable effort to develop hypothesis-free data mining and machine learning methodologies. However, current methodologies lack exhaustivity and general applicability. Here we use a novel non-parametric, non-Euclidean data mining tool, HyperCube®, to explore exhaustively a complex epidemiological malaria data set by searching for over-density of events in m-dimensional space. Hotspots of over-density correspond to strings of variables (rules) that determine, in this case, the occurrence of Plasmodium falciparum clinical malaria episodes. The data set contained 46,837 outcome events from 1,653 individuals and 34 explanatory variables. The best predictive rule contained 1,689 events from 148 individuals and was defined as: individuals present during 1992–2003, aged 1–5 years, having hemoglobin AA, and having had at most 10 previous Plasmodium malariae infections. These individuals had 3.71 times more P. falciparum clinical malaria episodes than the general population. We validated the rule in two different cohorts. We compared and contrasted the HyperCube® rule with rules built from variables identified by both traditional statistical methods and non-parametric regression tree methods. In addition, we tried all possible sub-stratifications of the quantitative variables. No other model with equal or greater representativity gave a higher Relative Risk. Although three of the four variables in the rule were intuitive, the effect of the number of P. malariae episodes was not. HyperCube® efficiently sub-stratified quantitative variables to optimize the rule and was able to identify interactions among the variables, tasks that are not easy to perform using standard data mining methods. Searching for local over-density in m-dimensional space, explained by easily interpretable rules, is thus seemingly ideal for generating hypotheses from large data sets to unravel the complexity inherent in biological systems.
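To make the idea of a rule and its Relative Risk concrete, the following is a minimal sketch and not the HyperCube® algorithm itself: it only scores one fixed rule and does no exhaustive search. It assumes the Relative Risk is the outcome rate among observations matching the rule divided by the rate in the whole population, which is how the abstract phrases the comparison. The record fields (year, age, hemoglobin, pm_infections, episode) and the toy records are hypothetical and are not taken from the study data.

```python
from typing import Callable, Dict, List

# One observation: explanatory variables plus a 0/1 outcome flag.
# All field names are hypothetical, chosen to mirror the rule in the abstract.
Record = Dict[str, object]


def best_rule(r: Record) -> bool:
    """Conjunction of the four conditions quoted in the abstract."""
    return (
        1992 <= r["year"] <= 2003
        and 1 <= r["age"] <= 5
        and r["hemoglobin"] == "AA"
        and r["pm_infections"] <= 10
    )


def relative_risk(
    records: List[Record],
    rule: Callable[[Record], bool],
    outcome: str = "episode",
) -> float:
    """Outcome rate inside the rule divided by the rate in all records."""
    inside = [r for r in records if rule(r)]
    if not inside:
        raise ValueError("rule matches no records")
    rate_inside = sum(r[outcome] for r in inside) / len(inside)
    rate_all = sum(r[outcome] for r in records) / len(records)
    if rate_all == 0:
        raise ValueError("no outcome events in the population")
    return rate_inside / rate_all


# Toy data, invented for illustration only.
records: List[Record] = [
    {"year": 1995, "age": 3, "hemoglobin": "AA", "pm_infections": 2, "episode": 1},
    {"year": 1998, "age": 4, "hemoglobin": "AA", "pm_infections": 0, "episode": 1},
    {"year": 2001, "age": 2, "hemoglobin": "AA", "pm_infections": 12, "episode": 0},
    {"year": 1996, "age": 30, "hemoglobin": "AA", "pm_infections": 1, "episode": 0},
    {"year": 1999, "age": 5, "hemoglobin": "AS", "pm_infections": 3, "episode": 0},
    {"year": 2005, "age": 3, "hemoglobin": "AA", "pm_infections": 0, "episode": 0},
    {"year": 1993, "age": 1, "hemoglobin": "AA", "pm_infections": 10, "episode": 0},
    {"year": 2000, "age": 45, "hemoglobin": "AA", "pm_infections": 4, "episode": 1},
]

matched = [r for r in records if best_rule(r)]
rr = relative_risk(records, best_rule)
print(f"{len(matched)} of {len(records)} records match; RR = {rr:.2f}")
```

On the toy data, three records satisfy the rule and two of them carry an episode, against three episodes in eight records overall, so the sketch reports a Relative Risk of about 1.78. The counts quoted in the abstract cannot be plugged into this formula directly, because the abstract does not give the number of clinical episodes inside and outside the rule.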

Suggested Citation

  • Cheikh Loucoubar & Richard Paul & Avner Bar-Hen & Augustin Huret & Adama Tall & Cheikh Sokhna & Jean-François Trape & Alioune Badara Ly & Joseph Faye & Abdoulaye Badiane & Gaoussou Diakhaby & Fatoumat, 2011. "An Exhaustive, Non-Euclidean, Non-Parametric Data Mining Tool for Unraveling the Complexity of Biological Systems – Novel Insights into Malaria," PLOS ONE, Public Library of Science, vol. 6(9), pages 1-16, September.
  • Handle: RePEc:plo:pone00:0024085
    DOI: 10.1371/journal.pone.0024085

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0024085
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0024085&type=printable
    Download Restriction: no





    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.