Author
Listed:
- Esha Datta
- Aditya Ballal
- Javier E López
- Leighton T Izu
Abstract
One of the goals of precision medicine is to classify patients into subgroups that differ in their susceptibility and response to a disease, thereby enabling tailored treatments for each subgroup. Therefore, there is a great need to identify distinctive clusters of patients from patient data. There are three key challenges of patient stratification: 1) the unknown number of clusters, 2) the need for assessing cluster validity, and 3) clinical interpretability. We developed MapperPlus, a novel unsupervised clustering pipeline that directly addresses these challenges. It extends the topological Mapper technique and blends it with two random-walk algorithms to automatically detect disjoint subgroups in patient data. We demonstrate that MapperPlus outperforms traditional agnostic clustering methods in key accuracy/performance metrics by testing its performance on publicly available medical and non-medical data sets. We also demonstrate the predictive power of MapperPlus in a medical dataset of pediatric stem cell transplant patients where the number of clusters is unknown. Here, MapperPlus stratifies the patient population into clusters with distinctive survival rates. The MapperPlus software is open-source and publicly available.
Author summary: The era of precision medicine represents a unique and exciting opportunity to transform the way we treat patients. With the immense availability of biomedical data and new computational techniques, we are more able than ever to understand what makes a patient unique. Indeed, even for a single condition, we can recognize that there are heterogeneities within the patient population. Understanding these differences can and should influence the way we treat patients. Key to this process is patient stratification, which is the division of patient populations into clinically meaningful subgroups.
The goal of patient stratification is to capture the individuality of patients without becoming overly fine-grained. This is an exciting balancing act that engages both meaningful medical and mathematical questions. We develop the MapperPlus pipeline for patient stratification. This is an unsupervised learning pipeline that leverages the mathematical notion of topology to detect clusters within high-dimensional data. It is effective in many settings and we demonstrate, in particular, its efficacy in a precision medicine application.
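The general idea behind a Mapper-style pipeline can be illustrated with a minimal sketch. This is not the MapperPlus implementation: it builds a basic Mapper graph (a filter function, an overlapping interval cover, and local clustering within each interval) and then, as a simplified stand-in for the random-walk step described in the abstract, takes the connected components of that graph as disjoint clusters. All function names, parameters (n_intervals, overlap, eps), and the toy data are assumptions made for illustration only.

```python
from itertools import combinations
from math import dist


def components(n, edges):
    """Label the connected components of an undirected graph on nodes 0..n-1."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in edges:
        parent[find(a)] = find(b)
    roots = {}
    # Labels are assigned in order of first appearance: 0, 1, 2, ...
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


def mapper_graph(points, filt, n_intervals=6, overlap=0.25, eps=0.15):
    """Build a basic Mapper graph (hypothetical helper, not the MapperPlus API)."""
    values = [filt(p) for p in points]
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_intervals
    nodes = []
    for k in range(n_intervals):
        # Overlapping interval of the cover of the filter range.
        a = lo + (k - overlap) * step
        b = lo + (k + 1 + overlap) * step
        idx = [i for i, v in enumerate(values) if a <= v <= b]
        # Local clustering: single linkage with distance threshold eps.
        local_edges = [(s, t) for s, t in combinations(range(len(idx)), 2)
                       if dist(points[idx[s]], points[idx[t]]) <= eps]
        local = components(len(idx), local_edges)
        for c in sorted(set(local)):
            nodes.append({idx[j] for j, lab in enumerate(local) if lab == c})
    # Two Mapper nodes are linked when they share at least one data point.
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]]
    return nodes, edges


def disjoint_labels(points, nodes, edges):
    """Assumed stand-in for the random-walk step: one label per graph component."""
    node_label = components(len(nodes), edges)
    labels = [None] * len(points)
    for node, lab in zip(nodes, node_label):
        for i in node:
            labels[i] = lab
    return labels


# Toy data: two well-separated horizontal bands of points.
band_a = [(i / 10, j / 10) for i in range(21) for j in range(2)]
band_b = [(4 + i / 10, 3 + j / 10) for i in range(21) for j in range(2)]
points = band_a + band_b

nodes, edges = mapper_graph(points, filt=lambda p: p[0])
labels = disjoint_labels(points, nodes, edges)
print(len(nodes), "Mapper nodes,", len(set(labels)), "disjoint clusters")
```

In this sketch the overlapping nodes of the Mapper graph are collapsed into disjoint clusters without fixing the number of clusters in advance. Connected components only separate groups that are fully disconnected in the graph; a random-walk community detection method, as the abstract describes, would be needed to split groups that remain weakly connected.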
Suggested Citation
Esha Datta & Aditya Ballal & Javier E López & Leighton T Izu, 2023.
"MapperPlus: Agnostic clustering of high-dimension data for precision medicine,"
PLOS Digital Health, Public Library of Science, vol. 2(8), pages 1-16, August.
Handle:
RePEc:plo:pdig00:0000307
DOI: 10.1371/journal.pdig.0000307