Authors
- Yannis Schumann
(Deutsches Elektronen-Synchrotron DESY)
- Simon Schlumbohm
(Helmut-Schmidt-University Hamburg)
- Julia E. Neumann
(University Medical Center Hamburg-Eppendorf (UKE))
- Philipp Neumann
(Deutsches Elektronen-Synchrotron DESY; University of Hamburg)
Abstract
Data from high-throughput technologies that assess global patterns of biomolecules (omic data) are often afflicted with missing values and measurement-specific biases (batch effects), which hinder the quantitative comparison of independently acquired datasets. This work introduces batch-effect reduction trees (BERT), a high-performance method for data integration of incomplete omic profiles. We characterize BERT on large-scale data integration tasks with up to 5000 datasets from simulated and experimental data of different quantification techniques and omic types (proteomics, transcriptomics, metabolomics), as well as other data types (e.g., clinical data), emphasizing the broad scope of the algorithm. Compared to the only available method for integration of incomplete omic data, HarmonizR, our method (1) retains up to five orders of magnitude more numeric values, (2) leverages multi-core and distributed-memory systems for up to 11× runtime improvement, and (3) considers covariates and reference measurements to account for severely imbalanced or sparsely distributed conditions (up to 2× improvement of average silhouette width).
Suggested Citation
Yannis Schumann & Simon Schlumbohm & Julia E. Neumann & Philipp Neumann, 2025.
"High performance data integration for large-scale analyses of incomplete Omic profiles using Batch-Effect Reduction Trees (BERT),"
Nature Communications, Nature, vol. 16(1), pages 1-13, December.
Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-62237-4
DOI: 10.1038/s41467-025-62237-4