Authors
- Ismaila Muhammed
- Dimitris M Manias
- Dimitris A Goussis
- Haralampos Hatzikirou
Abstract
Biological systems inherently exhibit multi-scale dynamics, making accurate system identification particularly challenging due to the complexity of capturing a wide time scale spectrum. Traditional methods capable of addressing this issue rely on explicit equations, limiting their applicability in cases where only observational data are available. To overcome this limitation, we propose a data-driven framework that integrates the Sparse Identification of Nonlinear Dynamics (SINDy) method, the multi-scale analysis algorithm Computational Singular Perturbation (CSP) and neural networks (NNs). This framework allows the partition of the available dataset into subsets characterized by similar dynamics, so that system identification can proceed within these subsets without facing a wide time scale spectrum. Accordingly, when the full dataset does not allow SINDy to identify the proper model, CSP is employed for the generation of subsets of similar dynamics, which are then fed into SINDy. CSP requires the availability of the gradient of the vector field, which is estimated by the NNs. The framework is tested on the Michaelis-Menten model, for which various reduced models in analytic form exist at different parts of the phase space. It is demonstrated that the CSP-based data subsets allow SINDy to identify the proper reduced model in cases where the full dataset does not. In addition, it is demonstrated that the framework succeeds even in cases where the available dataset originates from stochastic versions of the Michaelis-Menten model. This framework is algorithmic, so system identification is not hindered by the dimensions of the dataset.
Author summary
Biological systems often evolve across multiple time scales, posing major challenges for constructing accurate models directly from data. Traditional model reduction techniques require explicit equations and thus cannot be applied when only observational data are available.
To address this, we developed a data-driven framework that combines Sparse Identification of Nonlinear Dynamics (SINDy), Computational Singular Perturbation (CSP) and neural networks (NNs). Our approach automatically partitions a dataset into subsets characterized by similar dynamics, allowing valid reduced models to be identified in each region. When SINDy fails to recover a global model from the full dataset, CSP, using Jacobian estimates from the NNs, isolates dynamical regimes where SINDy can be applied locally. We validated this framework using the Michaelis-Menten biochemical model, which is known to admit multiple reduced models in different regions of the phase space. Our method consistently identified the appropriate reduced dynamics, even when the data originated from stochastic simulations. Because our approach is algorithmic and equation-free, it is scalable to high-dimensional systems and robust to noise, offering a promising solution for data-driven model discovery in complex biological systems.
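The partitioning idea described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions: the rate constants and total enzyme concentration are hypothetical values, the analytic Jacobian of the deterministic Michaelis-Menten equations stands in for the NN estimate, Jacobian eigenvectors stand in for refined CSP basis vectors, and the tolerance TOL is an arbitrary choice. Each sample along a trajectory is labeled by whether the fast mode still contributes noticeably to the vector field, which splits the data into a transient subset and a slow subset.

```python
import numpy as np

# Hypothetical parameters (illustrative only, not taken from the paper).
K1, KM1, K2, E0 = 1.0, 1.0, 1.0, 0.01
TOL = 0.01  # assumed threshold for declaring the fast mode exhausted

def mm_rhs(y):
    """Mass-action Michaelis-Menten vector field for y = (s, c)."""
    s, c = y
    ds = -K1 * (E0 - c) * s + KM1 * c
    dc = K1 * (E0 - c) * s - (KM1 + K2) * c
    return np.array([ds, dc])

def mm_jacobian(y):
    """Analytic Jacobian of mm_rhs; stands in for an NN-estimated gradient."""
    s, c = y
    return np.array([
        [-K1 * (E0 - c), K1 * s + KM1],
        [K1 * (E0 - c), -(K1 * s + KM1 + K2)],
    ])

def time_scales(jac):
    """Time scales 1/|lambda_i|, sorted from fastest to slowest."""
    return np.sort(1.0 / np.abs(np.linalg.eigvals(jac)))

def fast_mode_fraction(y):
    """Norm of the fast-mode part of the vector field, relative to its total norm."""
    lam, A = np.linalg.eig(mm_jacobian(y))  # columns of A: right eigenvectors
    B = np.linalg.inv(A)                    # rows of B: dual (left) basis
    g = mm_rhs(y)
    f = B @ g                               # mode amplitudes, g = sum_i A[:, i] * f[i]
    i = np.argmax(np.abs(lam))              # index of the fastest mode
    return np.linalg.norm(A[:, i] * f[i]) / np.linalg.norm(g)

def rk4_trajectory(y0, dt, n_steps):
    """Fixed-step classical Runge-Kutta integration of mm_rhs."""
    ys = [np.asarray(y0, dtype=float)]
    for _ in range(n_steps):
        y = ys[-1]
        r1 = mm_rhs(y)
        r2 = mm_rhs(y + 0.5 * dt * r1)
        r3 = mm_rhs(y + 0.5 * dt * r2)
        r4 = mm_rhs(y + dt * r3)
        ys.append(y + dt * (r1 + 2 * r2 + 2 * r3 + r4) / 6.0)
    return np.array(ys)

# Generate synthetic "observations" and label every sample.
traj = rk4_trajectory([1.0, 0.0], dt=0.05, n_steps=400)
frac = np.array([fast_mode_fraction(y) for y in traj])
fast_active = frac > TOL

transient_subset = traj[fast_active]   # fast mode still active
slow_subset = traj[~fast_active]       # fast mode exhausted

print("fast/slow time scales at start:", time_scales(mm_jacobian(traj[0])))
print("samples in transient subset:", len(transient_subset))
print("samples in slow subset:", len(slow_subset))
```

In a workflow of the kind the abstract outlines, each subset would then be passed separately to a sparse regression step such as SINDy, so that the regression inside a subset does not have to span the full range of time scales.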
Suggested Citation
Ismaila Muhammed & Dimitris M Manias & Dimitris A Goussis & Haralampos Hatzikirou, 2025.
"Data-driven identification of biological systems using multi-scale analysis,"
PLOS Computational Biology, Public Library of Science, vol. 21(11), pages 1-22, November.
Handle: RePEc:plo:pcbi00:1013193
DOI: 10.1371/journal.pcbi.1013193