Authors
- Alaa Aldein M. S. Ibrahim (Discipline of Electrical, Electronic and Computer Engineering, University of KwaZulu-Natal, Durban 4041, South Africa)
- Mfanasibili Nkonyane (Umngeni-Uthukela Water, Pietermaritzburg 3201, South Africa)
- Mlondi Ngcobo (Umngeni-Uthukela Water, Pietermaritzburg 3201, South Africa)
- Tom Walingo (Discipline of Electrical, Electronic and Computer Engineering, University of KwaZulu-Natal, Durban 4041, South Africa)
- Jules-Raymond Tapamo (Discipline of Electrical, Electronic and Computer Engineering, University of KwaZulu-Natal, Durban 4041, South Africa)
Abstract
Accurate assessment of water quality is crucial for protecting public health and promoting environmental sustainability. Conventional laboratory-based methods for evaluating microbial contaminants are often time-consuming, resource-intensive, and reactive, which limits their usefulness for real-time water quality monitoring and management. This study examines the use of data-driven machine learning (ML) models to predict E. coli concentrations in Midmar Dam from readily available physicochemical parameters. Five classical standalone ML algorithms were compared: Random Forest (RF), Support Vector Machine (SVM), k-Nearest Neighbors (kNN), Artificial Neural Network (ANN), and Extreme Gradient Boosting (XGBoost). The models were assessed on their predictive performance using standard error metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE). Among the models evaluated, kNN performed best, achieving the lowest MSE and RMSE values, which indicates that it captured the complex relationships between physicochemical indicators and microbial contamination levels effectively. The findings demonstrate the potential of ML-based approaches to serve as efficient, scalable, and proactive tools for sustainable water-quality monitoring and management in dams.
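The three error metrics named in the abstract, and the idea behind kNN regression, can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline: the feature columns (pH, turbidity, temperature), the numeric values, and the helper names `mae`, `mse`, `rmse`, and `knn_predict` are all hypothetical and chosen only for illustration.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean Squared Error: average of (true - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

def knn_predict(train_X, train_y, query, k=3):
    """Average the targets of the k training rows nearest to `query`
    (Euclidean distance on the raw, unscaled features)."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Hypothetical, made-up values (NOT data from the study).
# Each row: (pH, turbidity, water temperature); target: an E. coli count.
train_X = [(7.0, 2.0, 18.0), (7.2, 3.0, 20.0), (7.5, 8.0, 24.0), (7.8, 10.0, 26.0)]
train_y = [10.0, 20.0, 80.0, 100.0]
test_X = [(7.1, 2.5, 19.0), (7.6, 9.0, 25.0)]
test_y = [12.0, 95.0]

preds = [knn_predict(train_X, train_y, x, k=2) for x in test_X]
print("predictions:", preds)
print("MAE:", mae(test_y, preds))
print("MSE:", mse(test_y, preds))
print("RMSE:", rmse(test_y, preds))
```

Because MSE and RMSE square the errors, they penalize large misses more heavily than MAE, so a model can rank best on MSE and RMSE without necessarily ranking best on MAE. In practice, features on different scales would usually be standardized before computing distances for kNN; that step is omitted here for brevity.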
Suggested Citation
Alaa Aldein M. S. Ibrahim & Mfanasibili Nkonyane & Mlondi Ngcobo & Tom Walingo & Jules-Raymond Tapamo, 2025. "Data-Driven Machine Learning Models for E. coli Concentration Prediction," Sustainability, MDPI, vol. 18(1), pages 1-20, December.
Handle:
RePEc:gam:jsusta:v:18:y:2025:i:1:p:179-:d:1825062