IDEAS home Printed from https://ideas.repec.org/a/gam/jsusta/v15y2022i1p522-d1017837.html

Machine Learning for Water Quality Assessment Based on Macrophyte Presence

Author

Listed:
  • Ivana Krtolica

    (The Institute for Artificial Intelligence Research and Development of Serbia, Fruškogorska 1, 21000 Novi Sad, Serbia)

  • Dragan Savić

    (KWR Water Research Institute, Groningenhaven 7, 3433 PE Nieuwegein, The Netherlands
    Centre for Water Systems, College of Engineering, Mathematics and Physical Sciences, University of Exeter, Exeter EX4 4QF, UK)

  • Bojana Bajić

    (The Institute for Artificial Intelligence Research and Development of Serbia, Fruškogorska 1, 21000 Novi Sad, Serbia
    Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovića 3, 21000 Novi Sad, Serbia)

  • Snežana Radulović

    (Faculty of Sciences, University of Novi Sad, Trg Dositeja Obradovića 3, 21000 Novi Sad, Serbia)

Abstract

The ecological state of the Danube River, as the world’s most international river basin, will always be the focus of scientists in the field of ecology and environmental engineering. The concentration of orthophosphate anions in the river is one of the main indicators of the ecological state, i.e., water quality and level of eutrophication. The sedentary nature and ability to survive in river sections, combined with the presence of high levels of orthophosphate anions, make macrophytes an appropriate biological parameter for in situ prediction of in-river monitoring processes. However, a preliminary literature review identified a lack of comprehensive analysis that can enable the prediction of the ecological state of rivers using biological parameters as the input to machine learning (ML) techniques. This work focuses on comparing eight state-of-the-art ML classification models developed for this task. The data were collected at 68 sampling sites on both river sides. The predictive models use macrophyte presence scores as input variables, and classes of the ecological state of the Danube River based on orthophosphate anions, converted into a binary scale, as outputs. The results of the predictive model comparisons show that support vector machines and tree-based models provided the best prediction capabilities. They are also a low-cost and sustainable solution to assess the ecological state of the rivers.
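As a purely illustrative sketch of the setup the abstract describes (macrophyte presence scores as inputs, a binarized ecological-state class as the output, and a cross-validated classifier), the following uses only the Python standard library. It is not the authors' pipeline: the data are synthetic, the 0 to 5 score range, the five species columns, the five-class ordinal scale, and the binarization threshold are all assumptions, and a depth-one decision tree (a "stump") stands in for the eight models compared in the paper. Only the figure of 68 sites comes from the abstract.

```python
def binarize(state_class, threshold=2):
    """Map an ordinal state class (1 = best) to 0/1. The threshold is an assumption."""
    return 0 if state_class <= threshold else 1


def fit_stump(X, y):
    """Fit a depth-one decision tree: pick the (feature, cut, label) rule
    'predict label when score >= cut, else 1 - label' with the best training accuracy."""
    best = (-1.0, 0, 0, 1)  # (accuracy, feature index, cut, label for score >= cut)
    for j in range(len(X[0])):
        for cut in sorted({row[j] for row in X}):
            for label in (0, 1):
                hits = sum(
                    (label if row[j] >= cut else 1 - label) == target
                    for row, target in zip(X, y)
                )
                acc = hits / len(y)
                if acc > best[0]:
                    best = (acc, j, cut, label)
    return best[1:]


def predict(stump, row):
    j, cut, label = stump
    return label if row[j] >= cut else 1 - label


def kfold_accuracy(X, y, k=4):
    """Mean held-out accuracy over k interleaved folds."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(y)) if i % k != fold]
        test = [i for i in range(len(y)) if i % k == fold]
        stump = fit_stump([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(stump, X[i]) == y[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / k


# Synthetic, noise-free data: 68 sites, 5 hypothetical species, scores 0..5.
X = [[(i * (j + 1) + j) % 6 for j in range(5)] for i in range(68)]
# Hypothetical rule: the state class worsens with the score of species 0.
state_classes = [1 + min(4, row[0]) for row in X]
y = [binarize(c) for c in state_classes]

print("stump (feature, cut, label):", fit_stump(X, y))
print("4-fold accuracy:", kfold_accuracy(X, y))
```

Because the synthetic labels are a deterministic function of one column, the stump recovers the rule exactly; real survey data would be noisy, and the held-out accuracy would be the quantity used to compare models.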

Suggested Citation

  • Ivana Krtolica & Dragan Savić & Bojana Bajić & Snežana Radulović, 2022. "Machine Learning for Water Quality Assessment Based on Macrophyte Presence," Sustainability, MDPI, vol. 15(1), pages 1-13, December.
  • Handle: RePEc:gam:jsusta:v:15:y:2022:i:1:p:522-:d:1017837

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/15/1/522/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/15/1/522/
    Download Restriction: no


