IDEAS home Printed from https://ideas.repec.org/a/gam/jsusta/v17y2025i19p8658-d1758793.html
   My bibliography  Save this article

Rethinking Evaluation Metrics in Hydrological Deep Learning: Insights from Torrent Flow Velocity Prediction

Author

Listed:
  • Walter Chen

    (Department of Civil Engineering, National Taipei University of Technology, Taipei 10608, Taiwan)

  • Kieu Anh Nguyen

    (Department of Civil Engineering, National Taipei University of Technology, Taipei 10608, Taiwan)

  • Bor-Shiun Lin

    (Ultron Technology Engineering Company, Taipei 11072, Taiwan)

Abstract

Accurate estimation of flow velocities in torrents and steep rivers is essential for flood risk assessment, sediment transport analysis, and the sustainable management of water resources. While deep learning models are increasingly applied to such tasks, their evaluation often depends on statistical metrics that may yield conflicting interpretations. The objective of this study is to clarify how different evaluation metrics influence the interpretation of hydrological deep learning models. We analyze two models of flow velocity prediction in a torrential creek in Taiwan. Although the models differ in architecture, the critical distinction lies in the datasets used: the first model was trained on May–June data, whereas the second model incorporated May–August data. Four performance metrics were examined—root mean square error (RMSE), Nash–Sutcliffe efficiency (NSE), Willmott’s index of agreement ( d ), and mean absolute percentage error (MAPE). Quantitatively, the first model attained RMSE = 0.0471 m/s, NSE = 0.519, and MAPE = 7.78%, whereas the second model produced RMSE = 0.0572 m/s, NSE = 0.678, and MAPE = 11.56%. The results reveal a paradox. The first model achieved lower RMSE and MAPE, indicating predictions closer to the observed values, but its NSE fell below the 0.65 threshold often cited by reviewers as grounds for rejection. In contrast, the second model exceeded this NSE threshold and would likely be considered acceptable, despite producing larger errors in absolute terms. This paradox highlights the novelty of the study: model evaluation outcomes can be driven more by data variability and the choice of metric than by model architecture. This underscores the risk of misinterpretation if a single metric is used in isolation. For sustainability-oriented hydrology, robust assessment requires reporting multiple metrics and interpreting them in a balanced manner to support disaster risk reduction, resilient water management, and climate adaptation.

Suggested Citation

  • Walter Chen & Kieu Anh Nguyen & Bor-Shiun Lin, 2025. "Rethinking Evaluation Metrics in Hydrological Deep Learning: Insights from Torrent Flow Velocity Prediction," Sustainability, MDPI, vol. 17(19), pages 1-16, September.
  • Handle: RePEc:gam:jsusta:v:17:y:2025:i:19:p:8658-:d:1758793
    as

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/17/19/8658/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/17/19/8658/
    Download Restriction: no
    ---><---

    More about this item

    Keywords

    ;
    ;
    ;
    ;
    ;
    ;
    ;
    ;

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jsusta:v:17:y:2025:i:19:p:8658-:d:1758793. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.