Printed from https://ideas.repec.org/a/gam/jsusta/v15y2023i4p3204-d1063504.html

Voice Pathology Detection Using a Two-Level Classifier Based on Combined CNN–RNN Architecture

Author

Listed:
  • Amel Ksibi

    (Department of Information Systems, College of Computer and Information Science, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia)

  • Nada Ali Hakami

    (Computer Science Department, College of Computer Science and Information Technology, Jazan University, Jazan 45142, Saudi Arabia)

  • Nazik Alturki

    (Department of Information Systems, College of Computer and Information Science, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia)

  • Mashael M. Asiri

    (Department of Computer Science, College of Science & Art at Mahayil, King Khalid University, Abha 62529, Saudi Arabia)

  • Mohammed Zakariah

    (College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia)

  • Manel Ayadi

    (Department of Information Systems, College of Computer and Information Science, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia)

Abstract

The construction of an automatic voice pathology detection system employing machine learning algorithms to study voice abnormalities is crucial for the early detection of voice pathologies and for identifying the specific type of pathology from which patients suffer. This paper’s primary objective is to construct a deep learning model for accurate speech pathology identification. Manual audio feature extraction was employed as a foundation for the categorization process. The most critical aspect of this work was incorporating an additional piece of information, i.e., voice gender, via a two-level classifier model. The first level determines whether the audio input is a male or female voice, and the second level determines whether the voice is pathological or healthy. Similar to the bulk of earlier efforts, the current study analyzed the audio signal by focusing solely on a single vowel, such as /a/, and ignoring phrases and other vowels. The analysis was performed on the Saarbruecken Voice Database. The two-level cascaded model attained an accuracy of 88.84% and an F1 score of 87.39%, which was superior to earlier attempts on the same dataset and provides a stepping stone towards a more precise early diagnosis of voice complications.
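
The two-level cascade described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the threshold-based classifiers below are toy stand-ins for the trained CNN–RNN models, the feature layout (mean pitch in Hz, jitter in percent) and all threshold values are invented for illustration, and the use of one second-level model per gender is an assumption that the abstract does not spell out.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]            # hypothetical layout: [mean_pitch_hz, jitter_percent]
Classifier = Callable[[Features], str]


def make_threshold_classifier(index: int, threshold: float,
                              below: str, above: str) -> Classifier:
    """Toy stand-in for a trained model: compares one feature with a threshold."""
    def classify(features: Features) -> str:
        return above if features[index] >= threshold else below
    return classify


def cascade_predict(features: Features,
                    gender_clf: Classifier,
                    pathology_clfs: Dict[str, Classifier]) -> Tuple[str, str]:
    """Level 1 predicts gender; level 2 applies the model chosen for that gender."""
    gender = gender_clf(features)
    label = pathology_clfs[gender](features)
    return gender, label


# Illustrative thresholds only; they are NOT taken from the paper.
GENDER_CLF = make_threshold_classifier(0, 165.0, below="male", above="female")
PATHOLOGY_CLFS = {
    "male": make_threshold_classifier(1, 1.0, below="healthy", above="pathological"),
    "female": make_threshold_classifier(1, 1.2, below="healthy", above="pathological"),
}

if __name__ == "__main__":
    # The same jitter value is routed to different second-level models.
    print(cascade_predict([120.0, 1.1], GENDER_CLF, PATHOLOGY_CLFS))
    print(cascade_predict([210.0, 1.1], GENDER_CLF, PATHOLOGY_CLFS))
```

In this sketch the point of the first level is only to select which second-level decision rule is applied, so an identical feature value can lead to different outcomes depending on the predicted gender.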

Suggested Citation

  • Amel Ksibi & Nada Ali Hakami & Nazik Alturki & Mashael M. Asiri & Mohammed Zakariah & Manel Ayadi, 2023. "Voice Pathology Detection Using a Two-Level Classifier Based on Combined CNN–RNN Architecture," Sustainability, MDPI, vol. 15(4), pages 1-18, February.
  • Handle: RePEc:gam:jsusta:v:15:y:2023:i:4:p:3204-:d:1063504

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/15/4/3204/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/15/4/3204/
    Download Restriction: no

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.