Author
Listed:
- Joana Filipa Teixeira Fernandes
(Research Centre in Digitalization and Intelligent Robotics (CeDRI), Laboratório para a Sustentabilidade e Tecnologia em Regiões de Montanha (SusTEC), Instituto Politécnico de Bragança (IPB)
Faculty of Engineering of University of Porto (FEUP))
- João Viana Pinto
(University Hospital Centre of São João, Otorhinolaryngology Department
University of Oporto, Faculty of Medicine, Unit of Otorhinolaryngology, Department of Surgery and Physiology
Centre for Health Technology and Services Research (CINTESIS), Rua Dr. Plácido da Costa)
- Carla Pinto Moura
(University of Porto, Genetics, Faculty of Medicine, Department of Pathology
University Hospital Centre of São João, Porto, Department of Otorhinolaryngology)
- Helena Vilarinho
(University Hospital Centre of São João, Porto, Department of Otorhinolaryngology
School of Health Sciences (ESSUA), University of Aveiro)
- Felipe Teixeira
(Research Centre in Digitalization and Intelligent Robotics (CeDRI), Laboratório para a Sustentabilidade e Tecnologia em Regiões de Montanha (SusTEC), Instituto Politécnico de Bragança (IPB)
Applied Management Research Unit (UNIAG) - Instituto Politécnico de Bragança (IPB)
School of Sciences and Technology, University of Trás-os-Montes and Alto Douro (UTAD), Quinta de Prados, Engineering Department)
- Diamantino Freitas
(Faculty of Engineering of University of Porto (FEUP))
- João Paulo Teixeira
(Research Centre in Digitalization and Intelligent Robotics (CeDRI), Laboratório para a Sustentabilidade e Tecnologia em Regiões de Montanha (SusTEC), Instituto Politécnico de Bragança (IPB))
Abstract
In this work, as in the literature, voice signals are classified as periodic (type 1), as having only some periodicity (type 2), or as chaotic (type 3). This work aims to classify signals into types 1, 2, or 3, to be subsequently applied in a classification system for pathological/control signals. The original dataset is composed of 466 type 1, 900 type 2, and 84 type 3 individuals, classified by an otolaryngologist. 15% of the data was reserved for testing and the remaining 85% was used for training and validation. A data augmentation technique was applied to balance the training data; after augmentation, 3380 sounds were available for training and validation: 1020 of type 1, 1280 of type 2, and 1080 of type 3. Of these, 80% were used for training and 20% for validation. The Mel spectrograms of the signals were used as input to a VGGish network, which was retrained to classify the 3 types of signals. This network obtained a test accuracy of 71.2%.
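The abstract describes a two-stage pipeline: log-Mel spectrograms are extracted from the voice recordings, and a VGGish network is then retrained on the augmented, balanced training set to separate the three signal types. The Python sketch below illustrates how such a front end and the 85/15 hold-out split might be set up; the sample rate, number of Mel bands, random seed, and function names are illustrative assumptions, not values or code from the paper, and the VGGish retraining step is only indicated in comments.

```python
# Minimal sketch (assumed parameters): log-Mel spectrogram extraction and a
# stratified 85/15 hold-out split, mirroring the pipeline in the abstract.
import numpy as np
import librosa
from sklearn.model_selection import train_test_split

def log_mel_spectrogram(wav_path, sr=16000, n_mels=64):
    """Load a voice recording and return its log-Mel spectrogram in dB."""
    y, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

def split_dataset(spectrograms, labels):
    """Hold out 15% of the data for testing, stratified by signal type.

    Augmentation would then be applied to the remaining 85% only, which is
    finally divided 80/20 into training and validation subsets before
    retraining the VGGish classifier.
    """
    return train_test_split(
        spectrograms, labels, test_size=0.15, stratify=labels, random_state=0
    )
```

A VGGish backbone (for example, one of the publicly available PyTorch or TensorFlow Hub ports) would then map each spectrogram to an embedding, with a three-class softmax head retrained on the augmented training set; the abstract does not specify which implementation the authors used.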
Suggested Citation
Joana Filipa Teixeira Fernandes & João Viana Pinto & Carla Pinto Moura & Helena Vilarinho & Felipe Teixeira & Diamantino Freitas & João Paulo Teixeira, 2025.
"Nonperiodic Pathologic Voice Signals Classification Using Mel-Spectrogram and VGGish,"
Springer Proceedings in Business and Economics, in: Pedro Miguel Gaspar & Juan Manuel Cueva Lovelle & Carlos Montenegro-Marín & Teresa Guarda (ed.), Health Technologies and Demographic Challenges, pages 3-13,
Springer.
Handle:
RePEc:spr:prbchp:978-3-031-94901-2_1
DOI: 10.1007/978-3-031-94901-2_1
Download full text from publisher
To our knowledge, this item is not available for download. To find out whether it is available, there are three options:
1. Check below whether another version of this item is available online.
2. Check on the provider's web page whether it is in fact available.
3. Perform a search for a similarly titled item that would be available.
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:spr:prbchp:978-3-031-94901-2_1. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help add them by using this form.
If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing (email available below). General contact details of provider: http://www.springer.com.
Please note that corrections may take a couple of weeks to filter through
the various RePEc services.