
Multi-Class Automated Speech Language Recognition Using Natural Language Processing With Optimal Deep Learning Model

Author

Listed:
  • REEMA G. AL-ANAZI

    (Department of Arabic Language and Literature, College of Humanities and Social Sciences, Princess Nourah bint Abdulrahman University, P. O. Box 84428, Riyadh 11671, Saudi Arabia)

  • HAMED ALQAHTANI

    (Department of Information Systems, College of Computer Science, Center of Artificial Intelligence, Unit of Cybersecurity, King Khalid University, Abha, Saudi Arabia)

  • MUHAMMAD SWAILEH A. ALZAIDI

    (Department of English Language, College of Language Sciences, King Saud University, P. O. Box 145111, Riyadh, Saudi Arabia)

  • MESHARI H. ALANAZI

    (Department of Computer Science, College of Sciences, Northern Border University, Arar, Saudi Arabia)

  • HANAN AL SULTAN

    (Department of English, College of Arts, King Faisal University, Ahsaa, Saudi Arabia)

  • AMAL F. ALROWAILY

    (Department of Family Medicine, King Abdulaziz Medical City, Ministry of National Guard-Health Affairs, Riyadh, Saudi Arabia)

  • JAWHARA ALJABRI

    (Department of Computer Science, University College in Umluj, University of Tabuk, Tabuk, Saudi Arabia)

  • ASSAL ALQUDAH

    (Department of Computer Science, AlZaytoonah University of Jordan, Amman, Jordan)

Abstract

As technology has advanced, human–computer interaction (HCI) has improved, and spoken communication between humans and machines is one way to enhance and expedite it. In recent decades, researchers have explored several systems to improve speech and speaker recognition performance. A crucial challenge in HCI is developing models that can listen and respond effectively, as humans do. This challenge led to the development of automated speech emotion recognition (SER) methods, which recognize various emotional classes by selecting and extracting effective features from speech signals. The fundamental problem of automated speech detection is the considerable variation in speech signals arising from distinct speakers, language differences, speech differences, content, acoustic conditions, and voice modulation differences based on age and gender. With advances in deep learning (DL) and the affordability of computational resources, specifically graphics processing units (GPUs), research underwent a paradigm shift. Therefore, this study develops a multi-class automated speech language recognition using natural language processing with optimal deep learning (MASLR-NLPODL) technique. The MASLR-NLPODL technique aims to identify different spoken languages efficiently. Its initial preprocessing stage involves windowing, frame blocking, and pre-emphasis. Next, an adaptive time-frequency feature extractor based on the discrete fractional Fourier transform (DFrFT) is applied; the DFrFT is obtained by extending the discrete Fourier transform (DFT) with eigenvectors. An improved Harris hawks optimization (IHHO) technique is employed to select effective features. A gated recurrent unit (GRU) model then classifies the spoken languages. Finally, salp swarm algorithm (SSA)-based hyperparameter selection enhances the performance of the GRU model. The novelty of the work lies in the design of the IHHO-based feature selection and SSA-based hyperparameter tuning process. The MASLR-NLPODL technique is evaluated on the VoxForge dataset. In experimental validation, it achieved an accuracy of 96.40%, outperforming existing techniques.
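
The preprocessing stage named in the abstract (pre-emphasis, frame blocking, windowing) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no parameters, so the pre-emphasis coefficient (0.97), the frame length and hop (400 and 160 samples, i.e. 25 ms and 10 ms at an assumed 16 kHz sampling rate), the Hamming window, and the conventional ordering of the three steps are all assumptions, and every function name is hypothetical.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """Boost high frequencies: y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


def frame_blocking(signal, frame_len, hop):
    """Split the signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def hamming(frame_len):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if frame_len == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
            for n in range(frame_len)]


def apply_window(frames, window):
    """Multiply every frame sample-by-sample with the window."""
    return [[s * w for s, w in zip(frame, window)] for frame in frames]


def preprocess(signal, frame_len=400, hop=160, alpha=0.97):
    """Assumed order: pre-emphasis, then frame blocking, then windowing."""
    emphasized = pre_emphasis(signal, alpha)
    frames = frame_blocking(emphasized, frame_len, hop)
    return apply_window(frames, hamming(frame_len))


if __name__ == "__main__":
    # Toy input: 0.1 s of a 440 Hz tone at an assumed 16 kHz sampling rate.
    sample_rate = 16000
    tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]
    frames = preprocess(tone)
    print(len(frames), len(frames[0]))
```

Each windowed frame produced this way would then be passed to a time-frequency transform such as the DFrFT mentioned in the abstract; that step, along with feature selection and classification, is outside the scope of this sketch.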

Suggested Citation

  • Reema G. Al-Anazi & Hamed Alqahtani & Muhammad Swaileh A. Alzaidi & Meshari H. Alanazi & Hanan Al Sultan & Amal F. Alrowaily & Jawhara Aljabri & Assal Alqudah, 2025. "Multi-Class Automated Speech Language Recognition Using Natural Language Processing With Optimal Deep Learning Model," FRACTALS (fractals), World Scientific Publishing Co. Pte. Ltd., vol. 33(02), pages 1-15.
  • Handle: RePEc:wsi:fracta:v:33:y:2025:i:02:n:s0218348x25400213
    DOI: 10.1142/S0218348X25400213

    Download full text from publisher

    File URL: http://www.worldscientific.com/doi/abs/10.1142/S0218348X25400213
    Download Restriction: Access to full text is restricted to subscribers

