A Hybrid Convolutional Bi-Directional Gated Recurrent Unit System for Spoken Languages of JK and Ladakhi

Authors

  • Irshad Ahmad Thukroo

    (Department of Computer Science, Islamic University of Science & Technology, 1-University Avenue, Awantipora, Pulwama 192122, Jammu and Kashmir, India)

  • Rumaan Bashir

    (Department of Computer Science, Islamic University of Science & Technology, 1-University Avenue, Awantipora, Pulwama 192122, Jammu and Kashmir, India)

  • Kaiser J. Giri

    (Department of Computer Science, Islamic University of Science & Technology, 1-University Avenue, Awantipora, Pulwama 192122, Jammu and Kashmir, India)

Abstract

Spoken language identification is the process of recognising the language spoken in an audio segment. It is a precursor to several technologies, such as automatic call routing, language recognition, multilingual conversation, language parsing, and sentiment analysis. Language identification is a challenging task for low-resource languages like Kashmiri and Ladakhi, spoken in the union territories (UTs) of Jammu and Kashmir (JK) and Ladakh, India. This is mainly due to speaker variations such as duration, moderator, and ambience, particularly when training and testing are done on different datasets to assess the accuracy of a language identification system in actual deployment, which produces low accuracy. To tackle this problem, we propose a hybrid convolutional bi-directional gated recurrent unit (Bi-GRU) model that exploits both the static and the dynamic behaviour of the audio signal in order to achieve better results than state-of-the-art models. The audio signals are first converted into two-dimensional structures called Mel-spectrograms, which represent the frequency distribution over time. To investigate the spectral behaviour of the audio signals, we employ a convolutional neural network (CNN) that perceives Mel-spectrograms in multiple dimensions. The CNN-learned feature vector serves as input to the Bi-GRU, which preserves the dynamic behaviour of the audio signal. Experiments are done on six spoken languages: Ladakhi, Kashmiri, Hindi, Urdu, English, and Dogri. The data corpora used for experimentation are the International Institute of Information Technology Hyderabad-Indian Language Speech Corpus (IIITH-ILSC) and a self-created data corpus for the Ladakhi language. The model is tested on two datasets, one speaker-dependent and one speaker-independent. On these datasets the proposed model achieves its best accuracies of 99% and 91%, respectively, which are promising results in comparison to the available state-of-the-art models.
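
The data flow described in the abstract (audio, then Mel-spectrogram, then a recurrent model read in both directions) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the CNN stage is omitted, so normalised log-Mel frames stand in for the CNN-learned feature sequence, and the sampling rate, FFT size, hop length, number of Mel bands, hidden size, and randomly initialised (untrained) weights are all assumptions chosen only to make the shapes concrete.

```python
import numpy as np

LANGUAGES = ["Ladakhi", "Kashmiri", "Hindi", "Urdu", "English", "Dogri"]

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr=16000, n_fft=512, hop=160, n_mels=40):
    """Frame the signal, apply a Hann window, and return (n_frames, n_mels) log-mel energies."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(input_dim, hidden_dim, rng):
    """Random, untrained parameters; index 0 = update gate, 1 = reset gate, 2 = candidate."""
    return {
        "W": rng.normal(0.0, 0.1, (3, hidden_dim, input_dim)),
        "U": rng.normal(0.0, 0.1, (3, hidden_dim, hidden_dim)),
        "b": np.zeros((3, hidden_dim)),
    }

def gru_forward(x_seq, p):
    """Run one GRU over a (T, input_dim) sequence and return all hidden states, (T, hidden_dim)."""
    h = np.zeros(p["b"].shape[1])
    states = []
    for x in x_seq:
        z = sigmoid(p["W"][0] @ x + p["U"][0] @ h + p["b"][0])
        r = sigmoid(p["W"][1] @ x + p["U"][1] @ h + p["b"][1])
        h_cand = np.tanh(p["W"][2] @ x + p["U"][2] @ (r * h) + p["b"][2])
        h = (1.0 - z) * h + z * h_cand
        states.append(h)
    return np.stack(states)

def bigru_forward(x_seq, p_fwd, p_bwd):
    """Concatenate a left-to-right pass with a right-to-left pass re-aligned in time."""
    fwd = gru_forward(x_seq, p_fwd)
    bwd = gru_forward(x_seq[::-1], p_bwd)[::-1]
    return np.concatenate([fwd, bwd], axis=1)

rng = np.random.default_rng(0)
audio = rng.standard_normal(16000)  # one second of synthetic noise standing in for speech
feats = log_mel_spectrogram(audio)
feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)  # per-band normalisation
hidden = 32
states = bigru_forward(feats, init_gru(40, hidden, rng), init_gru(40, hidden, rng))

# Hypothetical classifier head: average over time, then a random linear layer and softmax.
logits = rng.normal(0.0, 0.1, (len(LANGUAGES), 2 * hidden)) @ states.mean(axis=0)
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

Because the weights are random, `probs` carries no information about the input; the sketch only shows how each time frame receives context from both earlier and later frames before the six-way language decision is made.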

Suggested Citation

  • Irshad Ahmad Thukroo & Rumaan Bashir & Kaiser J. Giri, 2023. "A Hybrid Convolutional Bi-Directional Gated Recurrent Unit System for Spoken Languages of JK and Ladakhi," Journal of Information & Knowledge Management (JIKM), World Scientific Publishing Co. Pte. Ltd., vol. 22(04), pages 1-23, August.
  • Handle: RePEc:wsi:jikmxx:v:22:y:2023:i:04:n:s0219649223500284
    DOI: 10.1142/S0219649223500284

    Download full text from publisher

    File URL: http://www.worldscientific.com/doi/abs/10.1142/S0219649223500284
    Download Restriction: Access to full text is restricted to subscribers

