IDEAS home Printed from https://ideas.repec.org/a/gam/jmathe/v11y2023i4p904-d1064443.html

Identification and Correction of Grammatical Errors in Ukrainian Texts Based on Machine Learning Technology

Author

Listed:
  • Vasyl Lytvyn

    (Information Systems and Networks Department, Lviv Polytechnic National University, 12 Bandera Str., 79013 Lviv, Ukraine)

  • Petro Pukach

    (Institute of Applied Mathematics and Fundamental Sciences, Lviv Polytechnic National University, 12 Bandera Str., 79013 Lviv, Ukraine)

  • Victoria Vysotska

    (Information Systems and Networks Department, Lviv Polytechnic National University, 12 Bandera Str., 79013 Lviv, Ukraine;
    Institute of Computer Science, Osnabrück University, 1 Friedrich-Janssen-Str., 49076 Osnabrück, Germany)

  • Myroslava Vovk

    (Institute of Applied Mathematics and Fundamental Sciences, Lviv Polytechnic National University, 12 Bandera Str., 79013 Lviv, Ukraine)

  • Nataliia Kholodna

    (Information Systems and Networks Department, Lviv Polytechnic National University, 12 Bandera Str., 79013 Lviv, Ukraine)

Abstract

A machine learning model for correcting errors in Ukrainian texts has been developed. The neural network can correct simple sentences written in Ukrainian; however, a full-fledged system also requires dictionary-based spell-checking and rule checking, covering both simple rules and rules based on dependency parsing results or other features. To save computing resources, a pre-trained BERT (Bidirectional Encoder Representations from Transformers) type neural network was used. Such neural networks have half as many parameters as other pre-trained models and show satisfactory results in correcting grammatical and stylistic errors. Among the ready-made neural network models, the pre-trained mT5 model (a multilingual variant of T5, the Text-to-Text Transfer Transformer) showed the best performance according to the BLEU (Bilingual Evaluation Understudy) and METEOR (Metric for Evaluation of Translation with Explicit ORdering) metrics.
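
The abstract reports BLEU scores but this listing does not show how they were computed. As a rough illustration only, the sketch below computes an unsmoothed sentence-level BLEU score against a single reference. The whitespace tokenization, the single reference, the absence of smoothing, and the function names `ngrams` and `sentence_bleu` are all assumptions made for this example; they are not taken from the paper, and real evaluations normally use an established library.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference (illustrative)."""
    # Assumption: plain whitespace tokenization.
    ref, hyp = reference.split(), hypothesis.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp, n)
        ref_counts = ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or matched == 0:
            # No smoothing: a zero precision at any order zeroes the score.
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    print(sentence_bleu(reference, "the cat sat on the mat"))
    print(sentence_bleu(reference, "the cat sat on a mat"))
```

A corrected sentence that matches the reference exactly scores 1.0, while one that shares no words with it scores 0.0. Because this sketch applies no smoothing, any hypothesis shorter than four tokens also scores 0.0, which is one reason sentence-level BLEU is usually smoothed or reported at corpus level.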

Suggested Citation

  • Vasyl Lytvyn & Petro Pukach & Victoria Vysotska & Myroslava Vovk & Nataliia Kholodna, 2023. "Identification and Correction of Grammatical Errors in Ukrainian Texts Based on Machine Learning Technology," Mathematics, MDPI, vol. 11(4), pages 1-19, February.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:4:p:904-:d:1064443

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/4/904/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/4/904/
    Download Restriction: no
