
A Two-Step Data Normalization Approach for Improving Classification Accuracy in the Medical Diagnosis Domain

Author

Listed:
  • Ivan Izonin

    (Department of Artificial Intelligence, Lviv Polytechnic National University, 79013 Lviv, Ukraine)

  • Roman Tkachenko

    (Department of Publishing Information Technologies, Lviv Polytechnic National University, 79013 Lviv, Ukraine)

  • Nataliya Shakhovska

    (Department of Artificial Intelligence, Lviv Polytechnic National University, 79013 Lviv, Ukraine)

  • Bohdan Ilchyshyn

    (Department of Artificial Intelligence, Lviv Polytechnic National University, 79013 Lviv, Ukraine)

  • Krishna Kant Singh

    (Department of Computer Science and Engineering, Jain (Deemed to Be University), Bangalore 560069, India)

Abstract

Data normalization is one of the first preprocessing tasks performed in intelligent data analysis, particularly for tabular data. It is needed to reduce the sensitivity of the artificial intelligence model to the values of the features in the dataset and thereby increase the model's adequacy. This paper addresses effective data preprocessing as a way to improve the accuracy of intelligent analysis in medical diagnostic tasks. We developed a new two-step normalization method for numerical medical datasets. The method takes into account both the interdependencies between the features within each observation and the absolute values of those features in order to improve accuracy in medical data mining tasks. We describe and substantiate each step of the algorithmic implementation and visualize the results of the method. The method was evaluated with six machine learning methods based on decision trees on binary and multiclass classification tasks. The experiments used six real-world, freely available medical datasets with different numbers of vectors, attributes, and classes. We compared the effectiveness of the developed method with that of five existing data normalization methods. The experiments showed that the developed method increases the accuracy of the Decision Tree and Extra Trees classifiers by 1–5% on the binary classification task, and the accuracy of the Bagging, Decision Tree, and Extra Trees classifiers by 1–6% on the multiclass classification task. Because these accuracy gains come solely from the new normalization method, the results support its practical application in a variety of medical data mining tasks.
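
The abstract does not spell out the two steps, so the following is only an illustrative sketch of one way to combine a column-wise view (absolute feature values) with a row-wise view (relationships between the features of one observation). The choice of scalers (max-abs scaling by column, unit Euclidean norm by row), their order, and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import math


def max_abs_scale_columns(rows):
    """Divide every column by its maximum absolute value (all-zero columns stay zero)."""
    n_cols = len(rows[0])
    col_max = [max(abs(row[j]) for row in rows) for j in range(n_cols)]
    return [
        [row[j] / col_max[j] if col_max[j] else 0.0 for j in range(n_cols)]
        for row in rows
    ]


def unit_norm_rows(rows):
    """Scale every observation (row) to unit Euclidean length (all-zero rows stay zero)."""
    result = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        result.append([v / norm if norm else 0.0 for v in row])
    return result


def two_step_normalize(rows):
    """Assumed two-step scheme: column-wise scaling first, then row-wise scaling.

    This is a hypothetical reading of "two-step normalization", not the
    authors' published algorithm.
    """
    return unit_norm_rows(max_abs_scale_columns(rows))


# Toy data: two observations whose features live on very different scales.
data = [[2.0, 100.0], [4.0, 50.0]]
normalized = two_step_normalize(data)
for row in normalized:
    print([round(v, 4) for v in row])
```

In this sketch the column step brings features to comparable magnitudes, and the row step then expresses each observation as a direction, so that the proportions between its features are what the classifier sees. Whether this ordering matches the published method should be checked against the full text.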

Suggested Citation

  • Ivan Izonin & Roman Tkachenko & Nataliya Shakhovska & Bohdan Ilchyshyn & Krishna Kant Singh, 2022. "A Two-Step Data Normalization Approach for Improving Classification Accuracy in the Medical Diagnosis Domain," Mathematics, MDPI, vol. 10(11), pages 1-18, June.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:11:p:1942-:d:832476

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/11/1942/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/11/1942/
    Download Restriction: no

    Citations



    Cited by:

    1. Gaeithry Manoharam & Mohd Shareduwan Mohd Kasihmuddin & Siti Noor Farwina Mohamad Anwar Antony & Nurul Atiqah Romli & Nur ‘Afifah Rusdi & Suad Abdeen & Mohd. Asyraf Mansor, 2023. "Log-Linear-Based Logic Mining with Multi-Discrete Hopfield Neural Network," Mathematics, MDPI, vol. 11(9), pages 1-30, April.
    2. Samuka Mohanty & Rajashree Dash, 2023. "A New Dual Normalization for Enhancing the Bitcoin Pricing Capability of an Optimized Low Complexity Neural Net with TOPSIS Evaluation," Mathematics, MDPI, vol. 11(5), pages 1-28, February.
