Author
Listed:
- Ramoni Tirimisiyu Amosa
(Department of Computer Science, School of Applied Sciences, Federal Polytechnic Ede, Osun State. Nigeria)
- Ileladewa Adeoye Abiodun
(Department of Computer Science, School of Applied Sciences, Federal Polytechnic Ede, Osun State. Nigeria)
- Olorunlomerue Adam Biodun
(Department of Computer Science, School of Applied Sciences, Federal Polytechnic Ede, Osun State. Nigeria)
- Lawal Moshood Olatunji
(Department of Computer Science, School of Applied Sciences, Federal Polytechnic Ede, Osun State. Nigeria)
- Ugwu Jennifer Ifeoma
(Department of Computer Science, School of Applied Sciences, Federal Polytechnic Ede, Osun State. Nigeria)
Abstract
Malaria remains a significant global health challenge, particularly in tropical and subtropical regions. Traditional methods of malaria prediction rely on historical data and basic statistical analysis, which often lack the accuracy needed for effective disease control. In recent years, machine learning (ML) techniques have emerged as powerful tools for malaria prediction, offering improved accuracy and reliability. This study evaluates the performance of different ML models, including Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), and Logistic Regression (LR), for malaria disease prediction. The dataset consists of microscopic blood sample images categorized as parasite-infected or uninfected. Because the dataset is imbalanced, three data balancing techniques (oversampling, undersampling, and data augmentation) were applied to enhance model performance. The models were compared using key performance metrics: accuracy, precision, recall, F1-score, and ROC-AUC. The results indicate that Random Forest with undersampling achieved the highest accuracy (79.07%) and ROC-AUC (90.24%), making it the most effective model. While oversampling and data augmentation improved recall, they did not significantly enhance overall performance. SVM and Logistic Regression demonstrated stable performance but lagged behind Random Forest, whereas KNN exhibited high recall (97.50%) but suffered from low accuracy due to excessive false positives. The findings suggest that undersampling, particularly with Random Forest, is the most effective approach for malaria prediction in imbalanced datasets. This study highlights the potential of machine learning in enhancing malaria diagnosis and resource allocation, offering valuable insights for disease control strategies.
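The abstract names random undersampling and a set of evaluation metrics but does not show how they were computed. The following is a minimal sketch of both ideas using only the Python standard library. It is not the authors' code: the function names `random_undersample` and `binary_metrics`, the 30/70 class split, and the "predict everything as infected" classifier are all hypothetical and chosen only for illustration.

```python
import random


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples from larger classes until every class
    has as many samples as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        # Draw `target` samples without replacement from each class.
        for sample in rng.sample(group, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [l for _, l in balanced]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy, made-up labels (NOT the study's data): 30 infected (1), 70 uninfected (0).
toy_labels = [1] * 30 + [0] * 70
toy_samples = list(range(len(toy_labels)))  # stand-ins for image features

balanced_samples, balanced_labels = random_undersample(toy_samples, toy_labels)
print(balanced_labels.count(1), balanced_labels.count(0))

# A degenerate classifier that flags every sample as infected: it finds all
# positives (perfect recall) but produces many false positives.
always_positive = [1] * len(toy_labels)
print(binary_metrics(toy_labels, always_positive))
```

On the toy labels, the always-positive classifier reaches a recall of 1.0 while its accuracy equals the positive-class share (0.3). This is the same qualitative pattern the abstract reports for KNN: high recall alongside low accuracy caused by false positives.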
Suggested Citation
Ramoni Tirimisiyu Amosa & Ileladewa Adeoye Abiodun & Olorunlomerue Adam Biodun & Lawal Moshood Olatunji & Ugwu Jennifer Ifeoma, 2025.
"Evaluating the Impact of Imbalanced Data on Malaria Prediction Accuracy,"
International Journal of Research and Innovation in Applied Science, International Journal of Research and Innovation in Applied Science (IJRIAS), vol. 10(4), pages 57-65, April.
Handle:
RePEc:bjf:journl:v:10:y:2025:i:4:p:57-65