Authors
- Nabilah Afrin
- Masud An-Nur Islam Fahim
- Wasan Alamro
- Yazan M Allawi
- Ahmad Abadleh
- Salman Md Sultan
- Ersin Elbasi
- Aymen I Zreikat
Abstract
Medical image classification requires models that capture both fine-grained local patterns and global anatomical structures while remaining computationally efficient enough for clinical deployment. Although state-of-the-art models such as MedMamba use State-Space Models (SSMs) to balance accuracy and efficiency, their sequential operations limit parallelism and increase runtime. To overcome these limitations, we propose MedSpectralNet, a lightweight Convolutional Neural Network (CNN) architecture that approximates self-attention with linear complexity to efficiently extract multi-frequency features. The model introduces a dual-stream feature extractor that processes global and local information in parallel, and a ContextGate block that adaptively fuses multi-scale representations. Evaluated on six benchmark datasets from MedMNIST (BloodMNIST, BreastMNIST, DermaMNIST, PneumoniaMNIST, OrganCMNIST, and OrganSMNIST), MedSpectralNet achieves an average accuracy of 93.7% on OrganCMNIST and 98.0% on BloodMNIST, a 1–4.3% relative accuracy gain over larger transformer-based models. Importantly, it delivers this performance with only 8.5 million parameters, about 59% of the 14.5 million required by MedMamba-T (roughly 41% fewer). MedSpectralNet also achieves AUC values of up to 0.999 across multiple classes, demonstrating state-of-the-art accuracy with substantially reduced computational cost and improved parallelization, which makes it well suited to real-time and resource-constrained medical classification applications.
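The parameter comparison in the abstract can be checked with a short calculation. The sketch below is illustrative only: the two parameter counts are the figures quoted in the abstract, while the function and variable names are hypothetical and not taken from the paper.

```python
def relative_reduction(small: float, large: float) -> float:
    """Return the fraction by which `small` is lower than `large`."""
    return (large - small) / large


# Parameter counts in millions, as quoted in the abstract.
medspectralnet_params_m = 8.5
medmamba_t_params_m = 14.5

# Size of MedSpectralNet as a share of MedMamba-T.
ratio = medspectralnet_params_m / medmamba_t_params_m

# Share of parameters saved relative to MedMamba-T.
reduction = relative_reduction(medspectralnet_params_m, medmamba_t_params_m)

print(f"size ratio: {ratio:.1%}")
print(f"fewer parameters: {reduction:.1%}")
```

With these figures, MedSpectralNet has roughly 59% as many parameters as MedMamba-T, which corresponds to a reduction of roughly 41%.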
Suggested Citation
Nabilah Afrin & Masud An-Nur Islam Fahim & Wasan Alamro & Yazan M Allawi & Ahmad Abadleh & Salman Md Sultan & Ersin Elbasi & Aymen I Zreikat, 2026.
"MedSpectralNet: A lightweight convolutional neural network architecture for multi-modal image classification,"
PLOS ONE, Public Library of Science, vol. 21(4), pages 1-25, April.
Handle: RePEc:plo:pone00:0346128
DOI: 10.1371/journal.pone.0346128