Printed from https://ideas.repec.org/a/gam/jftint/v18y2026i2p102-d1865141.html

Image-Based Malware Classification Using DCGAN-Augmented Data and a CNN–Transformer Hybrid Model

Authors

  • Manya Dhingra

    (Department of Information Technology, Bharati Vidyapeeth’s College of Engineering, New Delhi 110063, India)

  • Achin Jain

    (Department of Information Technology, Bharati Vidyapeeth’s College of Engineering, New Delhi 110063, India)

  • Niharika Thakur

    (Department of Electronics and Communication Engineering, Manav Rachna University, Faridabad 121004, India)

  • Anurag Choubey

    (Department of Computer Science and Engineering, Technocrats Institute of Technology, Anandnagar, Bhopal 462022, India)

  • Massimo Donelli

    (Department of Civil, Environmental and Mechanical Engineering, University of Trento, 38123 Trento, Italy)

  • Arun Kumar Dubey

    (Department of Information Technology, Bharati Vidyapeeth’s College of Engineering, New Delhi 110063, India)

  • Arvind Panwar

    (School of Computer Science and Engineering, Galgotias University, Greater Noida 201308, India)

Abstract

With the rapid growth and diversification of malware, accurate multi-class detection remains challenging due to severe class imbalance and limited labeled data. This work presents an image-based malware classification framework that converts executable binaries into 64 × 64 grayscale images, applies class-wise DCGAN augmentation to mitigate the imbalance (initial imbalance ratio >12 across 31 families, N ≈ 9300), and trains a hybrid CNN–Transformer model that captures both local texture features and long-range contextual dependencies. The DCGAN generator produces high-fidelity synthetic samples, evaluated using Inception Score (IS) = 3.43, Fréchet Inception Distance (FID) = 10.99, and Kernel Inception Distance (KID) = 0.0022, and these samples are used to equalize class counts before classifier training. On the blended dataset of real and synthetic samples, the proposed GAN-balanced CNN–Transformer achieves an overall accuracy of 95% and a macro-averaged F1-score of 0.95; the hybrid model also attains a validation accuracy of ≈94% while substantially improving minority-class recognition. Compared to CNN-only and Transformer-only baselines, the hybrid approach yields more stable convergence, reduced overfitting, and stronger per-class performance, while remaining feasible for practical deployment. These results indicate that DCGAN-driven balancing combined with CNN–Transformer feature fusion is an effective, scalable approach to robust malware family classification.
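The preprocessing and balancing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and the truncate-or-zero-pad strategy for fitting a binary into 64 × 64 pixels is an assumption, since the abstract does not say how binaries of differing sizes are mapped to a fixed image size. The sketch shows (1) mapping raw bytes to a grayscale grid, (2) computing the imbalance ratio as largest class count over smallest, and (3) how many synthetic samples per class a generator would need to supply to equalize counts.

```python
from collections import Counter


def bytes_to_grayscale(data: bytes, side: int = 64) -> list[list[int]]:
    """Map raw bytes to a side x side grid of pixel intensities (0-255).

    Assumption: the input is truncated or zero-padded to side*side bytes.
    The paper may use a different resizing scheme.
    """
    n = side * side
    padded = data[:n].ljust(n, b"\x00")
    return [list(padded[r * side:(r + 1) * side]) for r in range(side)]


def imbalance_ratio(labels: list[str]) -> float:
    """Ratio of the largest class count to the smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def synthetic_needed(labels: list[str]) -> dict[str, int]:
    """Per-class number of synthetic samples needed to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - count for cls, count in counts.items()}


if __name__ == "__main__":
    # Toy stand-in for an executable: 2560 bytes, zero-padded to 4096.
    image = bytes_to_grayscale(bytes(range(256)) * 10)
    print(len(image), len(image[0]))

    # Toy labels for two hypothetical families.
    labels = ["family_a"] * 26 + ["family_b"] * 2
    print(imbalance_ratio(labels))
    print(synthetic_needed(labels))
```

In a class-wise setup such as the one described, the counts returned by `synthetic_needed` would be the number of images to draw from each family's generator before training the classifier.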

Suggested Citation

  • Manya Dhingra & Achin Jain & Niharika Thakur & Anurag Choubey & Massimo Donelli & Arun Kumar Dubey & Arvind Panwar, 2026. "Image-Based Malware Classification Using DCGAN-Augmented Data and a CNN–Transformer Hybrid Model," Future Internet, MDPI, vol. 18(2), pages 1-23, February.
  • Handle: RePEc:gam:jftint:v:18:y:2026:i:2:p:102-:d:1865141

    Download full text from publisher

    File URL: https://www.mdpi.com/1999-5903/18/2/102/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1999-5903/18/2/102/
    Download Restriction: no

