Contra-KD: A Lightweight Transformer Model for Malicious URL Detection with Contrastive Representation and Model Distillation

Authors

Zheng You Lim
(Centre for Advanced Analytics, CoE for Artificial Intelligence, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka 75450, Malaysia)
Ying Han Pang
(Centre for Advanced Analytics, CoE for Artificial Intelligence, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka 75450, Malaysia
Faculty of Information Science and Technology, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka 75450, Malaysia)
Edwin Chan Kah Jun
(Faculty of Information Science and Technology, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka 75450, Malaysia)
Shih Yin Ooi
(Centre for Advanced Analytics, CoE for Artificial Intelligence, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka 75450, Malaysia
Faculty of Information Science and Technology, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka 75450, Malaysia)
Goh Fan Ling
(FINEXT Sdn Bhd, B-23A-7, Vertical Business Suite Avenue 3 Bangsar South City, No 8, Jalan Kerinchi, Kuala Lumpur 59200, Malaysia)

Abstract

Malicious URLs are a serious cybersecurity threat, serving as pathways to phishing, malware, and other attacks. Although transformer-based models have demonstrated good performance in malicious URL detection, their high computational cost and latency make them impractical for deployment in real-time or resource-constrained systems. Lightweight models obtained through knowledge distillation (KD) tend to be efficient but are often not discriminative enough to distinguish malicious from benign URLs with subtle lexical overlaps, particularly on imbalanced datasets. To address these issues, we propose Contra-KD, a lightweight transformer model that combines contrastive learning (CL) and KD. The framework imposes structured embedding matching, allowing the student model to learn more meaningful and generalizable representations. Contra-KD uses a compact 6-layer student transformer based on ELECTRA to reduce the parameter count and can achieve more than 90% computational fidelity with high accuracy. In this scheme, CL improves feature discrimination by clustering semantically similar URLs and separating dissimilar ones. This limits confusion, especially when URLs share common lexical traits or in the presence of adversarial obfuscation. On a large-scale, publicly available Kaggle dataset of 651,191 URLs under imbalanced conditions, Contra-KD achieves 99.05% accuracy, 99.96% ROC-AUC, and 98.18% MCC, outperforming both lightweight and transformer-based counterparts. In summary, Contra-KD is a small, computationally efficient transformer architecture that delivers stable detection performance.
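
The abstract does not state the loss functions used, so the following is only an illustrative sketch of how a distillation term and a contrastive term might be combined when training a student model. It assumes a temperature-scaled KL divergence for KD and a supervised contrastive loss for CL; both are standard substitutes chosen here for illustration, not the authors' confirmed formulation. All function names and the parameters `alpha`, `temperature`, and `tau` are hypothetical.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: the usual temperature-scaled distillation term.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def supervised_contrastive_loss(embeddings, labels, tau=0.1):
    """Supervised contrastive loss over one batch of embeddings.

    Same-label embeddings are pulled together and different-label
    embeddings are pushed apart. Anchors without a positive are skipped.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Cosine similarity to every other sample, divided by temperature tau.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / tau
            for j in range(n) if j != i
        }
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def combined_loss(student_logits, teacher_logits, embeddings, labels,
                  alpha=0.5, temperature=2.0, tau=0.1):
    """Hypothetical weighted sum of the KD and contrastive terms."""
    kd = sum(
        kd_loss(s, t, temperature)
        for s, t in zip(student_logits, teacher_logits)
    ) / len(labels)
    cl = supervised_contrastive_loss(embeddings, labels, tau)
    return alpha * kd + (1.0 - alpha) * cl
```

In this sketch, `student_logits` and `teacher_logits` are per-URL class scores from the two models, and `embeddings` are the student's per-URL representation vectors. The weighting `alpha` is a free choice here; the abstract gives no value for it.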

Suggested Citation

Zheng You Lim & Ying Han Pang & Edwin Chan Kah Jun & Shih Yin Ooi & Goh Fan Ling, 2026. "Contra-KD: A Lightweight Transformer Model for Malicious URL Detection with Contrastive Representation and Model Distillation," Future Internet, MDPI, vol. 18(3), pages 1-20, March.

Handle: RePEc:gam:jftint:v:18:y:2026:i:3:p:157-:d:1896771
