Author
Listed:
- Saif Safaa Shakir
(Department of Computer Engineering, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz 5166616471, Iran)
- Leyli Mohammad Khanli
(Department of Computer Engineering, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz 5166616471, Iran)
- Hojjat Emami
(Department of Computer Engineering, Faculty of Engineering, University of Bonab, Bonab 5551395133, Iran)
Abstract
Phishing attacks pose significant risks to security, drawing considerable attention from both security professionals and customers. Despite extensive research, the current phishing website detection mechanisms often fail to efficiently diagnose unknown attacks due to their poor performances in the feature selection stage. Many techniques suffer from overfitting when working with huge datasets. To address this issue, we propose a feature selection strategy based on a convolutional graph network, which utilizes a dataset containing both labels and features, along with hyperparameters for a Support Vector Machine (SVM) and a graph neural network (GNN). Our technique consists of three main stages: (1) preprocessing the data by dividing them into testing and training sets, (2) constructing a graph from pairwise feature distances using the Manhattan distance and adding self-loops to nodes, and (3) implementing a GraphSAGE model with node embeddings and training the GNN by updating the node embeddings through message passing from neighbors, calculating the hinge loss, applying the softmax function, and updating weights via backpropagation. Additionally, we compute the neighborhood random walk (NRW) distance using a random walk with restart to create an adjacency matrix that captures the node relationships. The node features are ranked based on gradient significance to select the top k features, and the SVM is trained using the selected features, with the hyperparameters tuned through cross-validation. We evaluated our model on a test set, calculating the performance metrics and validating the effectiveness of the PhishGNN dataset. Our model achieved a precision of 90.78%, an F1-score of 93.79%, a recall of 97%, and an accuracy of 93.53%, outperforming the existing techniques.
Suggested Citation
Saif Safaa Shakir & Leyli Mohammad Khanli & Hojjat Emami, 2025.
"Convolutional Graph Network-Based Feature Extraction to Detect Phishing Attacks,"
Future Internet, MDPI, vol. 17(8), pages 1-19, July.
Handle:
RePEc:gam:jftint:v:17:y:2025:i:8:p:331-:d:1709545
Download full text from publisher
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jftint:v:17:y:2025:i:8:p:331-:d:1709545. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help adding them by using this form .
If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .
Please note that corrections may take a couple of weeks to filter through
the various RePEc services.