TY - JOUR
T1 - Context-aware embeddings for robust multiclass fraudulent URL detection in online social platforms
AU - Afzal, Sara
AU - Asim, Muhammad
AU - Beg, Mirza Omer
AU - Baker, Thar
AU - Awad, Ali Ismail
AU - Shamim, Nouman
N1 - Publisher Copyright: © 2024 The Author(s)
PY - 2024/10
Y1 - 2024/10
N2 - The current ubiquity of online social networks (OSNs) cannot be overstated, and they have over 4.8 billion users worldwide. These platforms have become integrated into modern life, representing an important means of communication and information sharing. However, this widespread popularity has also drawn the attention of cybercriminals, who seek to exploit OSNs using deceptive Uniform Resource Locators (URLs) as their weapons of choice. Conventional URL-classification methods, which rely on post-access features or static analysis, face significant limitations; they struggle to keep pace with the ever-evolving tactics of cybercriminals, and they often lack the granularity required for precise URL categorization. The methodology proposed herein takes a different path, leveraging the power of an artificial neural network (ANN) in tandem with Bidirectional Encoder Representations from Transformers (BERT) to extract contextual embeddings from URLs. By combining the cutting-edge capabilities of ANNs and BERT, we introduce an efficient approach to safeguarding OSN users from the insidious threats lurking behind deceptive URLs by classifying them into five distinct categories: benign, defacement, phishing, malware, and spam. The proposed approach was found to achieve an impressive accuracy rate of 98.0%, surpassing the previous best of 97.92%. This technique thus has the potential to serve as a crucial defense mechanism for the billions of individuals who rely on OSNs for their social and informational needs.
KW - Artificial neural networks
KW - BERT embeddings
KW - Securing online social networks
KW - URL classification
UR - http://www.scopus.com/inward/record.url?scp=85200010949&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85200010949&partnerID=8YFLogxK
U2 - 10.1016/j.compeleceng.2024.109494
DO - 10.1016/j.compeleceng.2024.109494
M3 - Article
AN - SCOPUS:85200010949
SN - 0045-7906
VL - 119
JO - Computers and Electrical Engineering
JF - Computers and Electrical Engineering
M1 - 109494
ER -