International Research and Academic scholar society

Phishing Website Detection through Machine Learning Algorithms: A Comparative Analysis


Page No: 11-29
Language: English
Authors: Ochuko Piserchia*
Received: 2025-10-06
Accepted: 2025-11-29
Published Date: 2025-12-08
Abstract:
Phishing is the attempt to acquire sensitive information, often for malicious reasons, by masquerading as a trustworthy entity in an electronic communication. Once victims access a phishing website, the attacker attempts to convince them to submit private information such as usernames, passwords, and credit card details, resulting in information theft. Despite growing awareness of phishing and its associated risks, and despite traditional countermeasures such as DNS filtering, blacklisting, and user awareness training, it remains a growing concern, costing millions of dollars each year. The only effective defense against these threats is accurate detection of phishing attempts. Machine learning techniques, a subset of artificial intelligence (AI), have shown significant success in detecting phishing websites compared with traditional methods, although effectiveness can vary depending on the approach deployed. This research addresses the problem by analyzing a phishing website dataset with six supervised algorithms. A feature selection investigation was then carried out on the most promising of the six algorithms, primarily using the filter method, and its outcome was compared with that of the wrapper method. In addition to the Accuracy and ROC (Receiver Operating Characteristic) curve performance metrics, we also considered MCC (Matthews Correlation Coefficient). The experiment showed that Random Forest is the best-performing algorithm, with an MCC score of 0.989 (97% accuracy). We also found that 5 of the 30 features are sufficient for classification with little or no reduction in performance.
Keywords: Phishing Detection, Machine Learning, Comparative Analysis, Random Forest, Feature Selection, Cybersecurity.
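
Since the abstract reports MCC alongside accuracy, a minimal Python sketch of how the two metrics are computed from a binary confusion matrix may help. The function names and the counts below are hypothetical illustrations, not values or code from the study; returning 0.0 when the MCC denominator is zero is a common convention assumed here.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Assumption: return 0.0 when the denominator is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: a classifier that labels every site as phishing
# on a set of 95 phishing and 5 legitimate sites.
always_phishing = dict(tp=95, tn=0, fp=5, fn=0)
acc_value = accuracy(**always_phishing)
mcc_value = mcc(**always_phishing)
```

With these hypothetical counts, accuracy is high because the classes are imbalanced, while MCC is zero because the classifier carries no information about the legitimate class. This is one reason a study might report MCC in addition to accuracy.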

Journal: IRASS Journal of Multidisciplinary Studies
ISSN(Online): 3049-0073
Publisher: IRASS Publisher
Frequency: Monthly
