A graph-theoretic approach for the detection of phishing webpages. Issue 95 (August 2020)
- Record Type:
- Journal Article
- Title:
- A graph-theoretic approach for the detection of phishing webpages. Issue 95 (August 2020)
- Main Title:
- A graph-theoretic approach for the detection of phishing webpages
- Authors:
- Tan, Choon Lin
Chiew, Kang Leng
Yong, Kelvin S.C.
Sze, San Nah
Abdullah, Johari
Sebastian, Yakub
- Abstract:
- Highlights: Leverage the website hyperlink structure to discover new inherent phishing patterns. Enhance phishing detection rate using novel features derived from the web graph. This is the first work that fully utilises graph features for phishing detection. Graph features attain accuracies up to 97.8% and outperform conventional features. Achieve immutability and robustness against evolving phishing schemes.
Abstract: Over the years, various technical means have been developed to protect Internet users from phishing attacks. To enrich the anti-phishing efforts, we capitalise on concepts from graph theories, and propose a set of novel graph features to improve the phishing detection accuracy. The initial phase of the proposed technique involved the extraction of hyperlinks in the webpage under scrutiny and fetching the corresponding neighbourhood webpages. During this process, the page linking data were collected, and used to construct a web graph which models the overall hyperlink and network structure of the webpage. From the web graph, graph measures were computed and extracted as graph features to derive a classifier for detecting phishing webpages. Experimental results show that the proposed graph features achieve an improved overall accuracy of 97.8% when C4.5 was utilised as classifier, outperforming the existing conventional features derived from the same data samples. Unlike conventional features, the proposed graph features leverage inherent phishing patterns that are only visible at a higher level of abstraction, thus making it robust and difficult to be evaded by direct manipulations on the webpage contents. Our proposed graph-based technique also shows promising results when benchmarked against a prominent phishing detection technique. Hence, the proposed technique is an important contribution to the existing anti-phishing research towards improving the detection performance.
- Is Part Of:
- Computers & security. Issue 95(2020)
- Journal:
- Computers & security
- Issue:
- Issue 95(2020)
- Issue Display:
- Volume 95, Issue 95 (2020)
- Year:
- 2020
- Volume:
- 95
- Issue:
- 95
- Issue Sort Value:
- 2020-0095-0095-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-08
- Subjects:
- Phishing detection -- Hyperlinks -- Web graph -- Graph features -- Machine learning
Computer security -- Periodicals
Electronic data processing departments -- Security measures -- Periodicals
005.805
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01674048
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cose.2020.101793
- Languages:
- English
- ISSNs:
- 0167-4048
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.781000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 13518.xml
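
The pipeline outlined in the abstract (extract hyperlinks, collect neighbourhood pages, build a web graph, compute graph measures) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `pages` dictionary standing in for fetched webpages, the `example.test` URLs, and the particular measures computed are all assumptions made for illustration, and the record does not list the features actually used in the paper. The C4.5 classification step is omitted.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href values of <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute URLs for every hyperlink found in html."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]


def add_edges(graph, source, html):
    """Add source -> target edges for each link in html (self-links skipped)."""
    for target in extract_links(source, html):
        if target != source:
            graph[source].add(target)
            graph.setdefault(target, set())


def build_web_graph(start_url, pages):
    """Build a directed graph from the start page and its direct neighbours.

    pages is a hypothetical dict (url -> html) standing in for fetched pages.
    """
    graph = {start_url: set()}
    add_edges(graph, start_url, pages.get(start_url, ""))
    for neighbour in list(graph[start_url]):
        add_edges(graph, neighbour, pages.get(neighbour, ""))
    return graph


def graph_features(graph, start_url):
    """Compute a few illustrative graph measures (assumed, not the paper's)."""
    n = len(graph)
    m = sum(len(targets) for targets in graph.values())
    host = urlparse(start_url).netloc
    external = sum(1 for url in graph if urlparse(url).netloc != host)
    return {
        "nodes": n,
        "edges": m,
        "density": m / (n * (n - 1)) if n > 1 else 0.0,
        "avg_out_degree": m / n,
        "external_ratio": external / n,
    }


# Toy data: two pages on one host, one link to another host.
pages = {
    "http://example.test/": '<a href="/login">Login</a><a href="http://other.test/">Other</a>',
    "http://example.test/login": '<a href="/">Home</a>',
}
start = "http://example.test/"
features = graph_features(build_web_graph(start, pages), start)
print(features)
```

In a full system, feature dictionaries like this one, computed for labelled phishing and legitimate pages, would form the training rows for a decision-tree classifier such as C4.5.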