A Naive Bayes approach for URL classification with supervised feature selection and rejection framework. (15th January 2018)
- Record Type:
- Journal Article
- Title:
- A Naive Bayes approach for URL classification with supervised feature selection and rejection framework. (15th January 2018)
- Main Title:
- A Naive Bayes approach for URL classification with supervised feature selection and rejection framework
- Authors:
- Rajalakshmi, R.
Aravindan, Chandrabose
- Abstract:
- Web page classification has become a challenging task due to the exponential growth of the World Wide Web. Uniform Resource Locator (URL)‐based web page classification systems play an important role, but high accuracy may not be achievable as URL contains minimal information. Nevertheless, URL‐based classifiers along with rejection framework can be used as a first‐level filter in a multistage classifier, and a costlier feature extraction from contents may be done in later stages. However, noisy and irrelevant features present in URL demand feature selection methods for URL classification. Therefore, we propose a supervised feature selection method by which relevant URL features are identified using statistical methods. We propose a new feature weighting method for a Naive Bayes classifier by embedding the term goodness obtained from the feature selection method. We also propose a rejection framework to the Naive Bayes classifier by using posterior probability for determining the confidence score. The proposed method is evaluated on the Open Directory Project and WebKB data sets. Experimental results show that our method can be an effective first‐level filter. McNemar tests confirm that our approach significantly improves the performance.
- Is Part Of:
- Computational intelligence. Volume 34:Number 1(2018)
- Journal:
- Computational intelligence
- Issue:
- Volume 34:Number 1(2018)
- Issue Display:
- Volume 34, Issue 1 (2018)
- Year:
- 2018
- Volume:
- 34
- Issue:
- 1
- Issue Sort Value:
- 2018-0034-0001-0000
- Page Start:
- 363
- Page End:
- 396
- Publication Date:
- 2018-01-15
- Subjects:
- feature selection -- Naive Bayes classifier -- rejection framework -- URL classification
Artificial intelligence -- Periodicals
Computational linguistics -- Periodicals
006.3
- Journal URLs:
- http://www.blackwellpublishing.com/journal.asp?ref=0824-7935&site=1
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/coin.12158
- Languages:
- English
- ISSNs:
- 0824-7935
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.595000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 5930.xml