The impact of indexing approaches on Arabic text classification. (April 2017)
- Record Type:
- Journal Article
- Title:
- The impact of indexing approaches on Arabic text classification. (April 2017)
- Main Title:
- The impact of indexing approaches on Arabic text classification
- Authors:
- Al-Badarneh, Amer
Al-Shawakfa, Emad
Bani-Ismail, Basel
Al-Rababah, Khaleel
Shatnawi, Safwan
- Abstract:
- This paper investigates the impact of using different indexing approaches (full-word, stem, and root) when classifying Arabic text. In this study, the naïve Bayes classifier is used to construct multinomial classification models, which are evaluated using stratified k-fold cross-validation (k ranges from 2 to 10). The study uses a corpus of 1000 normalized Arabic documents. The results of one experiment show significant accuracy improvements when the full-word form is used in most k-folds. Further experiments show that the classifier achieves its highest accuracy at eight folds, with a 7/8–1/8 train–test ratio, regardless of the indexing approach used. Overall, the classifier achieves a maximum micro-average accuracy of 99.36% with either the full-word form or the stem form. This indicates that the stem is the better choice when classifying Arabic text, because it makes the corpus dataset smaller, which improves both processing time and storage utilization, while still achieving the highest level of accuracy.
- Is Part Of:
- Journal of information science. Volume 43:Number 2(2017)
- Journal:
- Journal of information science
- Issue:
- Volume 43:Number 2(2017)
- Issue Display:
- Volume 43, Issue 2 (2017)
- Year:
- 2017
- Volume:
- 43
- Issue:
- 2
- Issue Sort Value:
- 2017-0043-0002-0000
- Page Start:
- 159
- Page End:
- 173
- Publication Date:
- 2017-04
- Subjects:
- Bayesian classifier -- cross-validation -- statistical classifier -- text categorization -- text classification -- text retrieval
Information science -- Periodicals
Information science
Periodicals
020.5
- Journal URLs:
- http://jis.sagepub.com/archive/
http://www.ingenta.com/journals/browse/bks/jis?mode=direct
http://www.uk.sagepub.com/home.nav
http://firstsearch.oclc.org
http://firstsearch.oclc.org/journal=0165-5515;screen=info;ECOIP
- DOI:
- 10.1177/0165551515625030
- Languages:
- English
- ISSNs:
- 0165-5515
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 10491.xml
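
The evaluation procedure described in the abstract (a multinomial naïve Bayes classifier scored by stratified k-fold cross-validation and a pooled, micro-averaged accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy English corpus, the function names, the round-robin fold assignment, and the add-one (Laplace) smoothing are all assumptions made for illustration, and the documents are assumed to be already tokenized into whichever index terms (full words, stems, or roots) are being compared.

```python
import math
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Split document indices into k folds with roughly equal class proportions."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)  # deal each class round-robin over the folds
    return folds

def train_nb(docs, labels):
    """Collect the counts needed by a multinomial naive Bayes model."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    vocab = set()
    for tokens, y in zip(docs, labels):
        term_counts[y].update(tokens)
        vocab.update(tokens)
    return class_counts, term_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log-posterior (add-one smoothing, assumed)."""
    class_counts, term_counts, vocab = model
    n_docs = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y, count in class_counts.items():
        total = sum(term_counts[y].values())
        score = math.log(count / n_docs)
        for t in tokens:
            if t in vocab:  # terms never seen in training are ignored
                score += math.log((term_counts[y][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = y, score
    return best

def cross_validate(docs, labels, k):
    """Micro-averaged accuracy: correct predictions pooled over all k test folds."""
    correct = 0
    for test_idx in stratified_folds(labels, k):
        held_out = set(test_idx)
        train_docs = [d for i, d in enumerate(docs) if i not in held_out]
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        model = train_nb(train_docs, train_labels)
        correct += sum(predict_nb(model, docs[i]) == labels[i] for i in test_idx)
    return correct / len(docs)

# Hypothetical toy corpus; each document is a list of index terms.
docs = [s.split() for s in [
    "team goal match", "goal team player", "match player team", "player goal match",
    "bank market price", "market price trade", "bank trade price", "trade bank market",
]]
labels = ["sport"] * 4 + ["economy"] * 4

for k in range(2, 5):
    print(k, cross_validate(docs, labels, k))
```

To compare indexing approaches in this setting, the same procedure would be run once per token representation, with the stemming or root extraction step applied before the documents reach `cross_validate`.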