Binary Bat Algorithm for text feature selection in news events detection model using Markov clustering. Issue 1 (31st December 2022)
- Record Type:
- Journal Article
- Title:
- Binary Bat Algorithm for text feature selection in news events detection model using Markov clustering. Issue 1 (31st December 2022)
- Main Title:
- Binary Bat Algorithm for text feature selection in news events detection model using Markov clustering
- Authors:
- Al-Dyani, Wafa Zubair
Ahmad, Farzana Kabir
Kamaruddin, Siti Sakira
- Editors:
- Ventura, Sebastián
- Abstract:
- The Feature Selection (FS) phase is crucial in the Event Detection (ED) model. Several studies have captured the most informative features using various filter and wrapper FS methods. Recently, FS methods based on swarm intelligence algorithms have been employed to determine the relevant features. Nevertheless, ED from the sparse and high-dimensional feature space that results from a massive number of news documents with different text lengths is a challenging task. Such a feature space consists of redundant, irrelevant, and noisy data, which misguide the detection process and substantially affect the reliability of the ED model. Hence, this study proposes a novel Binary Bat Algorithm (BBA) and Markov Clustering Algorithm (MCL) to improve the performance of the ED model. To the best of our knowledge, BBA is employed for the first time in this study in the context of the ED field. The proposed method is tested on 10 benchmark datasets and 2 primary Facebook news datasets using the average of several evaluation metrics such as F-measure (F), Precision (PR), Recall (R), and Selected Feature Ratio (SFR). Comparative experiments against the basic MCL and binary versions of the Genetic Algorithm and Particle Swarm Optimization are implemented in this study. The empirical results proved that BBA-MCL outperforms other methods on most datasets based on F and PR metrics. Furthermore, the statistical results confirmed that the BBA-MCL FS method has significantly enhanced MCL performance with p-value = 0.003, by generating the most informative features. Ultimately, this work concludes that BBA-MCL obtains significant features and effectively detects real-world events from heterogeneous news text documents.
- Is Part Of:
- Cogent engineering. Volume 9:Issue 1(2022)
- Journal:
- Cogent engineering
- Issue:
- Volume 9:Issue 1(2022)
- Issue Display:
- Volume 9, Issue 1 (2022)
- Year:
- 2022
- Volume:
- 9
- Issue:
- 1
- Issue Sort Value:
- 2022-0009-0001-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-12-31
- Subjects:
- Binary Bat Algorithm (BBA) -- Markov Clustering (MCL) -- Feature Selection (FS) -- Event Detection (ED) -- wrapper methods -- heterogeneous news -- text clustering
Engineering -- Periodicals
Technology -- Periodicals
Engineering
Technology
Periodicals
620
- Journal URLs:
- http://bibpurl.oclc.org/web/73324
http://cogentoa.tandfonline.com/journal/oaen20
http://www.tandfonline.com/toc/oaen20/1/1
http://www.tandfonline.com/
http://cogentoa.tandfonline.com/journal/oaps20
- DOI:
- 10.1080/23311916.2021.2010923
- Languages:
- English
- ISSNs:
- 2331-1916
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 26174.xml