Automatic ensemble feature selection using fast non-dominated sorting. Issue 100 (September 2021)
- Record Type:
- Journal Article
- Title:
- Automatic ensemble feature selection using fast non-dominated sorting. Issue 100 (September 2021)
- Main Title:
- Automatic ensemble feature selection using fast non-dominated sorting
- Authors:
- Abasabadi, Sedighe
Nematzadeh, Hossein
Motameni, Homayun
Akbari, Ebrahim
- Abstract:
- Feature selection refers to selecting optimal feature subset with effective data preprocessing policy in making high dimensional data for diverse pattern recognition problems. The aims of feature selection are enhancing accuracy, improving the evaluation performance, and finding the smallest effective feature subset. In this study, ensemble feature selection method is adopted based on an assumption indicating that a combination of several feature selection methods obtains more robust results than any individual feature selection method. Accordingly, when carrying out ensemble feature selection, a combinational method should be used to combine rankings of features from diverse algorithms into an individual rank for each feature. It is also required to set a threshold to acquire a functional subset of features. In this work, a three-step ensemble feature selection technique called Automatic Thresholding Feature Selection (ATFS) is proposed. The first step involves diversity generation where multiple rankers are applied to each dataset to generate different feature rankings. Second, output rankings of individual selectors are combined using fast non-dominated sorting that is a combinational method empowering the proposed ensemble with automatic thresholding capability. Third, feature sets are generated to obtain the optimal feature set. Additionally, a new filter method called Sorted Label Interference (SLI) is proposed based on interference between class labels. Both SLI and ATFS are applicable to binary datasets. The performance of SLI and ATFS is at least comparable and often better than the performance of individual rankers and existing ensemble methods. The obtained results also show that the use of ATFS-generated threshold improves not only the performance of ATFS and SLI, but also the performance of other filters and combinational methods.
Highlights: We proposed ATFS, an ensemble feature selection method, which uses non-dominated sorting as combination method. ATFS finds the threshold value automatically and selects informative features for each dataset. We proposed SLI, a new filter method based on interference between class labels, with capability of dealing with high dimensional datasets. …
- Is Part Of:
- Information systems. Issue 100(2021)
- Journal:
- Information systems
- Issue:
- Issue 100(2021)
- Issue Display:
- Volume 100, Issue 100 (2021)
- Year:
- 2021
- Volume:
- 100
- Issue:
- 100
- Issue Sort Value:
- 2021-0100-0100-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-09
- Subjects:
- Ensemble feature selection -- Fast non-dominated sorting -- Automatic threshold
Database management -- Periodicals
Electronic data processing -- Periodicals
Bases de données -- Gestion -- Périodiques
Informatique -- Périodiques
Database management
Electronic data processing
Periodicals
005.7
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064379
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.is.2021.101760
- Languages:
- English
- ISSNs:
- 0306-4379
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4496.367300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 17090.xml
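
- Illustrative note:
- The abstract describes combining several feature rankings with fast non-dominated sorting. The sketch below shows that general technique only and is not the authors' ATFS implementation. It assumes each feature is described by its rank under each ranker (lower is better), and it assumes, purely for illustration, that the first non-dominated front could serve as the automatically thresholded subset; the abstract does not state how the threshold is derived. The names `dominates`, `fast_non_dominated_sort`, and the sample `ranks` are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b under every ranker and
    strictly better under at least one (lower rank is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def fast_non_dominated_sort(ranks: List[Sequence[float]]) -> List[List[int]]:
    """Partition feature indices into successive non-dominated fronts.

    ranks[i] holds the rank of feature i under each ranker.
    """
    n = len(ranks)
    dominated = [[] for _ in range(n)]  # features that feature i dominates
    count = [0] * n                     # number of features dominating i
    fronts: List[List[int]] = [[]]

    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(ranks[i], ranks[j]):
                dominated[i].append(j)
            elif dominates(ranks[j], ranks[i]):
                count[i] += 1
        if count[i] == 0:
            fronts[0].append(i)  # nothing dominates i: first front

    k = 0
    while fronts[k]:
        next_front = []
        for i in fronts[k]:
            for j in dominated[i]:
                count[j] -= 1
                if count[j] == 0:
                    next_front.append(j)
        fronts.append(next_front)
        k += 1
    return fronts[:-1]  # drop the trailing empty front


# Made-up ranks of five features under three hypothetical rankers.
ranks = [(1, 2, 1), (2, 1, 3), (3, 3, 2), (4, 5, 4), (5, 4, 5)]
fronts = fast_non_dominated_sort(ranks)
selected = fronts[0]  # assumption: first front taken as the selected subset
```

With these made-up ranks, features 0 and 1 are each best under at least one ranker and neither dominates the other, so both fall in the first front without any hand-set cutoff.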