An efficient feature selection method based on improved elephant herding optimization to classify high‐dimensional biomedical data. Issue 8 (16th May 2022)
- Record Type:
- Journal Article
- Title:
- An efficient feature selection method based on improved elephant herding optimization to classify high‐dimensional biomedical data. Issue 8 (16th May 2022)
- Main Title:
- An efficient feature selection method based on improved elephant herding optimization to classify high‐dimensional biomedical data
- Authors:
- Singh, Harpreet
Singh, Birmohan
Kaur, Manpreet
- Abstract:
- Machine learning algorithms are widely applied to biomedical data to classify the samples of patients and healthy persons. The high‐dimensional biomedical datasets contain a large number of features to represent a sample. However, such datasets may have redundant, noisy and irrelevant features, influencing machine learning algorithms' classification performance and increasing computation overhead. Therefore, data normalization and feature selection techniques are introduced to reduce the impact of noisy features and accurately identify the patterns in features to improve the predictive accuracy of diagnosis. This work proposes an efficient feature selection and parameter optimization method to classify high‐dimensional biomedical datasets. In the proposed method, a binary version of the improved elephant herding optimization (IEHO) algorithm is introduced to select features and optimize the C and γ parameters of the support vector machine classifier. Further, four variants of the proposed method are presented based on data normalization techniques: Z‐score normalization (ZN), Pareto‐scaling (PS), tanh‐based normalization (TN), and variant of tanh‐based normalization (VTN). The proposed variants reduce the dominance of noisy features and explore the feature space to obtain the optimal feature set that maximizes the classification accuracy and minimizes the time complexity. The performance of the proposed variants is evaluated on 15 high‐dimensional biomedical datasets. Friedman's mean rank test is applied to check the statistical difference between proposed variants. Results show that the proposed Z‐score normalization‐IEHO (ZN‐IEHO) variant performed significantly better than the other proposed variants for classification accuracy, false‐positive rate and f‐score metrics. Moreover, the performance of the proposed ZN‐IEHO variant is compared with 18 state‐of‐the‐art feature selection methods. The experimental results expressed the effectiveness of the proposed ZN‐IEHO variant in finding the best combination of features and parameters to classify the biomedical datasets accurately.
- Is Part Of:
- Expert systems. Volume 39:Issue 8(2022)
- Journal:
- Expert systems
- Issue:
- Volume 39:Issue 8(2022)
- Issue Display:
- Volume 39, Issue 8 (2022)
- Year:
- 2022
- Volume:
- 39
- Issue:
- 8
- Issue Sort Value:
- 2022-0039-0008-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2022-05-16
- Subjects:
- biomedical data -- classification -- feature selection -- improved elephant herding optimization
Expert systems (Computer science)
006.33
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1111/(ISSN)1468-0394
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/exsy.13038
- Languages:
- English
- ISSNs:
- 0266-4720
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3842.004000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 23424.xml