A novel combined dynamic ensemble selection model for imbalanced data to detect COVID-19 from complete blood count. (November 2021)
- Record Type:
- Journal Article
- Title:
- A novel combined dynamic ensemble selection model for imbalanced data to detect COVID-19 from complete blood count. (November 2021)
- Main Title:
- A novel combined dynamic ensemble selection model for imbalanced data to detect COVID-19 from complete blood count
- Authors:
- Wu, Jiachao
Shen, Jiang
Xu, Man
Shao, Minglai
- Abstract:
- Highlights: A combined dynamic ensemble selection (DES) method is proposed for imbalanced data. The combined DES method is applied to detect COVID-19 from complete blood count. SMOTE-ENN is used to balance data and remove noise. Hybrid multiple clustering and bagging are used to generate candidate classifiers for DES. The proposed combined DES method outperforms some advanced algorithms.
Abstract: Background: As blood testing is radiation-free, low-cost and simple to operate, some researchers use machine learning to detect COVID-19 from blood test data. However, few studies take into consideration the imbalanced data distribution, which can impair the performance of a classifier. Method: A novel combined dynamic ensemble selection (DES) method is proposed for imbalanced data to detect COVID-19 from complete blood count. This method combines data preprocessing and improved DES. Firstly, we use the hybrid synthetic minority over-sampling technique and edited nearest neighbor (SMOTE-ENN) to balance data and remove noise. Secondly, in order to improve the performance of DES, a novel hybrid multiple clustering and bagging classifier generation (HMCBCG) method is proposed to reinforce the diversity and local regional competence of candidate classifiers. Results: The experimental results based on three popular DES methods show that HMCBCG performs better than bagging alone. HMCBCG+KNE obtains the best performance for COVID-19 screening with 99.81% accuracy, 99.86% F1, 99.78% G-mean and 99.81% AUC. Conclusion: Compared to other advanced methods, our combined DES model can improve accuracy, G-mean, F1 and AUC of COVID-19 screening.
- Is Part Of:
- Computer methods and programs in biomedicine. Volume 211(2021)
- Journal:
- Computer methods and programs in biomedicine
- Issue:
- Volume 211(2021)
- Issue Display:
- Volume 211, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 211
- Issue:
- 2021
- Issue Sort Value:
- 2021-0211-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-11
- Subjects:
- COVID-19 screening -- Imbalanced data -- Dynamic ensemble selection -- Hybrid multiple clustering and bagging -- Candidate classifier generation
Medicine -- Computer programs -- Periodicals
Biology -- Computer programs -- Periodicals
Computers -- Periodicals
Medicine -- Periodicals
Médecine -- Logiciels -- Périodiques
Biologie -- Logiciels -- Périodiques
Biology -- Computer programs
Medicine -- Computer programs
Periodicals
Electronic journals
610.28
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01692607
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cmpb.2021.106444
- Languages:
- English
- ISSNs:
- 0169-2607
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.095000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20051.xml
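
The abstract in this record reports accuracy, F1 and G-mean, and notes that imbalanced data can impair a classifier. As a rough illustration of why G-mean is reported alongside accuracy, the sketch below computes these metrics from a confusion matrix using their standard definitions. The function name and the counts are hypothetical and are not taken from the article; AUC is omitted because it needs ranked scores rather than counts.

```python
import math


def screening_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration only; not the article's code.
    """
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "f1": f1,
        "g_mean": g_mean,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Made-up imbalanced example: 50 positives, 950 negatives.
# The classifier finds only 10 of the 50 positives.
m = screening_metrics(tp=10, fp=5, tn=945, fn=40)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts the accuracy is 0.955 while the G-mean is below 0.5, because G-mean is the geometric mean of the two per-class recalls and so penalises poor recall on the minority class.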