Wide spectrum feature selection (WiSe) for regression model building. (2nd February 2019)
- Record Type:
- Journal Article
- Title:
- Wide spectrum feature selection (WiSe) for regression model building. (2nd February 2019)
- Main Title:
- Wide spectrum feature selection (WiSe) for regression model building
- Authors:
- Rendall, Ricardo
Castillo, Ivan
Schmidt, Alix
Chin, Swee-Teng
Chiang, Leo H.
Reis, Marco
- Abstract:
- Highlights: A new two-stage feature selection methodology for high-dimensional regression problems is proposed. It is called wide spectrum feature selection for regression (WiSe). Pearson correlation, Spearman rank correlation, symmetrical uncertainty, and all their pairwise combinations are considered in the first stage. In the second stage, the first-pass screened features are further selected using regression-dependent approaches. Results confirm the increased predictive accuracy of models built after filtering out irrelevant features.
Abstract: Developing predictive models from industrial datasets implies the consideration of many possible predictor variables (features). Using all available features for data-driven modelling is not recommended, as most of them are expected to be irrelevant and their inclusion in the model may compromise robustness and accuracy. In this work, we present, test and compare a new two-stage feature selection method called wide spectrum feature selection for regression (WiSe). In the first stage, a combination of efficient bivariate filters analyzes linear and non-linear association patterns between predictors and responses, screening out clearly noisy features. In the second stage, the reduced set of retained features is subject to further selection in the scope of the predictive methods considered, optimizing their predictive performance. Three simulated datasets and an industrial case illustrate the effectiveness and benefits of applying WiSe to support model development in a wide range of high-dimensional regression problems.
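Illustrative note (not part of the catalogued record): the first-stage screening described in the abstract can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The equal-width binning used for symmetrical uncertainty, the `top_k` cut-off, the union rule for combining filters, and all function names are hypothetical choices made for illustration. The second, regression-dependent stage is not shown.

```python
import numpy as np

def _entropy(counts):
    # Shannon entropy (bits) of a histogram of counts.
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def pearson(x, y):
    # Absolute Pearson correlation: strength of linear association.
    return abs(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    # Absolute Spearman correlation: Pearson on ranks.
    # Ties are not averaged here, which is adequate for continuous data.
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return abs(np.corrcoef(rx, ry)[0, 1])

def symmetrical_uncertainty(x, y, bins=10):
    # SU = 2 * I(X;Y) / (H(X) + H(Y)), estimated after equal-width
    # discretisation (the binning scheme is an assumption of this sketch).
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    hx = _entropy(joint.sum(axis=1))
    hy = _entropy(joint.sum(axis=0))
    hxy = _entropy(joint.ravel())
    mi = hx + hy - hxy
    return 2.0 * mi / (hx + hy) if hx + hy > 0 else 0.0

def first_stage_screen(X, y, top_k=5):
    # Keep the union of the top_k features ranked by each bivariate filter
    # (the union rule and top_k are hypothetical, not taken from the paper).
    keep = set()
    for score in (pearson, spearman, symmetrical_uncertainty):
        s = np.array([score(X[:, j], y) for j in range(X.shape[1])])
        keep.update(np.argsort(s)[-top_k:].tolist())
    return sorted(keep)

# Synthetic example: feature 0 acts linearly, feature 3 quadratically.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))
y = 2.0 * X[:, 0] + 3.0 * X[:, 3] ** 2 + 0.1 * rng.normal(size=2000)
kept = first_stage_screen(X, y, top_k=3)
print(kept)
```

In this synthetic setting the quadratic feature has near-zero Pearson and Spearman correlation with the response, so it can only be retained through the symmetrical-uncertainty filter. This is the motivation the abstract gives for combining linear and non-linear association measures.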
- Is Part Of:
- Computers & chemical engineering. Volume 121(2019)
- Journal:
- Computers & chemical engineering
- Issue:
- Volume 121(2019)
- Issue Display:
- Volume 121, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 121
- Issue:
- 2019
- Issue Sort Value:
- 2019-0121-2019-0000
- Page Start:
- 99
- Page End:
- 110
- Publication Date:
- 2019-02-02
- Subjects:
- Feature selection -- Filtering methods -- Predictive analytics -- Effect sparsity -- Symmetrical uncertainty
Chemical engineering -- Data processing -- Periodicals
660.0285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00981354
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compchemeng.2018.10.005
- Languages:
- English
- ISSNs:
- 0098-1354
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.664000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 9620.xml