A comparative study between PCR, PLSR, and LW-PLS on the predictive performance at different data splitting ratios. Issue 11 (2nd November 2022)
- Record Type:
- Journal Article
- Title:
- A comparative study between PCR, PLSR, and LW-PLS on the predictive performance at different data splitting ratios. Issue 11 (2nd November 2022)
- Main Title:
- A comparative study between PCR, PLSR, and LW-PLS on the predictive performance at different data splitting ratios
- Authors:
- Thien, Teck Fu
- Yeo, Wan Sieng
- Abstract:
- Principal component regression (PCR), partial least squares regression (PLSR), and locally weighted partial least squares (LW-PLS) models are supervised learning methods in which a labeled dataset is used to train the model. The split-sample validation is normally used to train these models where a dataset is split into training and testing datasets to develop and evaluate the model. However, a limited study is done to evaluate the prediction performance of PCR, PLSR, and LW-PLS models at the different data splitting ratios. Hence, to address this research gap, this submitted work is conducted to investigate the predictive performance of the abovementioned regression models at the different split sample ratios for the data. Meanwhile, this study also serves to determine the optimal splitting ratios for PCR, PLSR, and LW-PLS models via a simple data splitting method where a minimum of 50% of the entire dataset is allocated to train the model. The optimal split is determined by evaluating the root mean squared error, coefficient of determination, and error of approximation (Ea) for five case studies. For PCR, PLSR, and LW-PLS models, LW-PLS performed better in most of the case studies since it copes better with the nonlinear data. Among these best models in each case study, it was found that the split-sample ratios of above 70% of training data had allowed major improvements in terms of predictive performance as compared to their base scenarios which have the largest Ea values. …
- Is Part Of:
- Chemical engineering communications. Volume 209:Issue 11(2022)
- Journal:
- Chemical engineering communications
- Issue:
- Volume 209:Issue 11(2022)
- Issue Display:
- Volume 209, Issue 11 (2022)
- Year:
- 2022
- Volume:
- 209
- Issue:
- 11
- Issue Sort Value:
- 2022-0209-0011-0000
- Page Start:
- 1439
- Page End:
- 1456
- Publication Date:
- 2022-11-02
- Subjects:
- Data splitting -- locally weighted partial least square regression -- partial least square regression -- prediction -- principal component regression -- soft sensors
- Chemical engineering -- Periodicals
- 660.205
- Journal URLs:
- http://www.tandfonline.com/toc/gcec20/current
- http://www.tandfonline.com/
- DOI:
- 10.1080/00986445.2021.1957853
- Languages:
- English
- ISSNs:
- 0098-6445
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3143.030000
- British Library DSC - BLDSS-3PM
- British Library STI - ELD Digital store
- Ingest File:
- 23907.xml