Prediction and causal analysis of defects in steel products: Handling nonnegative and highly overdispersed count data. (February 2020)
- Record Type:
- Journal Article
- Title:
- Prediction and causal analysis of defects in steel products: Handling nonnegative and highly overdispersed count data. (February 2020)
- Main Title:
- Prediction and causal analysis of defects in steel products: Handling nonnegative and highly overdispersed count data
- Authors:
- Zhang, Xinmin
Kano, Manabu
Tani, Masahiro
Mori, Junichi
Ise, Junji
Harada, Kohhei
- Abstract:
- In the steel industry, defects may occur during the manufacturing process. Thus, it is important to predict the occurrence of defects online in steel products and identify the causal variable that may lead to defects. However, the unique characteristics of the observed defect count data, such as nonnegative integers and high overdispersion, have posed some difficulties to the traditional probability models. To deal with this issue, the present work employs random forests to model and analyze the observed defect count data. Random forests are a nonlinear ensemble learning technique, which constructs several regression trees during the training phase and then predicts the output by averaging the predictions of each tree. Unlike the traditional probability models which are based on the specific distribution assumption, random forests are a non-parametric or distribution-free model. Furthermore, random forests can ensure the nonnegativity of the prediction, and thus it is suitable for defect count data modeling. In addition, partial dependence analysis in conjunction with the variable importance measure was used to identify the causal variable. The application results on the real steelmaking process have demonstrated that random forests outperform the PLS, SVR, Poisson, and NB methods in prediction accuracy. And the most influential variables identified by random forests are in line with operator experience.
Highlights: Prediction and causal analysis of defects in steel products were investigated. Defect data are characterized by nonnegative integers and high overdispersion. Random forest (RF) outperforms other linear/nonlinear methods in prediction accuracy. The most influential variables identified by RF are in line with operator experience.
- Is Part Of:
- Control engineering practice. Volume 95(2020)
- Journal:
- Control engineering practice
- Issue:
- Volume 95(2020)
- Issue Display:
- Volume 95, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 95
- Issue:
- 2020
- Issue Sort Value:
- 2020-0095-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-02
- Subjects:
- Virtual sensing -- Quality prediction -- Causal analysis -- Defects -- Steel plates -- Random forests
Automatic control -- Periodicals
629.89
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09670661
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.conengprac.2019.104258
- Languages:
- English
- ISSNs:
- 0967-0661
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3462.020000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12521.xml