CSSG: A cost‐sensitive stacked generalization approach for software defect prediction. (8th February 2021)
- Record Type:
- Journal Article
- Title:
- CSSG: A cost‐sensitive stacked generalization approach for software defect prediction. (8th February 2021)
- Main Title:
- CSSG: A cost‐sensitive stacked generalization approach for software defect prediction
- Authors:
- Eivazpour, Zeinab
Keyvanpour, Mohammad Reza
- Abstract:
- Summary: Classifying software artifacts as defect‐prone (DP) or non‐defect‐prone (NDP) during the testing phase helps minimize software business costs; this is a classification task in the software defect prediction (SDP) field. Machine learning methods are helpful for the task, although they face the challenge of imbalanced data distribution. This imbalance leads to serious misclassification of artifacts, which degrades the predictor's performance. Previously developed stacking ensemble methods do not consider cost when handling the class imbalance problem (CIP) over the training dataset in the SDP field. To bridge this research gap, in the cost‐sensitive stacked generalization (CSSG) approach we combine the stacking ensemble learning method with cost‐sensitive learning (CSL), since the purpose of CSL is to reduce misclassification costs. In CSSG, logistic regression (LR) and extremely randomized trees classifiers, in both cost‐sensitive and cost‐insensitive variants, are used as the final classifier of the stacking scheme. To evaluate the performance of CSSG, we use six performance measures. Several experiments are carried out to compare CSSG with some cost‐sensitive ensemble methods on 15 benchmark datasets with different imbalance levels. The results indicate that CSSG can be a more effective solution to the CIP than the other compared methods.
Abstract: We developed the CSSG ensemble approach to introduce cost‐sensitive learning into the stacking ensemble method, with the aim of minimizing misclassification costs. CSSG is compared with some cost‐sensitive ensemble methods on 15 benchmark datasets. The results indicate that CSSG can be a more effective solution for the class imbalance problem than the other compared methods.
- Is Part Of:
- Software testing, verification & reliability. Volume 31:Number 5(2021)
- Journal:
- Software testing, verification & reliability
- Issue:
- Volume 31:Number 5(2021)
- Issue Display:
- Volume 31, Issue 5 (2021)
- Year:
- 2021
- Volume:
- 31
- Issue:
- 5
- Issue Sort Value:
- 2021-0031-0005-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2021-02-08
- Subjects:
- class imbalance learning -- cost of misclassification -- cost‐sensitive learning -- data imbalance -- ensemble learning -- software defect prediction
Computer software -- Testing -- Periodicals
Computer software -- Verification -- Periodicals
Computer software -- Reliability -- Periodicals
005.14
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/stvr.1761
- Languages:
- English
- ISSNs:
- 0960-0833
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8321.457500
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 18259.xml