An improved transfer adaptive boosting approach for mixed‐project defect prediction. Issue 10 (3rd June 2019)
- Record Type:
- Journal Article
- Title:
- An improved transfer adaptive boosting approach for mixed‐project defect prediction. Issue 10 (3rd June 2019)
- Main Title:
- An improved transfer adaptive boosting approach for mixed‐project defect prediction
- Authors:
- Gong, Lina
Jiang, Shujuan
Jiang, Li
- Abstract:
- Software defect prediction (SDP) has been a very important research topic in software engineering, since it can provide high‐quality results when given sufficient historical data of the project. Unfortunately, there are not abundant data to build the defect prediction model at the beginning of a project. For this scenario, one possible solution is to use data from other projects in the same company. However, using these data practically would get poor performance because of different distributional characteristics among projects. Also, software has more non‐defective instances than defective instances that may cause a significant bias towards defective instances. Considering these two problems, we propose an improved transfer adaptive boosting (ITrAdaBoost) approach for being given a small number of labeled data in the testing project. In our approach, ITrAdaBoost can not only employ the Matthews correlation coefficient (MCC) as the measure instead of accuracy rate but also use the asymmetric misclassification costs for non‐defective and defective instances. Extensive experiments on 18 public projects from four datasets indicate that: (a) our approach significantly outperforms state‐of‐the‐art cross‐project defect prediction (CPDP) approaches, and (b) our approach can obtain comparable prediction performances in contrast with within‐project prediction results. Consequently, the proposed approach can build an effective prediction model with a small number of labeled instances for mixed‐project defect prediction (MPDP).
- Is Part Of:
- Journal of software. Volume 31:Issue 10(2019)
- Journal:
- Journal of software
- Issue:
- Volume 31:Issue 10(2019)
- Issue Display:
- Volume 31, Issue 10 (2019)
- Year:
- 2019
- Volume:
- 31
- Issue:
- 10
- Issue Sort Value:
- 2019-0031-0010-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2019-06-03
- Subjects:
- class imbalance -- cross‐project -- mixed‐project -- software defect prediction -- transfer learning
Software engineering -- Periodicals
Computer software -- Development -- Periodicals
Software maintenance -- Periodicals
005.1
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)2047-7481
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/smr.2172
- Languages:
- English
- ISSNs:
- 2047-7473
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11923.xml