Software change‐proneness prediction through combination of bagging and resampling methods. Issue 12 (12th October 2018)
- Record Type:
- Journal Article
- Title:
- Software change‐proneness prediction through combination of bagging and resampling methods. Issue 12 (12th October 2018)
- Main Title:
- Software change‐proneness prediction through combination of bagging and resampling methods
- Authors:
- Zhu, Xiaoyan
He, Yueyang
Cheng, Long
Jia, Xiaolin
Zhu, Lei
- Abstract:
- Identifying the change‐prone parts of software could help managers and developers allocate maintenance resources and time effectively during the early phases of the software life cycle. Change‐proneness prediction at file level with binary classification methods makes such identification possible. Because change‐prone files frequently account for only a small part of all the files, the prediction performance of standard classification methods is unsatisfactory. In this paper, we employ imbalanced learning methods, including bagging, resampling, and especially their combination, to reduce the performance decrease of standard classifiers caused by the class imbalance problem in change‐proneness prediction. In addition, we propose a boxplot‐based partition method to provide more appropriate change‐proneness label designation for the training data. Eight open‐source Java projects are chosen in the empirical study to validate the effectiveness of the combination methods in change‐proneness prediction. The experimental results show that combining bagging with resampling can significantly improve prediction performance over bagging or resampling alone. Of all the combination methods employed, the combination of bagging with undersampling performs better than the others. Support vector machine is more effective as a base classifier than J48 and naive Bayes.
- Is Part Of:
- Journal of software. Volume 30:Issue 12(2018)
- Journal:
- Journal of software
- Issue:
- Volume 30:Issue 12(2018)
- Issue Display:
- Volume 30, Issue 12 (2018)
- Year:
- 2018
- Volume:
- 30
- Issue:
- 12
- Issue Sort Value:
- 2018-0030-0012-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2018-10-12
- Subjects:
- bagging -- change‐proneness prediction -- imbalanced data -- resampling
Software engineering -- Periodicals
Computer software -- Development -- Periodicals
Software maintenance -- Periodicals
005.1
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)2047-7481
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/smr.2111
- Languages:
- English
- ISSNs:
- 2047-7473
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 9132.xml