A guided oversampling technique to improve the prediction of software fault-proneness for imbalanced data. Issue 2 (1st January 2012)
- Record Type:
- Journal Article
- Title:
- A guided oversampling technique to improve the prediction of software fault-proneness for imbalanced data. Issue 2 (1st January 2012)
- Main Title:
- A guided oversampling technique to improve the prediction of software fault-proneness for imbalanced data
- Authors:
- Shatnawi, Raed
Al-Sharif, Ziad
- Abstract:
- Fault-proneness is one of the most tackled quality factors in the field of software quality. Predicting the probability that classes are faulty provides necessary information to guide developers in their endeavour to improve software quality and to reduce the costs of testing and maintenance. The performance of fault prediction models suffers greatly from the imbalance of fault distribution, i.e., the majority of modules are not faulty whereas only a minority are faulty. The imbalanced distribution of faults affects the efficiency of prediction models greatly. In this paper, we discuss many oversampling techniques that are used to improve the performance of prediction models. We propose to guide the oversampling process using the fault content (i.e., the number of faults in a module). This study is conducted on a large object-oriented system – Eclipse. The proposed oversampling is tested on ten classifiers. The results of this work show that using fault content in sampling yields better prediction performance than other traditional oversampling techniques. The decision trees and nearest neighbours have shown outstanding performance whereas other classifiers have shown acceptable performance.
- Is Part Of:
- International journal of knowledge engineering and data mining. Volume 2:Issue 2/3(2012)
- Journal:
- International journal of knowledge engineering and data mining
- Issue:
- Volume 2:Issue 2/3(2012)
- Issue Display:
- Volume 2, Issue 2/3 (2012)
- Year:
- 2012
- Volume:
- 2
- Issue:
- 2/3
- Issue Sort Value:
- 2012-0002-NaN-0000
- Page Start:
- 200
- Page End:
- 214
- Publication Date:
- 2012-01-01
- Subjects:
- fault-proneness -- imbalanced data -- CK metrics -- data mining -- oversampling
Knowledge representation (Information theory) -- Periodicals
Data mining -- Periodicals
006.305
- Journal URLs:
- http://www.inderscience.com/browse/index.php?journalCODE=ijkedm
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 1755-2087
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 8730.xml