Using software metrics for predicting vulnerable classes and methods in Java projects: A machine learning approach. Issue 3 (7th August 2020)
- Record Type:
- Journal Article
- Title:
- Using software metrics for predicting vulnerable classes and methods in Java projects: A machine learning approach. Issue 3 (7th August 2020)
- Main Title:
- Using software metrics for predicting vulnerable classes and methods in Java projects: A machine learning approach
- Authors:
- Sultana, Kazi Zakia
Anu, Vaibhav
Chong, Tai‐Yin
- Abstract:
- [Context] A software vulnerability becomes harmful for software when an attacker successfully exploits the insecure code and reveals the vulnerability. A single vulnerability in code can put the entire software at risk. Therefore, maintaining software security throughout the software life cycle is an important and at the same time challenging task for development teams. This can also leave the door open for vulnerable code being evolved during successive releases. In recent years, researchers have used software metrics‐based vulnerability prediction approaches to detect vulnerable code early and ensure secure code releases. Software metrics have been employed to predict vulnerability specifically in C/C++ and Java‐based systems. However, the prediction performance of metrics at different granularity levels (class level or method level) has not been analyzed. In this paper, we focused on metrics that are specific to lower granularity levels (Java classes and methods). Based on statistical analysis, we first identified a set of class‐level metrics and a set of method‐level metrics and then employed them as features in machine learning techniques to predict vulnerable classes and methods, respectively. This paper describes a comparative study on how our selected metrics perform at different granularity levels. Such a comparative study can help the developers in choosing the appropriate metrics (at the desired level of granularity).
[Objective] The goal of this research is to propose a set of metrics at two lower granularity levels and provide evidence for their usefulness during vulnerability prediction (which will help in maintaining secure code and ensure secure software evolution).
[Method] For four Java‐based open source systems (including two releases of Apache Tomcat), we designed and conducted experiments based on statistical tests to propose a set of software metrics that can be used for predicting vulnerable code components (i.e., vulnerable classes and methods). Next, we used our identified metrics as features to train supervised machine learning algorithms to classify Java code as vulnerable or non‐vulnerable.
[Result] Our study has successfully identified a set of class‐level metrics and a second set of method‐level metrics that can be useful from a vulnerability prediction standpoint. We achieved recall higher than 70% and precision higher than 75% in vulnerability prediction using our identified class‐level metrics as features of machine learning. Furthermore, method‐level metrics showed recall higher than 65% and precision higher than 80%.
Abstract: This paper proposes and empirically evaluates a suite of software metrics that can be used as a feature set to predict vulnerable code‐components at two levels of granularity: Java class‐level and method‐level. Software development teams can use the proposed suite of metrics to focus their security testing efforts on those code‐components that are predicted as vulnerable by the machine learning classifier, thereby reducing the overall testing time and effort.
- Is Part Of:
- Journal of software. Volume 33:Issue 3(2021)
- Journal:
- Journal of software
- Issue:
- Volume 33:Issue 3(2021)
- Issue Display:
- Volume 33, Issue 3 (2021)
- Year:
- 2021
- Volume:
- 33
- Issue:
- 3
- Issue Sort Value:
- 2021-0033-0003-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2020-08-07
- Subjects:
- software evolution -- software maintenance -- software metrics -- software security -- vulnerability prediction
Software engineering -- Periodicals
Computer software -- Development -- Periodicals
Software maintenance -- Periodicals
005.1
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)2047-7481
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/smr.2303
- Languages:
- English
- ISSNs:
- 2047-7473
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15973.xml