IBFET: Index‐based features extraction technique for scalable code clone detection at file level granularity. (22nd October 2019)
- Record Type:
- Journal Article
- Title:
- IBFET: Index‐based features extraction technique for scalable code clone detection at file level granularity. (22nd October 2019)
- Main Title:
- IBFET: Index‐based features extraction technique for scalable code clone detection at file level granularity
- Authors:
- Akram, Junaid
Mumtaz, Majid
Luo, Ping
- Abstract:
- Summary: Many techniques have been developed over the years to detect code clones in software systems, in part to maintain security. These techniques often require the source code in order to compare the subject system against a very large data set of big code. This paper presents an index‐based features extraction technique (IBFET) that detects code clones at file level granularity and scales to billions of LOC. We performed preprocessing, indexing, and clone detection for more than 324 billion LOC in a Hadoop distributed environment; the approach is faster and more efficient than existing distributed indexing and clone detection techniques, and it detects all three types of clones. The MapReduce divide‐and‐conquer principle is used to count and retrieve similar features between different systems. We evaluated the execution time, scalability, precision, and recall of IBFET using the well‐known clone detection data sets IJaDataset and BigCloneBench, and compared the results with other state‐of‐the‐art tools. Our approach is fast, flexible, and scalable, provides accurate results, and can be deployed at large scale.
- Is Part Of:
- Software, practice & experience. Volume 50:Number 1(2020)
- Journal:
- Software, practice & experience
- Issue:
- Volume 50:Number 1(2020)
- Issue Display:
- Volume 50, Issue 1 (2020)
- Year:
- 2020
- Volume:
- 50
- Issue:
- 1
- Issue Sort Value:
- 2020-0050-0001-0000
- Page Start:
- 22
- Page End:
- 46
- Publication Date:
- 2019-10-22
- Subjects:
- big code -- clone detection -- code similarity detection -- plagiarism detection -- software reuse -- software security and maintenance
Computer software -- Periodicals
Computer programming -- Periodicals
Computer programs -- Periodicals
005.3
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/spe.2759
- Languages:
- English
- ISSNs:
- 0038-0644
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8321.453000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 12435.xml
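
The abstract describes indexing file-level features and using MapReduce to count and retrieve the features that different systems share. The record gives no algorithmic detail, so the following is only a minimal single-process sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the choice of identifier tokens as features, the Jaccard score, the threshold value, and all function names.

```python
import re
from collections import defaultdict
from itertools import combinations

# Assumption: a "feature" is any identifier-like token in a file.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*")


def extract_features(source):
    """Return the set of identifier-like tokens found in one source file."""
    return set(TOKEN_RE.findall(source))


def map_phase(files):
    """Map step: emit one (feature, file_id) pair per distinct feature of each file."""
    for file_id, source in files.items():
        for feature in extract_features(source):
            yield feature, file_id


def reduce_phase(pairs):
    """Reduce step: group file ids by feature, which yields an inverted index."""
    index = defaultdict(set)
    for feature, file_id in pairs:
        index[feature].add(file_id)
    return index


def shared_feature_counts(index):
    """Count, for every pair of files, how many indexed features they share."""
    counts = defaultdict(int)
    for file_ids in index.values():
        for a, b in combinations(sorted(file_ids), 2):
            counts[(a, b)] += 1
    return counts


def find_clone_candidates(files, threshold=0.7):
    """Return (file_a, file_b, score) for pairs whose Jaccard score meets the threshold."""
    index = reduce_phase(map_phase(files))
    sizes = {fid: len(extract_features(src)) for fid, src in files.items()}
    candidates = []
    for (a, b), shared in shared_feature_counts(index).items():
        union = sizes[a] + sizes[b] - shared
        score = shared / union
        if score >= threshold:
            candidates.append((a, b, score))
    return sorted(candidates)


if __name__ == "__main__":
    # Hypothetical toy corpus; file names and contents are made up.
    corpus = {
        "A.java": "int add(int a, int b) { return a + b; }",
        "B.java": "int add(int x, int y) { return x + y; }",
        "C.java": "void log(String msg) { print(msg); }",
    }
    for a, b, score in find_clone_candidates(corpus, threshold=0.4):
        print(a, b, round(score, 3))
```

Only files that share at least one indexed feature are ever compared, which is the property that makes an index attractive at scale. In a distributed setting the map and reduce functions would run as separate Hadoop jobs rather than as in-memory generators.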