Leveraging machine learning for software redocumentation—A comprehensive comparison of methods in practice. (10th November 2020)
- Record Type:
- Journal Article
- Title:
- Leveraging machine learning for software redocumentation—A comprehensive comparison of methods in practice. (10th November 2020)
- Main Title:
- Leveraging machine learning for software redocumentation—A comprehensive comparison of methods in practice
- Authors:
- Geist, Verena
Moser, Michael
Pichler, Josef
Santos, Rodolfo
Wieser, Volkmar
- Other Names:
- Bishop Judith guestEditor.
Cooper Kendra M.L. guestEditor.
Kim Moonzoo guestEditor.
Koziolek Heiko guestEditor.
- Abstract:
- Source code comments contain key information about the underlying software system. Many redocumentation approaches, however, cannot exploit this valuable source of information. This is mainly due to the fact that not all comments have the same goals and target audience and can therefore only be used selectively for redocumentation. Performing a required classification manually, for example, in the form of heuristics, is usually time‐consuming and error‐prone and strongly dependent on programming languages and guidelines of concrete software systems. By leveraging machine learning (ML), it should be possible to classify comments and thus transfer valuable information from the source code into documentation with less effort but the same quality. We applied classical ML techniques but also deep learning (DL) approaches to legacy systems by transferring source code comments into meaningful representations using, for example, word embeddings but also novel approaches using quick response codes or a special character‐to‐image encoding. The results were compared with industry‐strength heuristic classification. As a result, we found that ML outperforms the heuristics in number of errors and less effort, that is, we finally achieve an accuracy of more than 95% for an image‐based DL network and even over 96% for a traditional approach using a random forest classifier.
- Is Part Of:
- Software, practice & experience. Volume 51:Number 4(2021)
- Journal:
- Software, practice & experience
- Issue:
- Volume 51:Number 4(2021)
- Issue Display:
- Volume 51, Issue 4 (2021)
- Year:
- 2021
- Volume:
- 51
- Issue:
- 4
- Issue Sort Value:
- 2021-0051-0004-0000
- Page Start:
- 798
- Page End:
- 823
- Publication Date:
- 2020-11-10
- Subjects:
- comment classification -- heuristics -- legacy systems -- machine learning -- QR code -- software system documentation
Computer software -- Periodicals
Computer programming -- Periodicals
Computer programs -- Periodicals
005.3
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/spe.2933
- Languages:
- English
- ISSNs:
- 0038-0644
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8321.453000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 15976.xml