Identifying unusual commits on GitHub. Issue 1 (12th September 2017)
- Record Type:
- Journal Article
- Title:
- Identifying unusual commits on GitHub. Issue 1 (12th September 2017)
- Main Title:
- Identifying unusual commits on GitHub
- Authors:
- Goyal, Raman
Ferreira, Gabriel
Kästner, Christian
Herbsleb, James
- Abstract:
- Transparent environments and social‐coding platforms as GitHub help developers to stay abreast of changes during the development and maintenance phase of a project. Especially, notification feeds can help developers to learn about relevant changes in other projects. Unfortunately, transparent environments can quickly overwhelm developers with too many notifications, such that they lose the important ones in a sea of noise. Complementing existing prioritization and filtering strategies based on binary compatibility and code ownership, we develop an anomaly detection mechanism to identify unusual commits in a repository, which stand out with respect to other changes in the same repository or by the same developer. Among others, we detect exceptionally large commits, commits at unusual times, and commits touching rarely changed file types given the characteristics of a particular repository or developer. We automatically flag unusual commits on GitHub through a browser plug‐in. In an interactive survey with 173 active GitHub users, rating commits in a project of their interest, we found that, although our unusual score is only a weak predictor of whether developers want to be notified about a commit, information about unusual characteristics of a commit changes how developers regard commits. Our anomaly detection mechanism is a building block for scaling transparent environments.
Key Findings: We design an anomaly model based on commit characteristics to identify unusual commits using statistical learning methods in a repository and by a developer. We integrate anomaly scores and explanations into the GitHub web page using an implementation based on a browser plug‐in. We design an experimental setup and evaluate our anomaly model with 173 GitHub developers to learn about the importance of unusual commits in a repository of the participant's choice. …
- Is Part Of:
- Journal of software. Volume 30:Issue 1(2018)
- Journal:
- Journal of software
- Issue:
- Volume 30:Issue 1(2018)
- Issue Display:
- Volume 30, Issue 1 (2018)
- Year:
- 2018
- Volume:
- 30
- Issue:
- 1
- Issue Sort Value:
- 2018-0030-0001-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2017-09-12
- Subjects:
- anomaly detection -- information overload -- notification feeds -- software ecosystems -- transparent environments
Software engineering -- Periodicals
Computer software -- Development -- Periodicals
Software maintenance -- Periodicals
005.1
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)2047-7481
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/smr.1893
- Languages:
- English
- ISSNs:
- 2047-7473
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5718.xml