Investigating diversity and impact of the popularity metrics for ranking software packages. Issue 9 (31st March 2020)
- Record Type:
- Journal Article
- Title:
- Investigating diversity and impact of the popularity metrics for ranking software packages. Issue 9 (31st March 2020)
- Main Title:
- Investigating diversity and impact of the popularity metrics for ranking software packages
- Authors:
- Saini, Munish
Verma, Rohan
Singh, Antarpuneet
Chahal, Kuljit Kaur
- Abstract:
- Context: Community‐based collaborative approach in open source software paradigm promotes reuse of existing software packages. There are several repositories (e.g., npm) for packages and have their own set of metrics for ranking. Objective: This study explores the diversity of different popularity metrics and also the relationship between popularity metrics and development activity of the packages. Another aim is to create a package popularity index by aggregating a set of noncollinear popularity metrics. Method: Using 195 K packages from different repositories, we investigated the correlation between different popularity metrics. K‐medoids algorithm helped to identify packages with different levels of popularity. Random forests method is utilized to create the package popularity index. Lastly, we used scikit‐learn implementation for determining feature importance in the model. Results: Popularity metrics of the GitHub platform are very strongly correlated (R ≥ 0.85) for highly popular packages. Popular packages have high‐development activity. However, the number of downloads of a package does not associate with development activity. Not all the metrics are important for determining popularity of a software package. Conclusion: This study provides practical guidelines to understand important metrics to determine the popularity of software packages. Researchers should focus on non‐collinear metrics, thereby avoiding similar metrics while aggregating for building models.
- Is Part Of:
- Journal of software. Volume 32: Issue 9 (2020)
- Journal:
- Journal of software
- Issue:
- Volume 32: Issue 9 (2020)
- Issue Display:
- Volume 32, Issue 9 (2020)
- Year:
- 2020
- Volume:
- 32
- Issue:
- 9
- Issue Sort Value:
- 2020-0032-0009-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2020-03-31
- Subjects:
- cluster analysis -- development activity -- metrics -- open source software -- popularity -- random forests -- software packages
Software engineering -- Periodicals
Computer software -- Development -- Periodicals
Software maintenance -- Periodicals
005.1
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)2047-7481
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/smr.2265
- Languages:
- English
- ISSNs:
- 2047-7473
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 13988.xml