An in‐depth study of the effects of methods on the dataset selection of public development projects. Issue 2 (25th November 2021)
- Record Type:
- Journal Article
- Title:
- An in‐depth study of the effects of methods on the dataset selection of public development projects. Issue 2 (25th November 2021)
- Main Title:
- An in‐depth study of the effects of methods on the dataset selection of public development projects
- Authors:
- Cheng, Can
Li, Bing
Li, Zengyang
Liang, Peng
Yang, Xu
- Abstract:
- Public development projects (PDPs) and documented public development projects (DPDPs) are two types of projects that can provide valuable information on how developers and users participate in OSS projects. However, it is hard for researchers to effectively select PDPs and DPDPs due to the lack of specific project selection methods for these two types of projects. To address this problem, a standard dataset was labelled and the base line methods (i.e. selecting projects according to a single feature like star number) under 60 configurations and the machine learning methods under 18 configurations were tested to identify the best configurations in precision and F‐measure for selecting PDPs and DPDPs. The results show that (1) to select PDPs or DPDPs with a high precision, the base line method is the best with precision of 0.877 (PDPs) and 0.831 (DPDPs); (2) to select PDPs or DPDPs with a high F‐measure, the machine learning methods are the best, with F‐measure of 0.817 (PDPs) and 0.789 (DPDPs); (3) existing sample selection strategies can be combined with the machine learning methods, and the precision of selecting PDPs can be increased by 6.39%–41.33% and the precision of selecting DPDPs can be increased by 35.50%–269.02%.
- Is Part Of:
- IET software. Volume 16:Issue 2(2022)
- Journal:
- IET software
- Issue:
- Volume 16:Issue 2(2022)
- Issue Display:
- Volume 16, Issue 2 (2022)
- Year:
- 2022
- Volume:
- 16
- Issue:
- 2
- Issue Sort Value:
- 2022-0016-0002-0000
- Page Start:
- 146
- Page End:
- 166
- Publication Date:
- 2021-11-25
- Subjects:
- data mining -- software engineering
Computer software -- Periodicals
Software engineering -- Periodicals
005.1
- Journal URLs:
- http://digital-library.theiet.org/content/journals/iet-sen
http://ieeexplore.ieee.org/servlet/opac?punumber=4124007
https://ietresearch.onlinelibrary.wiley.com/journal/17518814
http://www.theiet.org/
http://scitation.aip.org/dbt/dbt.jsp?KEY=ISEOB7&Volume=CURVOL&Issue=CURISS
- DOI:
- 10.1049/sfw2.12050
- Languages:
- English
- ISSNs:
- 1751-8806
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4363.253550
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 26358.xml