On the genealogy of machine learning datasets: A critical history of ImageNet. Issue 2 (July 2021)
- Record Type:
- Journal Article
- Title:
- On the genealogy of machine learning datasets: A critical history of ImageNet. Issue 2 (July 2021)
- Main Title:
- On the genealogy of machine learning datasets: A critical history of ImageNet
- Authors:
- Denton, Emily
Hanna, Alex
Amironesei, Razvan
Smart, Andrew
Nicole, Hilary
- Abstract:
- In response to growing concerns of bias, discrimination, and unfairness perpetuated by algorithmic systems, the datasets used to train and evaluate machine learning models have come under increased scrutiny. Many of these examinations have focused on the contents of machine learning datasets, finding glaring underrepresentation of minoritized groups. In contrast, relatively little work has been done to examine the norms, values, and assumptions embedded in these datasets. In this work, we conceptualize machine learning datasets as a type of informational infrastructure, and motivate a genealogy as method in examining the histories and modes of constitution at play in their creation. We present a critical history of ImageNet as an exemplar, utilizing critical discourse analysis of major texts around ImageNet's creation and impact. We find that assumptions around ImageNet and other large computer vision datasets more generally rely on three themes: the aggregation and accumulation of more data, the computational construction of meaning, and making certain types of data labor invisible. By tracing the discourses that surround this influential benchmark, we contribute to the ongoing development of the standards and norms around data development in machine learning and artificial intelligence research.
- Is Part Of:
- Big data & society. Volume 8:Issue 2(2021)
- Journal:
- Big data & society
- Issue:
- Volume 8:Issue 2(2021)
- Issue Display:
- Volume 8, Issue 2 (2021)
- Year:
- 2021
- Volume:
- 8
- Issue:
- 2
- Issue Sort Value:
- 2021-0008-0002-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-07
- Subjects:
- Machine learning -- artificial intelligence -- big data -- AI ethics -- algorithmic fairness -- genealogy
Big data -- Social aspects -- Periodicals
Social sciences -- Research -- Data processing -- Periodicals
Social sciences -- Research -- Methodology -- Periodicals
Data mining -- Periodicals
300.28557
- Journal URLs:
- http://bds.sagepub.com
http://www.uk.sagepub.com/home.nav
- DOI:
- 10.1177/20539517211035955
- Languages:
- English
- ISSNs:
- 2053-9517
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 19791.xml