A tutorial on generative adversarial networks with application to classification of imbalanced data. (31st December 2021)
- Record Type:
- Journal Article
- Title:
- A tutorial on generative adversarial networks with application to classification of imbalanced data. (31st December 2021)
- Main Title:
- A tutorial on generative adversarial networks with application to classification of imbalanced data
- Authors:
- Huang, Yuxiao
Fields, Kara G.
Ma, Yan
- Abstract:
- A challenge unique to classification model development is imbalanced data. In a binary classification problem, class imbalance occurs when one class, the minority group, contains significantly fewer samples than the other class, the majority group. In imbalanced data, the minority class is often the class of interest (e.g., patients with disease). However, when training a classifier on imbalanced data, the model will exhibit bias towards the majority class and, in extreme cases, may ignore the minority class completely. A common strategy for addressing class imbalance is data augmentation. However, traditional data augmentation methods are associated with overfitting, where the model is fit to the noise in the data. In this tutorial we introduce an advanced method for data augmentation: generative adversarial networks (GANs). The advantages of GANs over traditional data augmentation methods are illustrated using the Breast Cancer Wisconsin study. To promote the adoption of GANs for data augmentation, we present an end‐to‐end pipeline that encompasses the complete life cycle of a machine learning project along with alternatives and good practices both in the paper and in a separate video. Our code, data, full results and video tutorial are publicly available in the paper's GitHub repository (https://github.com/yuxiaohuang/research/tree/master/gwu/accepted/sam_2021).
- Is Part Of:
- Statistical analysis and data mining. Volume 15:Number 5(2022)
- Journal:
- Statistical analysis and data mining
- Issue:
- Volume 15:Number 5(2022)
- Issue Display:
- Volume 15, Issue 5 (2022)
- Year:
- 2022
- Volume:
- 15
- Issue:
- 5
- Issue Sort Value:
- 2022-0015-0005-0000
- Page Start:
- 543
- Page End:
- 552
- Publication Date:
- 2021-12-31
- Subjects:
- class imbalance -- classification -- data augmentation -- generative adversarial networks -- machine learning
Data mining -- Statistical methods -- Periodicals
006.312
- Journal URLs:
- http://www3.interscience.wiley.com/journal/112701062/home
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/sam.11570
- Languages:
- English
- ISSNs:
- 1932-1864
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8447.424100
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 23293.xml