An Earth mover's distance-based undersampling approach for handling class-imbalanced data. (26th August 2020)
- Record Type:
- Journal Article
- Title:
- An Earth mover's distance-based undersampling approach for handling class-imbalanced data. (26th August 2020)
- Main Title:
- An Earth mover's distance-based undersampling approach for handling class-imbalanced data
- Authors:
- Rekha, Gillala
Reddy, V. Krishna
Tyagi, Amit Kumar
- Abstract:
- Imbalanced datasets typically make accurate prediction difficult. Most real-world data are imbalanced in nature. Traditional classifiers assume a well-balanced class distribution in the training data, but practical datasets are often imbalanced, which misleads a classifier and degrades its ability to learn from such data. Data pre-processing approaches address this concern by using either random undersampling or oversampling techniques. In this paper, we introduce Earth mover's distance (EMD) as a similarity measure to find samples that are similar in nature and eliminate them from the dataset as redundant. Earth mover's distance has received a lot of attention in areas such as computer vision, image retrieval and machine learning. The Earth mover's distance-based undersampling approach provides a data-level solution that eliminates redundant instances among the majority samples without any loss of valuable information. The method is implemented with five conventional classifiers, namely C4.5 decision tree (DT), k-nearest neighbour (k-NN), multilayer perceptron (MLP), support vector machine (SVM) and naive Bayes (NB), and one ensemble technique, AdaBoost. The proposed method yields superior performance on 21 datasets from the KEEL repository.
- Is Part Of:
- International journal of intelligent information and database systems. Volume 13:Number 2/4(2020)
- Journal:
- International journal of intelligent information and database systems
- Issue:
- Volume 13:Number 2/4(2020)
- Issue Display:
- Volume 13, Issue 2, Part 4 (2020)
- Year:
- 2020
- Volume:
- 13
- Issue:
- 2
- Part:
- 4
- Issue Sort Value:
- 2020-0013-0002-0004
- Page Start:
- 376
- Page End:
- 392
- Publication Date:
- 2020-08-26
- Subjects:
- class imbalance -- classification -- data pre-processing -- sampling technique -- Earth mover's distance -- EMD
Database management -- Computer programs -- Periodicals
Information retrieval -- Computer programs -- Periodicals
Information storage and retrieval systems -- Computer programs -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Intelligent agents (Computer software) -- Periodicals
006.33
- Journal URLs:
- http://www.inderscience.com/jhome.php?jcode=ijiids
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 1751-5858
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 13952.xml
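
The abstract describes using Earth mover's distance as a similarity measure to drop redundant majority-class samples. The record does not give the authors' algorithm, so the following is only a minimal sketch under stated assumptions: each sample's feature vector is treated as a one-dimensional empirical distribution, and a greedy pass keeps a majority sample only if its EMD to every sample already kept exceeds a cutoff. The names `emd_1d`, `emd_undersample` and `threshold` are hypothetical and do not come from the paper.

```python
def emd_1d(a, b):
    """EMD between two equal-length samples viewed as 1-D empirical
    distributions with uniform weights: mean absolute difference of the
    sorted values. Note this ignores feature order (an assumption of
    this sketch, not of the paper)."""
    if len(a) != len(b):
        raise ValueError("samples must have the same number of features")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def emd_undersample(majority, threshold):
    """Return indices of majority samples to keep.

    Greedy rule (hypothetical): a sample is kept only if it is farther
    than `threshold` (in EMD) from every sample kept so far; otherwise
    it is treated as redundant and dropped.
    """
    kept = []
    for i, row in enumerate(majority):
        if all(emd_1d(row, majority[j]) > threshold for j in kept):
            kept.append(i)
    return kept


# Toy majority class: rows 0 and 1 are near-duplicates, as are rows 2 and 3.
majority = [
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.05],
    [5.0, 6.0, 7.0],
    [5.0, 6.0, 7.0],
]
kept = emd_undersample(majority, threshold=0.1)
print(kept)
```

The retained rows would then be combined with the untouched minority class before training a classifier. The cutoff value and the greedy ordering are illustrative choices; the published method may select redundant samples differently.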