A hybrid under-sampling approach for mining unbalanced datasets: applications to banking and insurance. (1st January 2011)
- Record Type:
- Journal Article
- Title:
- A hybrid under-sampling approach for mining unbalanced datasets: applications to banking and insurance. (1st January 2011)
- Main Title:
- A hybrid under-sampling approach for mining unbalanced datasets: applications to banking and insurance
- Authors:
- Vasu, Madireddi
Ravi, Vadlamani
- Abstract:
- In unbalanced classification problems, machine learning algorithms are overwhelmed by the majority class and consequently misclassify minority class observations. Here, we propose a hybrid under-sampling approach to improve the performance of classifiers. The proposed approach first employs the k-reverse nearest neighbour (kRNN) method to detect outliers in the majority class. After the outliers are removed, K-means clustering is used to select K clusters, further reducing the influence of the majority class. We then employ support vector machine (SVM), logistic regression (LR), multi-layer perceptron (MLP), radial basis function network (RBF), group method of data handling (GMDH), genetic programming (GP) and decision tree (J48) classifiers. The effectiveness of the proposed approach is demonstrated on datasets from insurance fraud detection and credit card churn in the banking domain, using ten-fold cross-validation. The proposed approach is observed to improve the performance of the classifiers.
- Is Part Of:
- International journal of data mining, modelling and management. Volume 3:Number 1(2011)
- Journal:
- International journal of data mining, modelling and management
- Issue:
- Volume 3:Number 1(2011)
- Issue Display:
- Volume 3, Issue 1 (2011)
- Year:
- 2011
- Volume:
- 3
- Issue:
- 1
- Issue Sort Value:
- 2011-0003-0001-0000
- Page Start:
- 75
- Page End:
- 105
- Publication Date:
- 2011-01-01
- Subjects:
- insurance fraud detection -- credit card churn prediction -- data mining -- unbalanced datasets -- machine learning algorithms
Data mining -- Periodicals
Information science -- Periodicals
Databases -- Periodicals
005.7
- Journal URLs:
- http://www.inderscience.com/jhome.php?jcode=ijdmmm
http://www.inderscience.com/
- DOI:
- 10.1504/IJDMMM.2011.038812
- Languages:
- English
- ISSNs:
- 1759-1163
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 5604.xml
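- Illustrative sketch:
- The two under-sampling stages named in the abstract (kRNN outlier removal, then K-means reduction of the majority class) might be sketched as below. This is a minimal illustration and not the authors' implementation: the abstract does not give the outlier rule or say how clusters are turned into training points, so the threshold `min_count`, the use of cluster centroids as representatives, and all function names are assumptions made here.

```python
import math
import random


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def krnn_counts(points, k):
    """For each point, count how many other points list it among their k nearest."""
    counts = [0] * len(points)
    for i in range(len(points)):
        for j in knn_indices(points, i, k):
            counts[j] += 1
    return counts


def remove_outliers(points, k, min_count=1):
    """Assumed rule: a point with fewer than min_count reverse neighbours is an outlier."""
    counts = krnn_counts(points, k)
    return [p for p, c in zip(points, counts) if c >= min_count]


def kmeans_centroids(points, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns the cluster centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, n_clusters)
    for _ in range(n_iter):
        clusters = [[] for _ in range(n_clusters)]
        for p in points:
            nearest = min(range(n_clusters),
                          key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[idx]
            for idx, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def hybrid_undersample(majority, k, n_clusters):
    """Stage 1: drop kRNN outliers. Stage 2: replace the rest by K centroids (assumption)."""
    kept = remove_outliers(majority, k)
    return kmeans_centroids(kept, n_clusters)


# Toy majority class: two tight groups plus one isolated point.
majority = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11),
            (100, 100)]
reduced_majority = hybrid_undersample(majority, k=2, n_clusters=2)
```

The reduced majority set would then be combined with the untouched minority class before training any of the classifiers listed in the abstract. A library implementation of K-means could replace the hand-written loop; the pure-Python version is used here only to keep the sketch self-contained.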