A novel hybrid undersampling method for mining unbalanced datasets in banking and insurance. (January 2015)
- Record Type:
- Journal Article
- Title:
- A novel hybrid undersampling method for mining unbalanced datasets in banking and insurance. (January 2015)
- Main Title:
- A novel hybrid undersampling method for mining unbalanced datasets in banking and insurance
- Authors:
- Sundarkumar, G. Ganesh
Ravi, Vadlamani
- Abstract:
- In this paper, we propose a novel hybrid approach for rectifying the data imbalance problem by employing k Reverse Nearest Neighborhood and one-class support vector machine (OCSVM) in tandem. We mined an automobile insurance fraud detection dataset and a customer credit card churn prediction dataset to demonstrate the effectiveness of the proposed model. Throughout the paper, we followed 10-fold cross-validation, testing with Decision Tree (DT), Support Vector Machine (SVM), Logistic Regression (LR), Probabilistic Neural Network (PNN), Group Method of Data Handling (GMDH), and Multi-Layer Perceptron (MLP). We observed that DT and SVM yielded high sensitivity of 90.74% and 91.89%, respectively, on the insurance dataset, and that DT, SVM, and GMDH produced high sensitivity of 91.2%, 87.7%, and 83.1%, respectively, on the credit card churn prediction dataset. For the insurance fraud detection dataset, we found no statistically significant difference between DT (J48) and SVM. As DT yields "if-then" rules, we prefer DT over SVM. Further, for the churn prediction dataset, GMDH, SVM, and LR turned out not to be statistically different, and GMDH yielded a very high area under the ROC curve (AUC). Further, DT yielded just 4 "if-then" rules on the insurance dataset and 10 rules on the churn prediction dataset, which is a significant outcome of the study.
- Is Part Of:
- Engineering applications of artificial intelligence. Volume 37(2015:Jan.)
- Journal:
- Engineering applications of artificial intelligence
- Issue:
- Volume 37(2015:Jan.)
- Issue Display:
- Volume 37 (2015)
- Year:
- 2015
- Volume:
- 37
- Issue Sort Value:
- 2015-0037-0000-0000
- Page Start:
- 368
- Page End:
- 377
- Publication Date:
- 2015-01
- Subjects:
- Insurance fraud detection -- Credit card churn prediction -- Undersampling -- K-Reverse Nearest Neighbourhood method -- One-class support vector machine
Engineering -- Data processing -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Ingénierie -- Informatique -- Périodiques
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
Artificial intelligence
Engineering -- Data processing
Expert systems (Computer science)
Periodicals
620.00285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09521976
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.engappai.2014.09.019
- Languages:
- English
- ISSNs:
- 0952-1976
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3755.704500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14580.xml
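
The abstract names k Reverse Nearest Neighborhood as one half of the hybrid undersampling approach but does not spell out the procedure. As a rough illustration only, the sketch below shows the general idea of reverse-nearest-neighbour counting: a point that appears in no other point's k-nearest-neighbour list is treated as isolated and dropped. The function names, the `min_count` threshold, and the toy data are hypothetical, and the one-class SVM stage is not shown; this should not be read as the authors' implementation.

```python
from math import dist  # Euclidean distance, Python 3.8+


def reverse_neighbor_counts(points, k):
    """For each point, count how many other points have it among their k nearest neighbours."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Other points ordered by distance from point i (ties broken by index).
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (dist(points[i], points[j]), j),
        )
        for j in others[:k]:
            counts[j] += 1
    return counts


def drop_isolated(points, k, min_count=1):
    """Keep only points with at least min_count reverse neighbours (assumed threshold)."""
    counts = reverse_neighbor_counts(points, k)
    return [p for p, c in zip(points, counts) if c >= min_count]


# Toy "majority class" sample: four clustered points and one far-away point.
majority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
counts = reverse_neighbor_counts(majority, k=2)
kept = drop_isolated(majority, k=2)
```

In this toy sample the far-away point is nobody's nearest neighbour, so it is the only one removed. The brute-force search is quadratic in the number of points, which is fine for illustration but would need an index structure on datasets of realistic size.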