Improving Imbalanced Machine Learning with Neighborhood-Informed Synthetic Sample Placement. Issue 4 (2nd October 2022)
- Record Type:
- Journal Article
- Title:
- Improving Imbalanced Machine Learning with Neighborhood-Informed Synthetic Sample Placement. Issue 4 (2nd October 2022)
- Main Title:
- Improving Imbalanced Machine Learning with Neighborhood-Informed Synthetic Sample Placement
- Authors:
- Nasir, Murtaza
Dag, Ali
Simsek, Serhat
Ivanov, Anton
Oztekin, Asil
- Abstract:
- Machine learning is widely used in information systems design. Yet, training algorithms on imbalanced datasets may severely affect performance on unseen data. For example, in some cases in healthcare, fintech, or cybersecurity contexts, certain subclasses are difficult to learn because they are underrepresented in training data. Our study offers a flexible and efficient solution based on a new synthetic average neighborhood sampling algorithm (SANSA), which, in contrast to other solutions, introduces a novel "placement" parameter that can be tuned to adapt to each dataset's unique manifestation of the imbalance. This package can be downloaded for R. We tested SANSA against seven existing sampling methods used in conjunction with the four most frequently used machine learning models trained on 14 benchmark datasets. Our results provide suggestive evidence that SANSA offers a feasible solution to the imbalance problem for most datasets. Our findings provide practical recommendations for how SANSA can be effectively implemented while reducing the complexity level of an imbalanced learning pipeline.
- Is Part Of:
- Journal of management information systems. Volume 39:Issue 4(2022)
- Journal:
- Journal of management information systems
- Issue:
- Volume 39:Issue 4(2022)
- Issue Display:
- Volume 39, Issue 4 (2022)
- Year:
- 2022
- Volume:
- 39
- Issue:
- 4
- Issue Sort Value:
- 2022-0039-0004-0000
- Page Start:
- 1116
- Page End:
- 1145
- Publication Date:
- 2022-10-02
- Subjects:
- Imbalanced data -- oversampling -- undersampling -- machine learning -- predictive analytics -- classification prediction performance -- algorithm training
Management information systems -- Periodicals
Management information systems
Periodicals
658.4038011
- Journal URLs:
- http://www.tandfonline.com/loi/mmis20#.V2kZarn2bcs
http://www.jstor.org/journals/07421222.html
http://www.tandfonline.com/
http://firstsearch.oclc.org/journal=0742-1222;screen=info;ECOIP
- DOI:
- 10.1080/07421222.2022.2127453
- Languages:
- English
- ISSNs:
- 0742-1222
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 5011.350000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital Store
- Ingest File:
- 24791.xml