Revisiting performance metrics for prediction with rare outcomes. (October 2021)
- Record Type:
- Journal Article
- Title:
- Revisiting performance metrics for prediction with rare outcomes. (October 2021)
- Main Title:
- Revisiting performance metrics for prediction with rare outcomes
- Authors:
- Adhikari, Samrachana
Normand, Sharon-Lise
Bloom, Jordan
Shahian, David
Rose, Sherri
- Abstract:
- Machine learning algorithms are increasingly used in the clinical literature, claiming advantages over logistic regression. However, they are generally designed to maximize the area under the receiver operating characteristic curve. While area under the receiver operating characteristic curve and other measures of accuracy are commonly reported for evaluating binary prediction problems, these metrics can be misleading. We aim to give clinical and machine learning researchers a realistic medical example of the dangers of relying on a single measure of discriminatory performance to evaluate binary prediction questions. Prediction of medical complications after surgery is a frequent but challenging task because many post-surgery outcomes are rare. We predicted post-surgery mortality among patients in a clinical registry who received at least one aortic valve replacement. Estimation incorporated multiple evaluation metrics and algorithms typically regarded as performing well with rare outcomes, as well as an ensemble and a new extension of the lasso for multiple unordered treatments. Results demonstrated high accuracy for all algorithms with moderate measures of cross-validated area under the receiver operating characteristic curve. False positive rates were < 1%, however, true positive rates were < 7%, even when paired with a 100% positive predictive value, and graphical representations of calibration were poor. Similar results were seen in simulations, with the addition of high area under the receiver operating characteristic curve (> 90%) accompanying low true positive rates. Clinical studies should not primarily report only area under the receiver operating characteristic curve or accuracy.
- Is Part Of:
- Statistical methods in medical research. Volume 30:Number 10(2021)
- Journal:
- Statistical methods in medical research
- Issue:
- Volume 30:Number 10(2021)
- Issue Display:
- Volume 30, Issue 10 (2021)
- Year:
- 2021
- Volume:
- 30
- Issue:
- 10
- Issue Sort Value:
- 2021-0030-0010-0000
- Page Start:
- 2352
- Page End:
- 2366
- Publication Date:
- 2021-10
- Subjects:
- Prediction -- classification -- machine learning -- ensembles -- mortality
Medicine -- Research -- Statistical methods -- Periodicals
Research -- Periodicals
Review Literature -- Periodicals
Statistics -- methods -- Periodicals
Médecine -- Recherche -- Méthodes statistiques -- Périodiques
610.727
- Journal URLs:
- http://smm.sagepub.com/
http://www.ingentaselect.com/rpsv/cw/arn/09622802/contp1.htm
http://www.uk.sagepub.com/home.nav
http://firstsearch.oclc.org
http://firstsearch.oclc.org/journal=0962-2802;screen=info;ECOIP
- DOI:
- 10.1177/09622802211038754
- Languages:
- English
- ISSNs:
- 0962-2802
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 17371.xml
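
The pattern described in the abstract, where high accuracy, a false positive rate below 1% and a 100% positive predictive value coexist with a true positive rate below 7%, can be reproduced with simple arithmetic on a confusion matrix. The sketch below is illustrative only: the counts (10,000 patients, 300 deaths, 20 flagged cases) are hypothetical and are not taken from the article or its registry data, and the names `ConfusionCounts`, `safe_ratio` and `summarize` are invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ConfusionCounts:
    """Cells of a 2x2 confusion matrix for a binary outcome."""
    tp: int  # events correctly flagged
    fp: int  # non-events incorrectly flagged
    fn: int  # events missed
    tn: int  # non-events correctly left unflagged


def safe_ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def summarize(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    """Compute the threshold-based metrics named in the abstract."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "accuracy": safe_ratio(c.tp + c.tn, total),
        "true_positive_rate": safe_ratio(c.tp, c.tp + c.fn),
        "false_positive_rate": safe_ratio(c.fp, c.fp + c.tn),
        "positive_predictive_value": safe_ratio(c.tp, c.tp + c.fp),
    }


# Hypothetical rare-outcome scenario (NOT data from the article):
# 10,000 patients, 300 deaths (3%), and a classifier that flags only
# 20 patients, all of whom did die.
counts = ConfusionCounts(tp=20, fp=0, fn=280, tn=9700)
metrics = summarize(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical case the classifier is right 97.2% of the time and never raises a false alarm, yet it identifies only 20 of the 300 deaths. Accuracy is dominated by the large number of true negatives, which is why it says little on its own about how many rare events are detected.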