Decisions are not all equal—Introducing a utility metric based on case-wise raters' perceptions. (June 2022)
- Record Type:
- Journal Article
- Title:
- Decisions are not all equal—Introducing a utility metric based on case-wise raters' perceptions. (June 2022)
- Main Title:
- Decisions are not all equal—Introducing a utility metric based on case-wise raters' perceptions
- Authors:
- Campagner, Andrea
Sternini, Federico
Cabitza, Federico
- Abstract:
- Highlights: We propose a human-centred utility metric for the evaluation of medical AI systems. We show that the proposed metric generalizes existing proposals in the literature. We illustrate the application of our metric in three realistic medical case studies. We make the point for a human-centred approach to assess the impact of AI models.
Abstract: Background and Objective: Evaluation of AI-based decision support systems (AI-DSS) is of critical importance in practical applications; nonetheless, common evaluation metrics fail to properly consider relevant and contextual information. In this article we discuss a novel utility metric, the weighted Utility (wU), for the evaluation of AI-DSS, which is based on the raters' perceptions of their annotation hesitation and of the relevance of the training cases. Methods: We discuss the relationship between the proposed metric and other previous proposals, and we describe the application of the proposed metric for both model evaluation and optimization, through three realistic case studies. Results: We show that our metric generalizes the well-known Net Benefit, as well as other common error-based and utility-based metrics. Through the empirical studies, we show that our metric can provide a more flexible tool for the evaluation of AI models. We also show that, compared to other optimization metrics, model optimization based on the wU can provide significantly better performance (AUC 0.862 vs 0.895, p-value < 0.05), especially on cases judged to be more complex by the human annotators (AUC 0.85 vs 0.92, p-value < 0.05). Conclusions: We make the point for having utility as a primary concern in the evaluation and optimization of machine learning models in critical domains, like the medical one, and for the importance of a human-centred approach to assess the potential impact of AI models on human decision making, also on the basis of further information that can be collected during the ground-truthing process.
- Is Part Of:
- Computer methods and programs in biomedicine. Volume 221(2022)
- Journal:
- Computer methods and programs in biomedicine
- Issue:
- Volume 221(2022)
- Issue Display:
- Volume 221, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 221
- Issue:
- 2022
- Issue Sort Value:
- 2022-0221-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- Utility -- Machine learning -- Validation -- Evaluation -- Human-centred -- Health informatics
Medicine -- Computer programs -- Periodicals
Biology -- Computer programs -- Periodicals
Computers -- Periodicals
Medicine -- Periodicals
Médecine -- Logiciels -- Périodiques
Biologie -- Logiciels -- Périodiques
Biology -- Computer programs
Medicine -- Computer programs
Periodicals
Electronic journals
610.28
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01692607
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cmpb.2022.106930
- Languages:
- English
- ISSNs:
- 0169-2607
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.095000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22255.xml