What (not) to expect when classifying rare events. (16th November 2016)
- Record Type:
- Journal Article
- Title:
- What (not) to expect when classifying rare events. (16th November 2016)
- Main Title:
- What (not) to expect when classifying rare events
- Authors:
- Blagus, Rok
Goeman, Jelle J
- Abstract:
- When building classifiers, it is natural to require that the classifier correctly estimates the event probability (Constraint 1), that it has equal sensitivity and specificity (Constraint 2) or that it has equal positive and negative predictive values (Constraint 3). We prove that in the balanced case, where there is equal proportion of events and non-events, any classifier that satisfies one of these constraints will always satisfy all. Such unbiasedness of events and non-events is much more difficult to achieve in the case of rare events, i.e. the situation in which the proportion of events is (much) smaller than 0.5. Here, we prove that it is impossible to meet all three constraints unless the classifier achieves perfect predictions. Any non-perfect classifier can only satisfy at most one constraint, and satisfying one constraint implies violating the other two constraints in a specific direction. Our results have implications for classifiers optimized using g-means or F1-measure, which tend to satisfy Constraints 2 and 1, respectively. Our results are derived from basic probability theory and illustrated with simulations based on some frequently used classifiers.
- Is Part Of:
- Briefings in bioinformatics. Volume 19:Number 2(2018:Mar.)
- Journal:
- Briefings in bioinformatics
- Issue:
- Volume 19:Number 2(2018:Mar.)
- Issue Display:
- Volume 19, Issue 2 (2018)
- Year:
- 2018
- Volume:
- 19
- Issue:
- 2
- Issue Sort Value:
- 2018-0019-0002-0000
- Page Start:
- 341
- Page End:
- 349
- Publication Date:
- 2016-11-16
- Subjects:
- prediction models -- rare events -- optimization -- overestimation -- balanced sensitivity and specificity -- g-means
Genetics -- Data processing -- Periodicals
Molecular biology -- Data processing -- Periodicals
Genomes -- Data processing -- Periodicals
572.80285
- Journal URLs:
- http://bib.oxfordjournals.org
http://www.oxfordjournals.org/content?genre=journal&issn=1477-4054
http://ukcatalogue.oup.com/
http://firstsearch.oclc.org
- DOI:
- 10.1093/bib/bbw107
- Languages:
- English
- ISSNs:
- 1467-5463
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2283.958363
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14241.xml
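
The three constraints named in the abstract can be checked numerically on a confusion matrix. The sketch below is illustrative only and is not taken from the article: the function names `summarize` and `constraints_met` and the example counts are assumptions, and Constraint 1 is read here as "the predicted event rate equals the observed event rate". It shows a balanced example in which all three constraints hold together, and a rare-event example in which Constraint 1 holds while Constraints 2 and 3 fail.

```python
from math import isclose


def summarize(tp, fp, fn, tn):
    """Basic rates from confusion-matrix counts (hypothetical helper)."""
    n = tp + fp + fn + tn
    return {
        "event_rate": (tp + fn) / n,       # observed proportion of events
        "predicted_rate": (tp + fp) / n,   # proportion classified as events
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),             # positive predictive value
        "npv": tn / (tn + fn),             # negative predictive value
    }


def constraints_met(s):
    """Return (Constraint 1, Constraint 2, Constraint 3) as booleans."""
    return (
        isclose(s["predicted_rate"], s["event_rate"]),
        isclose(s["sensitivity"], s["specificity"]),
        isclose(s["ppv"], s["npv"]),
    )


# Made-up counts: 50% events in the first case, 10% events in the second.
balanced = summarize(tp=40, fp=10, fn=10, tn=40)
rare = summarize(tp=5, fp=5, fn=5, tn=85)

print(constraints_met(balanced))  # (True, True, True)
print(constraints_met(rare))      # (True, False, False)
```

In the rare-event example the classifier predicts the correct event rate (0.1), yet sensitivity and positive predictive value are both 0.5 while specificity and negative predictive value are both about 0.94, which matches the abstract's statement that a non-perfect classifier can satisfy at most one constraint.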