The risk of racial bias while tracking influenza-related content on social media using machine learning. (23rd January 2021)

Record Type:: Journal Article
Title:: The risk of racial bias while tracking influenza-related content on social media using machine learning. (23rd January 2021)
Main Title:: The risk of racial bias while tracking influenza-related content on social media using machine learning
Authors:: Lwowski, Brandon
Rios, Anthony
Abstract:: Objective: Machine learning is used to understand and track influenza-related content on social media. Because these systems are used at scale, they have the potential to adversely impact the people they are built to help. In this study, we explore the biases of different machine learning methods for the specific task of detecting influenza-related content. We compare the performance of each model on tweets written in Standard American English (SAE) vs African American English (AAE). Materials and Methods: Two influenza-related datasets are used to train 3 text classification models (support vector machine, convolutional neural network, bidirectional long short-term memory) with different feature sets. The datasets match real-world scenarios in which there is a large imbalance between SAE and AAE examples. The number of AAE examples for each class ranges from 2% to 5% in both datasets. We also evaluate each model's performance using a balanced dataset via undersampling. Results: We find that all of the tested machine learning methods are biased on both datasets. The difference in false positive rates between SAE and AAE examples ranges from 0.01 to 0.35. The difference in the false negative rates ranges from 0.01 to 0.23. We also find that the neural network methods generally have more unfair results than the linear support vector machine on the chosen datasets. Conclusions: The models that result in the most unfair predictions may vary from dataset to dataset. …
Is Part Of:: Journal of the American Medical Informatics Association. Volume 28:Number 4(2021)
Journal:: Journal of the American Medical Informatics Association
Issue:: Volume 28:Number 4(2021)
Issue Display:: Volume 28, Issue 4 (2021)
Year:: 2021
Volume:: 28
Issue:: 4
Issue Sort Value:: 2021-0028-0004-0000
Page Start:: 839
Page End:: 849
Publication Date:: 2021-01-23
Subjects:: deep learning -- classification -- machine learning -- social network -- fairness
Medical informatics -- Periodicals
Information Services -- Periodicals
Medical Informatics -- Periodicals
Médecine -- Informatique -- Périodiques
Informatica
Geneeskunde
Informatique médicale
Computer network resources
Electronic journals
610.285
Journal URLs:: http://jamia.bmj.com/
http://www.jamia.org
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=76
http://www.sciencedirect.com/science/journal/10675027
http://jamia.oxfordjournals.org/
http://www.oxfordjournals.org/en/
DOI:: 10.1093/jamia/ocaa326
Languages:: English
ISSNs:: 1067-5027
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 4689.025000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
Ingest File:: 15955.xml
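
Note:: The abstract reports bias as the difference in false positive and false negative rates between SAE and AAE examples. The following is a minimal sketch of how such gaps could be computed for a binary classifier, not the authors' code. The function names (`error_rates`, `rate_gaps`), the 0/1 label encoding, and the toy data are all assumptions made for illustration; the figures it produces are unrelated to the results reported in the article.

```python
from collections import Counter


def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate) for 0/1 labels."""
    counts = Counter(zip(y_true, y_pred))
    tp, fn = counts[(1, 1)], counts[(1, 0)]
    fp, tn = counts[(0, 1)], counts[(0, 0)]
    # Guard against groups with no negatives or no positives.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


def rate_gaps(y_true, y_pred, groups, group_a="SAE", group_b="AAE"):
    """Absolute FPR and FNR differences between two dialect groups."""
    rates = {}
    for g in (group_a, group_b):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        rates[g] = error_rates([y_true[i] for i in idx],
                               [y_pred[i] for i in idx])
    fpr_gap = abs(rates[group_a][0] - rates[group_b][0])
    fnr_gap = abs(rates[group_a][1] - rates[group_b][1])
    return fpr_gap, fnr_gap


# Hypothetical toy data: 1 = influenza-related, 0 = not influenza-related.
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0, 0, 1]
groups = ["SAE"] * 4 + ["AAE"] * 4

fpr_gap, fnr_gap = rate_gaps(y_true, y_pred, groups)
print(f"FPR gap: {fpr_gap:.2f}, FNR gap: {fnr_gap:.2f}")
```

A gap of 0 on both measures would mean the classifier makes each kind of error at the same rate for both groups; larger gaps indicate that one group bears more of the errors.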