Development of a global infectious disease activity database using natural language processing, machine learning, and human expertise. (30th July 2019)
- Record Type:
- Journal Article
- Title:
- Development of a global infectious disease activity database using natural language processing, machine learning, and human expertise. (30th July 2019)
- Main Title:
- Development of a global infectious disease activity database using natural language processing, machine learning, and human expertise
- Authors:
- Feldman, Joshua
Thomas-Bachli, Andrea
Forsyth, Jack
Patel, Zaki Hasnain
Khan, Kamran
- Abstract:
- Objective: We assessed whether machine learning can be utilized to allow efficient extraction of infectious disease activity information from online media reports. Materials and Methods: We curated a data set of labeled media reports (n = 8322) indicating which articles contain updates about disease activity. We trained a classifier on this data set. To validate our system, we used a held-out test set and compared our articles to the World Health Organization Disease Outbreak News reports. Results: Our classifier achieved a recall and precision of 88.8% and 86.1%, respectively. The overall surveillance system detected 94% of the outbreaks identified by the WHO covered by online media (89%) and did so 43.4 (IQR: 9.5–61) days earlier on average. Discussion: We constructed a global real-time disease activity database surveilling 114 illnesses and syndromes. We must further assess our system for bias, representativeness, granularity, and accuracy. Conclusion: Machine learning, natural language processing, and human expertise can be used to efficiently identify disease activity from digital media reports.
- Is Part Of:
- Journal of the American Medical Informatics Association. Volume 26:Number 11(2019)
- Journal:
- Journal of the American Medical Informatics Association
- Issue:
- Volume 26:Number 11(2019)
- Issue Display:
- Volume 26, Issue 11 (2019)
- Year:
- 2019
- Volume:
- 26
- Issue:
- 11
- Issue Sort Value:
- 2019-0026-0011-0000
- Page Start:
- 1355
- Page End:
- 1359
- Publication Date:
- 2019-07-30
- Subjects:
- machine learning -- public health surveillance -- communicable diseases -- internet -- health information systems
Medical informatics -- Periodicals
Information Services -- Periodicals
Medical Informatics -- Periodicals
Médecine -- Informatique -- Périodiques
Informatica
Geneeskunde
Informatique médicale
Computer network resources
Electronic journals
610.285
- Journal URLs:
- http://jamia.bmj.com/
http://www.jamia.org
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=76
http://www.sciencedirect.com/science/journal/10675027
http://jamia.oxfordjournals.org/
http://www.oxfordjournals.org/en/
- DOI:
- 10.1093/jamia/ocz112
- Languages:
- English
- ISSNs:
- 1067-5027
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4689.025000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 15142.xml