1344. Predicting Measles Outbreaks in the United States: Application of Different Modeling Approaches. (4th December 2021)
- Record Type:
- Journal Article
- Title:
- 1344. Predicting Measles Outbreaks in the United States: Application of Different Modeling Approaches. (4th December 2021)
- Main Title:
- 1344. Predicting Measles Outbreaks in the United States: Application of Different Modeling Approaches
- Authors:
- Kujawski, Stephanie
Ru, Boshu
Das, Amar K
Afanador, Nelson L
Baumgartner, Richard
Liu, Zhiwen
Lu, Shuang
Pillsbury, Matthew
Lewnard, Joseph
Conway, James H
Pawaskar, Manjiri D
- Abstract:
- Background: Although measles is still rare in the United States (U.S.), there have been recent resurgent outbreaks in the U.S. To improve the accuracy of prediction given the rarity of measles events, we used machine learning (ML) algorithms to model measles case predictions at the U.S. county level.
Methods: The main outcome was occurrence of ≥1 measles case at the U.S. county level. Two ML prediction models were developed (HDBSCAN, a clustering algorithm, and XGBoost, a gradient boosting algorithm) and compared with traditional logistic regression. We included 28 predictors in the following categories: sociodemographics, population statistics, measles vaccination coverage, healthcare access, and exposure to measles via international air travel. The models were trained on 2014 case data and validated on 2018 case data. Models were compared using area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), and F2 score (combined measure of sensitivity and PPV).
Results: There were 667 measles cases in 2014 and 375 in 2018 in the U.S. We identified U.S. counties for 635 (95.2%) cases in 2014 and 366 (97.6%) cases in 2018 through published sources, corresponding to 81/3143 (2.6%) counties in 2014 and 64/3143 (2.0%) counties in 2018 with ≥1 measles case. HDBSCAN had the highest sensitivity (0.92), but lowest AUC (0.68) and PPV (0.04) (Table). XGBoost had the highest F2 score (0.49), best balance of sensitivity (0.72) and specificity (0.94), and AUC = 0.92. Logistic regression had high AUC (0.91) and specificity (1.00) but the lowest sensitivity (0.16).
Conclusion: Machine learning approaches outperformed logistic regression by maximizing sensitivity to predict counties with measles cases, an important criterion to consider to prevent or prepare for future outbreaks. XGBoost or logistic regression could be considered to maximize specificity. Prioritizing sensitivity versus specificity may depend on county resources, priorities, and measles risk. Different modeling approaches could be considered to optimize surveillance efforts and develop effective interventions for timely response.
Disclosures: Stephanie Kujawski, PhD MPH, Merck & Co., Inc. (Employee, Shareholder); Boshu Ru, Ph.D., Merck & Co., Kenilworth, NJ (NYSE: MRK) (Employee, Shareholder); Amar K. Das, MD, PhD, Merck (Employee); Richard Baumgartner, PhD, Merck (Employee); Shuang Lu, MBA, MS, Merck (Employee); Matthew Pillsbury, PhD, Merck & Co. (Employee, Shareholder); Joseph Lewnard, PhD, Merck (Consultant, Grant/Research Support); James H. Conway, MD, FAAP, GSK (Advisor or Review Panel member), Merck (Advisor or Review Panel member), Moderna (Advisor or Review Panel member), Pfizer (Advisor or Review Panel member), Sanofi Pasteur (Research Grant or Support); Manjiri D. Pawaskar, PhD, Merck & Co., Inc. (Employee, Shareholder) …
- Is Part Of:
- Open forum infectious diseases. Volume 8(2021)Supplement 1
- Journal:
- Open forum infectious diseases
- Issue:
- Volume 8(2021)Supplement 1
- Issue Display:
- Volume 8, Issue 1 (2021)
- Year:
- 2021
- Volume:
- 8
- Issue:
- 1
- Issue Sort Value:
- 2021-0008-0001-0000
- Page Start:
- S759
- Page End:
- S759
- Publication Date:
- 2021-12-04
- Subjects:
- Communicable diseases -- Periodicals
Medical microbiology -- Periodicals
Infection -- Periodicals
616.9
- Journal URLs:
- http://ofid.oxfordjournals.org/
http://www.oxfordjournals.org/en/
- DOI:
- 10.1093/ofid/ofab466.1536
- Languages:
- English
- ISSNs:
- 2328-8957
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21267.xml
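
The abstract describes the F2 score only as a "combined measure of sensitivity and PPV". A minimal sketch of how such a score is commonly computed follows, assuming the standard F-beta formula with beta = 2 (the abstract does not confirm that this exact formula was used). The function names `f_beta` and `share` are hypothetical, and the inputs are the rounded values quoted in the abstract, so the computed HDBSCAN score is illustrative rather than a reported result.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Weighted harmonic mean of precision (PPV) and recall (sensitivity).

    With beta = 2, recall is weighted more heavily than precision.
    Assumption: this standard formula matches the abstract's "F2 score".
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def share(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as quoted in the abstract."""
    return round(100.0 * numerator / denominator, 1)


# HDBSCAN: PPV and sensitivity as rounded in the abstract (illustrative only).
hdbscan_f2 = f_beta(precision=0.04, recall=0.92)

# Counties with >=1 measles case, out of 3143 U.S. counties.
counties_2014 = share(81, 3143)
counties_2018 = share(64, 3143)

if __name__ == "__main__":
    print(f"HDBSCAN F2 from rounded inputs: {hdbscan_f2:.2f}")
    print(f"2014 counties with cases: {counties_2014}%")
    print(f"2018 counties with cases: {counties_2018}%")
```

Because beta = 2 weights sensitivity more heavily than PPV, a model with very high sensitivity but very low PPV, as reported for HDBSCAN, can still score poorly on F2.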