Early prediction of SARS-CoV-2 reproductive number from environmental, atmospheric and mobility data: A supervised machine learning approach. (June 2022)
- Record Type:
- Journal Article
- Title:
- Early prediction of SARS-CoV-2 reproductive number from environmental, atmospheric and mobility data: A supervised machine learning approach. (June 2022)
- Main Title:
- Early prediction of SARS-CoV-2 reproductive number from environmental, atmospheric and mobility data: A supervised machine learning approach
- Authors:
- Caruso, Pier Francesco
Angelotti, Giovanni
Greco, Massimiliano
Guzzetta, Giorgio
Cereda, Danilo
Merler, Stefano
Cecconi, Maurizio
- Abstract:
- Highlights: Machine learning could reduce the delay in calculating reproduction number (Rt). Mobility and environmental conditions correlate with Rt values of SARS-CoV-2. Gradient boosting technique proposed solid predictions of SARS-CoV-2 diffusion.
Abstract:
Introduction: SARS-CoV-2 was declared a pandemic by the WHO on March 11th, 2020. Public protective measures were enforced in every country to limit the diffusion of SARS-CoV-2. Its transmission, mainly by droplets, has been measured by the effective reproduction number (Rt) that counts the number of secondary cases caused in a population by an average infectious individual at time t. Current strategies to calculate Rt reflect the number of secondary cases after several days, due to a delay from symptom onset to reporting. We propose a complementary Rt estimation using supervised machine learning techniques to predict short term variations with more timely results.
Material and methods: Our primary goal was to predict Rt of the current day in the twelve provinces of Lombardy with the highest possible accuracy, and with no influence of the local testing strategies. We gathered data about mobility, weather, and pollution from different public sources as a proxy of human behavior and public health measures. We built four supervised machine learning algorithms with different strategies: the outcome variable was the daily median Rt values per province obtained from officially adopted algorithms.
Results: Data from 243 days for every province were presented to our four models (from February 15th, 2020, to October 14th, 2020). Two models using differential calculation of Rt instead of the raw values showed the highest mean coefficient of determination (0.93 for both) and residuals reported the lowest mean error (-0.03 and 0.01) and standard deviation (0.13 for both) as well. The one with access to the value of Rt of the day before heavily relied on that feature for prediction, while the other one had more distributed weights.
Discussion: The model that did not have access to the Rt value of the previous day and used Rt differential value as outcome (FDRt) was considered the most robust according to the metrics. Its forecasts were able to predict the trend that Rt values would have developed over different weeks, but it was not particularly accurate in predicting the precise value of Rt. A correlation among mobility, atmospheric features, pollution and Rt values is plausible, but further testing should be performed.
- Is Part Of:
- International journal of medical informatics. Volume 162(2022)
- Journal:
- International journal of medical informatics
- Issue:
- Volume 162(2022)
- Issue Display:
- Volume 162, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 162
- Issue:
- 2022
- Issue Sort Value:
- 2022-0162-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- COVID-19 -- Machine learning -- Rt prediction -- Epidemiology -- Mobility data -- Environmental data -- Data science
Medical informatics -- Periodicals
Information science -- Periodicals
Computers -- Periodicals
Medical technology -- Periodicals
Medical Informatics -- Periodicals
Technology, Medical -- Periodicals
Computers
Information science
Medical informatics
Medical technology
Electronic journals
Periodicals
Electronic journals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/13865056
http://www.clinicalkey.com/dura/browse/journalIssue/13865056
http://www.clinicalkey.com.au/dura/browse/journalIssue/13865056
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ijmedinf.2022.104755
- Languages:
- English
- ISSNs:
- 1386-5056
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4542.345250
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22188.xml