An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution. (September 2019)
- Record Type:
- Journal Article
- Title:
- An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution. (September 2019)
- Main Title:
- An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution
- Authors:
- Di, Qian
Amini, Heresh
Shi, Liuhua
Kloog, Itai
Silvern, Rachel
Kelly, James
Sabath, M. Benjamin
Choirat, Christine
Koutrakis, Petros
Lyapustin, Alexei
Wang, Yujie
Mickley, Loretta J.
Schwartz, Joel
- Abstract:
- Various approaches have been proposed to model PM2.5 in the recent decade, with satellite-derived aerosol optical depth, land-use variables, chemical transport model predictions, and several meteorological variables as major predictor variables. Our study used an ensemble model that integrated multiple machine learning algorithms and predictor variables to estimate daily PM2.5 at a resolution of 1 km × 1 km across the contiguous United States. We used a generalized additive model that accounted for geographic difference to combine PM2.5 estimates from neural network, random forest, and gradient boosting. The three machine learning algorithms were based on multiple predictor variables, including satellite data, meteorological variables, land-use variables, elevation, chemical transport model predictions, several reanalysis datasets, and others. The model training results from 2000 to 2015 indicated good model performance with a 10-fold cross-validated R² of 0.86 for daily PM2.5 predictions. For annual PM2.5 estimates, the cross-validated R² was 0.89. Our model demonstrated good performance up to 60 μg/m³. Using trained PM2.5 model and predictor variables, we predicted daily PM2.5 from 2000 to 2015 at every 1 km × 1 km grid cell in the contiguous United States. We also used localized land-use variables within 1 km × 1 km grids to downscale PM2.5 predictions to 100 m × 100 m grid cells. To characterize uncertainty, we used meteorological variables, land-use variables, and elevation to model the monthly standard deviation of the difference between daily monitored and predicted PM2.5 for every 1 km × 1 km grid cell. This PM2.5 prediction dataset, including the downscaled and uncertainty predictions, allows epidemiologists to accurately estimate the adverse health effect of PM2.5. Compared with model performance of individual base learners, an ensemble model would achieve a better overall estimation. It is worth exploring other ensemble model formats to synthesize estimations from different models or from different groups to improve overall performance.
Highlights: An ensemble model integrates three machine learning algorithms and estimates PM2.5. Satellite measurements, land-use terms, and many variables were predictors. Model predicts daily PM2.5 at 1 km × 1 km grid cells in the entire United States. Model predictions were downscaled to 100 m × 100 m level. Monthly uncertainty level of prediction was also estimated.
- Is Part Of:
- Environment international. Volume 130(2019)
- Journal:
- Environment international
- Issue:
- Volume 130(2019)
- Issue Display:
- Volume 130, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 130
- Issue:
- 2019
- Issue Sort Value:
- 2019-0130-2019-0000
- Page Start:
- Page End:
- Publication Date:
- 2019-09
- Subjects:
- Fine particulate matter (PM2.5) -- Ensemble model -- Neural network -- Gradient boosting -- Random forest
Environmental protection -- Periodicals
Environmental health -- Periodicals
Environmental monitoring -- Periodicals
Environmental Monitoring -- Periodicals
Environnement -- Protection -- Périodiques
Hygiène du milieu -- Périodiques
Environnement -- Surveillance -- Périodiques
Environmental health
Environmental monitoring
Environmental protection
Periodicals
333.705
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01604120
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.envint.2019.104909
- Languages:
- English
- ISSNs:
- 0160-4120
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3791.330000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 16293.xml