Concurrent spatiotemporal daily land use regression modeling and missing data imputation of fine particulate matter using distributed space-time expectation maximization. (1st March 2020)
- Record Type:
- Journal Article
- Title:
- Concurrent spatiotemporal daily land use regression modeling and missing data imputation of fine particulate matter using distributed space-time expectation maximization. (1st March 2020)
- Main Title:
- Concurrent spatiotemporal daily land use regression modeling and missing data imputation of fine particulate matter using distributed space-time expectation maximization
- Authors:
- Taghavi-Shahri, Seyed Mahmood
Fassò, Alessandro
Mahaki, Behzad
Amini, Heresh
- Abstract:
- In this study, a spatiotemporal land use regression (LUR) model using distributed space-time expectation maximization (D-STEM) software was developed. We trained the model using daily mean ambient particulate matter ≤2.5 μm (PM2.5) data measured hourly in 2015 at 30 regulatory monitoring network stations within the megacity of Tehran, Iran. Since a substantial amount of measured data were missing (48% of the total number of daily PM2.5 observations), we used D-STEM to impute missing data and compared the imputation performance between different fitted models and the mean substitution method. We used the h-block cross-validation (h-block CV) method to account for spatial autocorrelation in model building and validation. In the imputation of missing data, the D-STEM LUR model had a mean absolute percentage error (MAPE) of 25.3%, outperforming the mean substitution method, which resulted in a MAPE of 28.3%. The spatiotemporal R-squared was 0.73 and the average CV R-squared of 2-block and 5-block cross-validations was 0.60. These values were 0.68 and 0.47 when the spatial aspect of the LUR model was assessed, and 0.995 and 0.992 when the temporal aspect of the LUR model was assessed. This study demonstrated the competence of D-STEM software in spatiotemporal modeling, missing data imputation, and mapping of daily ambient PM2.5 at a very high spatial resolution (20 m × 20 m). These estimations are available for future research, especially for epidemiological studies on short- and/or long-term health effects of ambient PM2.5. Generally, we found D-STEM to be a promising tool for spatiotemporal LUR modeling of ambient air pollution, especially for those models that rely on regulatory network monitoring stations with a considerable amount of missing data.
Highlights:
Developed daily land use regression models in Tehran, Iran using D-STEM.
Assessed D-STEM capabilities in modeling, mapping, and missing data imputation.
H-block CV, which adjusts for spatial autocorrelation, was used for model validation.
Mapped daily PM2.5 exposures in an area of ~613 km² at 20 m spatial resolution.
Daily predicted PM2.5 was higher than the WHO 24-h guideline in 70% of all days.
- Is Part Of:
- Atmospheric environment. Volume 224(2020)
- Journal:
- Atmospheric environment
- Issue:
- Volume 224(2020)
- Issue Display:
- Volume 224, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 224
- Issue:
- 2020
- Issue Sort Value:
- 2020-0224-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-03-01
- Subjects:
- Air pollution -- Exposure assessment -- LUR -- Missing data -- D-STEM
Air -- Pollution -- Periodicals
Air -- Pollution -- Meteorological aspects -- Periodicals
551.51
- Journal URLs:
- http://www.sciencedirect.com/web-editions/journal/13522310
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.atmosenv.2019.117202
- Languages:
- English
- ISSNs:
- 1352-2310
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 1767.120000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 13454.xml
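
Note on the imputation metric: the abstract compares imputation methods by mean absolute percentage error (MAPE) against a mean substitution baseline. The following is a minimal Python sketch of how such a comparison could be scored, not the authors' D-STEM workflow. The function names, the use of a per-station mean for substitution, and the sample values are all assumptions made for illustration; none of the numbers come from the study.

```python
import numpy as np


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((observed - predicted) / observed))


def mean_substitution(series):
    """Fill missing values (NaN) with the mean of the non-missing values.

    Assumption: the mean is taken over a single station's series; the
    record does not say which mean the study's baseline used.
    """
    series = np.asarray(series, dtype=float)
    filled = series.copy()
    filled[np.isnan(series)] = np.nanmean(series)
    return filled


# Hypothetical daily PM2.5 values (ug/m3) for one station; illustrative only.
truth = np.array([30.0, 40.0, 50.0, 20.0, 60.0])

# Hide two known days to mimic missing data, impute them, then score the
# imputed values against the values that were hidden.
held_out = np.array([False, True, False, False, True])
masked = truth.copy()
masked[held_out] = np.nan

imputed = mean_substitution(masked)
score = mape(truth[held_out], imputed[held_out])
print(f"MAPE of mean substitution on held-out days: {score:.1f}%")
```

A model-based imputation could be scored the same way by passing its predictions for the held-out days to `mape` in place of the mean-substituted values.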