Improving accuracy of air pollution exposure measurements: Statistical correction of a municipal low-cost airborne particulate matter sensor network. (1st January 2021)
- Record Type:
- Journal Article
- Title:
- Improving accuracy of air pollution exposure measurements: Statistical correction of a municipal low-cost airborne particulate matter sensor network. (1st January 2021)
- Main Title:
- Improving accuracy of air pollution exposure measurements: Statistical correction of a municipal low-cost airborne particulate matter sensor network
- Authors:
- Considine, Ellen M.
Reid, Colleen E.
Ogletree, Michael R.
Dye, Timothy
- Abstract:
- Low-cost air quality sensors can help increase spatial and temporal resolution of air pollution exposure measurements. These sensors, however, most often produce data of lower accuracy than higher-end instruments. In this study, we investigated linear and random forest models to correct PM2.5 measurements from the Denver Department of Public Health and Environment (DDPHE)'s network of low-cost sensors against measurements from co-located U.S. Environmental Protection Agency Federal Equivalence Method (FEM) monitors. Our training set included data from five DDPHE sensors from August 2018 through May 2019. Our testing set included data from two newly deployed DDPHE sensors from September 2019 through mid-December 2019. In addition to PM2.5, temperature, and relative humidity from the low-cost sensors, we explored using additional temporal and spatial variables to capture unexplained variability in sensor measurements. We evaluated results using spatial and temporal cross-validation techniques. For the long-term dataset, a random forest model with all time-varying covariates and length of arterial roads within 500 m was the most accurate (testing RMSE = 2.9 μg/m³ and R² = 0.75; leave-one-location-out (LOLO)-validation metrics on the training set: RMSE = 2.2 μg/m³ and R² = 0.93). For on-the-fly correction, we found that a multiple linear regression model using the past eight weeks of low-cost sensor PM2.5, temperature, and humidity data plus a near-highway indicator predicted each new week of data best (testing RMSE = 3.1 μg/m³ and R² = 0.78; LOLO-validation metrics on the training set: RMSE = 2.3 μg/m³ and R² = 0.90). The statistical methods detailed here will be used to correct low-cost sensor measurements to better understand PM2.5 pollution within the city of Denver. This work can also guide similar implementations in other municipalities by highlighting the improved accuracy from inclusion of variables other than temperature and relative humidity to improve accuracy of low-cost sensor PM2.5 data.
Highlights:
Random forest regression worked best for calibrating/correcting archived PM2.5 sensor data.
Timespan-optimized linear regression worked best for correcting data on-the-fly.
Road/traffic variables improved both types of calibration/correction.
Spatial cross-validation techniques help evaluate models.
Our models will be used for real-time air pollution communication in Denver.
- Is Part Of:
- Environmental pollution. Volume 268(2021)Part B
- Journal:
- Environmental pollution
- Issue:
- Volume 268(2021)Part B
- Issue Display:
- Volume 268, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 268
- Issue:
- 2021
- Issue Sort Value:
- 2021-0268-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-01-01
- Subjects:
- Air pollution -- Low-cost sensor -- Machine learning -- On-the-fly calibration -- Plantower sensor -- Cross-validation
Pollution -- Periodicals
Pollution -- Environmental aspects -- Periodicals
Environmental Pollution -- Periodicals
Pollution -- Périodiques
Pollution -- Aspect de l'environnement -- Périodiques
Pollution -- Effets physiologiques -- Périodiques
Pollution
Pollution -- Environmental aspects
Periodicals
Electronic journals
363.73
- Journal URLs:
- http://www.sciencedirect.com/science/journal/02697491
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.envpol.2020.115833
- Languages:
- English
- ISSNs:
- 0269-7491
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3791.539000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15190.xml