Evaluation of white-box versus black-box machine learning models in estimating ambient black carbon concentration. (February 2021)
- Record Type:
- Journal Article
- Title:
- Evaluation of white-box versus black-box machine learning models in estimating ambient black carbon concentration. (February 2021)
- Main Title:
- Evaluation of white-box versus black-box machine learning models in estimating ambient black carbon concentration
- Authors:
- Fung, Pak L.
Zaidan, Martha A.
Timonen, Hilkka
Niemi, Jarkko V.
Kousa, Anu
Kuula, Joel
Luoma, Krista
Tarkoma, Sasu
Petäjä, Tuukka
Kulmala, Markku
Hussein, Tareq
- Abstract:
- Air quality prediction with black-box (BB) modelling is gaining widespread interest in research and industry. Such data-driven models generally perform better in terms of accuracy, but they are limited in their ability to capture physical, chemical and meteorological processes and therefore offer less accountability for interpretation. In this paper, we evaluated different white-box (WB) and BB methods that estimate atmospheric black carbon (BC) concentration from a suite of observations at the same measurement site. The study uses data from 1 January 2017 to 31 December 2018 from two measurement sites in Helsinki, Finland: a street canyon site in Mäkelänkatu and an urban background site in Kumpula. At the street canyon site, the WB models (R² = 0.81–0.87) performed similarly to the BB models (R² = 0.86–0.87). The overall performance of the BC concentration estimation methods at the urban background site was much worse, probably because of a combination of smaller dynamic variability in the BC values and longer data gaps. However, the difference between the WB (R² = 0.44–0.60) and BB models (R² = 0.41–0.64) was not significant. Furthermore, the WB models are closer to physics-based models, and it is easier to spot the relative importance of each predictor variable and to determine whether the model output makes sense. This feature outweighs the slightly higher performance of some individual BB models, and the WB models are inherently a better choice because of the transparency of their model architecture. Among all the WB models, IAP and LASSO are recommended for their flexibility and efficiency, respectively. Our findings also confirm the importance of temporal properties in statistical modelling. In the future, the developed BC estimation model could serve as a virtual sensor and complement current air quality monitoring.
Main findings: White-box models are preferred over black-box models in estimating black carbon because they are closer to physics-based models, and it is easier to spot the relative importance of each predictor variable. The black carbon model could serve as a virtual sensor integrated into the air quality network alongside real measurements, so as to complement the current air quality index.
Highlights: Reliable black carbon models give additional insight for advancing the air quality index. Models at the street canyon site perform better than those at the urban background site. No significant difference between white-box and black-box models in estimating BC. White-box models are preferred due to the transparency of their model architecture. The developed white-box BC estimation model could serve as a virtual sensor.
- Is Part Of:
- Journal of aerosol science. Volume 152(2021)
- Journal:
- Journal of aerosol science
- Issue:
- Volume 152(2021)
- Issue Display:
- Volume 152, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 152
- Issue:
- 2021
- Issue Sort Value:
- 2021-0152-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-02
- Subjects:
- Air quality -- Statistical modelling -- Random forest -- Neural network -- Support vector -- Input-adaptive proxy
Aerosols -- Periodicals
Aérosols -- Périodiques
541.34515
- Journal URLs:
- http://www.journals.elsevier.com/journal-of-aerosol-science/
http://www.sciencedirect.com/science/journal/00218502
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.jaerosci.2020.105694
- Languages:
- English
- ISSNs:
- 0021-8502
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4919.060000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15322.xml