Transfer learning for long-interval consecutive missing values imputation without external features in air pollution time series. (April 2020)
- Record Type:
- Journal Article
- Title:
- Transfer learning for long-interval consecutive missing values imputation without external features in air pollution time series. (April 2020)
- Main Title:
- Transfer learning for long-interval consecutive missing values imputation without external features in air pollution time series
- Authors:
- Ma, Jun
Cheng, Jack C.P.
Ding, Yuexiong
Lin, Changqing
Jiang, Feifeng
Wang, Mingzhu
Zhai, Chong
- Abstract:
- Highlights: A new transferred LSTM-based iterative estimation model is proposed. Long-interval consecutive missing air pollution values are filled. The model shows good performance on different missing rates and missing positions. Compared with past methods, the model has 25–50% higher imputation accuracy.
Abstract: Air pollution has become one of the world's largest health and environmental problems. Studies focusing on air quality prediction, influential factors analysis, and control policy evaluation are increasing. When conducting these studies, valid and high-quality air pollution data are necessarily required to generate reasonable results. Missing data, which is frequently contained in the collected raw data, therefore, has become a significant barrier. Existing methods on missing data either cannot effectively capture the temporal and spatial mechanism of air pollution or focus on sequences with low missing rates and random missing positions. To address this problem, this paper proposes a new imputation methodology, namely transferred long short-term memory-based iterative estimation (TLSTM-IE), to impute consecutive missing values with large missing rates. A case study is conducted in New York City to verify the effectiveness and priority of the proposed methodology. Long-interval consecutive missing PM2.5 concentration data are filled. Experimental results show that the proposed model can effectively learn from long-term dependencies and transfer the learned knowledge. The imputation accuracy of the TLSTM-IE model is 25–50% higher than other commonly seen methods. The novelty of this study lies in two aspects. First is that we target long-interval consecutive missing data, which has not been addressed before by existing studies in atmospheric research. Second is the novel application of transfer learning on missing values imputation. To the best of our knowledge, no research on air quality has implemented this technique on this problem before.
- Is Part Of:
- Advanced engineering informatics. Volume 44(2020)
- Journal:
- Advanced engineering informatics
- Issue:
- Volume 44(2020)
- Issue Display:
- Volume 44, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 44
- Issue:
- 2020
- Issue Sort Value:
- 2020-0044-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-04
- Subjects:
- Air quality -- Deep learning -- Long-interval consecutive missing values -- Long short-term memory (LSTM) -- Neural network -- Transfer learning
Computer-aided engineering -- Periodicals
Engineering -- Data processing -- Periodicals
620.00285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14740346
http://books.google.com/books?id=KhFVAAAAMAAJ
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.aei.2020.101092
- Languages:
- English
- ISSNs:
- 1474-0346
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 0696.851100
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 13459.xml