Time series cluster kernels to exploit informative missingness and incomplete label information. (July 2021)
- Record Type:
- Journal Article
- Title:
- Time series cluster kernels to exploit informative missingness and incomplete label information. (July 2021)
- Main Title:
- Time series cluster kernels to exploit informative missingness and incomplete label information
- Authors:
- Mikalsen, Karl Øyvind
Soguero-Ruiz, Cristina
Bianchi, Filippo Maria
Revhaug, Arthur
Jenssen, Robert
- Abstract:
- Highlights: Two novel kernels for multivariate time series with missing data are proposed. Informative missingness is exploited using mixed mode Bayesian mixture models. We exploit incomplete label information using ideas from information theory. A case study of patients suffering from infectious postoperative complications.
Abstract: The time series cluster kernel (TCK) provides a powerful tool for analysing multivariate time series subject to missing data. TCK is designed using an ensemble learning approach in which Bayesian mixture models form the base models. Because of the Bayesian approach, TCK can naturally deal with missing values without resorting to imputation, and the ensemble strategy ensures robustness to hyperparameters, making it particularly well suited for unsupervised learning. However, TCK assumes missing at random and that the underlying missingness mechanism is ignorable, i.e. uninformative, an assumption that does not hold in many real-world applications, such as medicine. To overcome this limitation, we present a kernel capable of exploiting the potentially rich information in the missing values and patterns, as well as the information from the observed data. In our approach, we create a representation of the missing pattern, which is incorporated into mixed mode mixture models in such a way that the information provided by the missing patterns is effectively exploited. Moreover, we also propose a semi-supervised kernel, capable of taking advantage of incomplete label information to learn more accurate similarities. Experiments on benchmark data, as well as a real-world case study of patients described by longitudinal electronic health record data who potentially suffer from hospital-acquired infections, demonstrate the effectiveness of the proposed methods.
- Is Part Of:
- Pattern recognition. Volume 115(2021)
- Journal:
- Pattern recognition
- Issue:
- Volume 115(2021)
- Issue Display:
- Volume 115, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 115
- Issue:
- 2021
- Issue Sort Value:
- 2021-0115-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-07
- Subjects:
- Multivariate time series -- Kernel methods -- Missing data -- Informative missingness -- Semi-supervised learning
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2021.107896
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 16278.xml