Tensor learning of pointwise mutual information from EHR data for early prediction of sepsis. (July 2021)
- Record Type:
- Journal Article
- Title:
- Tensor learning of pointwise mutual information from EHR data for early prediction of sepsis. (July 2021)
- Main Title:
- Tensor learning of pointwise mutual information from EHR data for early prediction of sepsis
- Authors:
- Nesaragi, Naimahmed
Patidar, Shivnarayan
Aggarwal, Vaneet
- Abstract:
- Abstract: Early detection of sepsis can facilitate early clinical intervention with effective treatment and may reduce sepsis mortality rates. In view of this, machine learning-based automated diagnosis of sepsis using easily recordable physiological data can be more promising than the gold-standard rule-based clinical criteria in current practice.
This study aims to develop a machine learning framework that quantifies heterogeneity within tabular electronic health record (EHR) data of clinical covariates, capturing both linear relationships and nonlinear correlations, for the early prediction of sepsis. Here, the statistics of pairwise association for each hour-covariate pair within the EHR data, for every 6-hour window with 24 selected covariates, are described using a pointwise mutual information (PMI) matrix. This matrix represents the heterogeneity of the data as a two-dimensional map. Such matrices are fused horizontally along the z-axis as vertical slices in the xy plane to form a 3-way tensor for each record with the corresponding Length of Stay (L). Tensor factorization of the fused tensor for every record is performed using Tucker decomposition, and only the core tensors are retained, excluding the 3 unitary matrices, to provide the latent feature set for the prediction of sepsis onset. A five-fold cross-validation scheme is employed wherein the 120 latent features obtained from the reshaped core tensor are fed to Light Gradient Boosting Machine (LightGBM) models for binary classification, further alleviating the class imbalance involved. The machine-learning framework is designed via Bayesian optimization, yielding an average normalized utility score of 0.4519, as defined by the challenge organizers, and an area under the receiver operating characteristic curve (AUROC) of 0.8621 on the publicly available PhysioNet/Computing in Cardiology Challenge 2019 training data. The proposed tensor decomposition of the 3-way fused tensor formulated using PMI matrices leverages higher-order temporal interactions between the pairwise associations among the clinical values for early prediction of sepsis. This is validated by improved risk prediction power for every hour of admission to the ICU in terms of utility score, AUROC, and F1 score.
The results show a significant improvement, particularly in utility score (~1.5–2%), under a 5-fold cross-validation scheme on the entire training data compared with a top-entrant research study that participated in the challenge.
Highlights:
This work applies mutual information tensor analysis on EHR data for sepsis detection.
Tensor factorization deciphers higher-order latent information serving as embeddings.
PMI uncovers temporal interactions by capturing linear and non-linear dependencies.
The study excludes Length of Stay and other demographics that may bias the algorithms.
Substantial improvement of ~5% in overall utility score as compared to previous work.
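The pipeline described in the abstract (PMI matrix per window, fusion into a 3-way tensor, Tucker core as features) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. It assumes each 6-hour window is a non-negative 6 x 24 array treated like a co-occurrence table, it uses a truncated higher-order SVD as one simple way to obtain a Tucker core, and the function names, the synthetic data, and the ranks (5, 8, 3) are hypothetical. The ranks are chosen only so that the flattened core has 120 entries; the abstract does not state the ranks actually used.

```python
import numpy as np


def pmi_matrix(window, eps=1e-12):
    """PMI of each (hour, covariate) cell of a non-negative 2-D array.

    Assumption: the window is treated like a co-occurrence table, so
    p(i, j) = x_ij / sum(x) and p(i), p(j) are its row/column marginals.
    """
    window = np.asarray(window, dtype=float)
    joint = window / window.sum()
    p_row = joint.sum(axis=1, keepdims=True)
    p_col = joint.sum(axis=0, keepdims=True)
    return np.log((joint + eps) / (p_row * p_col + eps))


def unfold(tensor, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` (shape: new x old)."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)


def tucker_core(tensor, ranks):
    """Core tensor and factor matrices from a truncated HOSVD."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project onto leading subspace
    return core, factors


# Hypothetical record: 10 windows, each 6 hours x 24 covariates, all positive.
rng = np.random.default_rng(0)
windows = rng.uniform(0.1, 1.0, size=(10, 6, 24))

# Fuse the per-window PMI matrices along the third axis: 6 x 24 x 10.
tensor = np.stack([pmi_matrix(w) for w in windows], axis=2)

# Keep only the core tensor and flatten it into a feature vector.
core, factors = tucker_core(tensor, ranks=(5, 8, 3))
features = core.reshape(-1)
```

In a full workflow, one such feature vector per record (or per hour) would then be passed to a gradient-boosting classifier such as LightGBM; that step is omitted here because the abstract gives no hyperparameters.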
- Is Part Of:
- Computers in biology and medicine. Volume 134(2021)
- Journal:
- Computers in biology and medicine
- Issue:
- Volume 134(2021)
- Issue Display:
- Volume 134, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 134
- Issue:
- 2021
- Issue Sort Value:
- 2021-0134-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-07
- Subjects:
- Sepsis -- Machine learning -- Tensor factorization -- Pointwise mutual information -- Early prediction -- Model-based diagnosis -- Electronic health records -- Medical informatics
Medicine -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00104825/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiomed.2021.104430
- Languages:
- English
- ISSNs:
- 0010-4825
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.880000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 17458.xml