Mining electronic health records to identify predictive factors associated with hospital admission for Campylobacter infections. (November 2017)
- Record Type:
- Journal Article
- Title:
- Mining electronic health records to identify predictive factors associated with hospital admission for Campylobacter infections. (November 2017)
- Main Title:
- Mining electronic health records to identify predictive factors associated with hospital admission for Campylobacter infections
- Authors:
- Zhou, Shang-Ming
Muhammad, Rahman A
Sheppard, Samuel
Howe, Robin
Lyons, Ronan A
Brophy, Sinead
- Abstract:
- Background: Campylobacter infection is one of the most common bacterial infections in human beings. Most cases of Campylobacter infection are not well explained by commonly recognised risk factors. Data-driven clinical rules offer patients an unprecedented measure of control over their own health information. This study sought to identify influential predictors from general practice (GP) data to predict the outcomes of Campylobacter infections (ie, being admitted to hospital or remaining with GP care).
Methods: Diagnoses of Campylobacter infection were collected from GP and hospital data in Wales between Jan 1, 1990, and Dec 31, 2015, via the Secure Anonymised Information Linkage databank. A specially designed feature selection scheme was used to select the most influential factors from a large number of initial variables formed by GP Read codes (a hierarchical thesaurus of clinical terms) and from other codes (sex, age, location, and Townsend deprivation scores). Information retrieval and decision tree techniques were used to construct predictive models. The linked data were split into training, validation, and testing subsets. The testing dataset was used to assess the predictive performance of the identified risk factors.
Findings: Health records of 20 376 patients with Campylobacter infections were used, of whom 12 324 were admitted to hospital at least once. Higher rates of infections occurred in male patients (10 793/20 376), but female patients had higher rates of being admitted to hospital (6207 of 9583 female patients). From an initial 33 471 variables, the 31 most predictive variables were identified and aggregated, such as "second pneumococcal conjugated vaccination" and "stool culture". A decision tree generated 30 if–then clinical rules from these 31 variables (eg, if patients had "second pneumococcal conjugated vaccination", then "not admitted"). These rules were applied to an independent test dataset and predicted the outcomes significantly above chance (sensitivity 0·642, 95% CI 0·638–0·644; specificity 0·62, 0·616–0·622; positive predictive value 0·724, 0·72–0·73).
Interpretation: Our findings suggest that machine learning models constructed from routine GP data can identify influential risk factors that predict outcomes of Campylobacter infections significantly above chance. To address some unexplained model variations, more information about the infection and symptoms (eg, strain, severity) needs to be incorporated.
Funding: National Centre for Population Health and Wellbeing Research (CA02), Centre for Improvement in Population Health through E-records Research (CIPHER) within the Farr Institute of Health Informatics Research (MR/K006525/1).
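The counts and performance measures quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that every patient is recorded as either male or female (so the female and male-admitted counts follow by subtraction), and the `ConfusionCounts` class and the metric helper names are hypothetical, introduced only to show the standard definitions of sensitivity, specificity, and positive predictive value.

```python
from dataclasses import dataclass

# Counts reported in the abstract.
TOTAL_PATIENTS = 20_376
MALE_PATIENTS = 10_793
ADMITTED_PATIENTS = 12_324
FEMALE_ADMITTED = 6_207

# Assumption: every patient is recorded as either male or female,
# so the remaining groups follow by subtraction.
FEMALE_PATIENTS = TOTAL_PATIENTS - MALE_PATIENTS
MALE_ADMITTED = ADMITTED_PATIENTS - FEMALE_ADMITTED

# Share of each sex admitted to hospital at least once.
female_admission_rate = FEMALE_ADMITTED / FEMALE_PATIENTS
male_admission_rate = MALE_ADMITTED / MALE_PATIENTS


@dataclass
class ConfusionCounts:
    """Hypothetical container for test-set outcomes ("admitted" = positive)."""
    tp: int  # predicted admitted, was admitted
    fp: int  # predicted admitted, stayed with GP care
    tn: int  # predicted not admitted, stayed with GP care
    fn: int  # predicted not admitted, was admitted


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of admitted patients that the rules flag as admitted."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-admitted patients that the rules flag as not admitted."""
    return c.tn / (c.tn + c.fp)


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of "admitted" predictions that are correct."""
    return c.tp / (c.tp + c.fp)


if __name__ == "__main__":
    print(f"female admission rate: {female_admission_rate:.3f}")
    print(f"male admission rate:   {male_admission_rate:.3f}")
```

Under the stated assumption, the female admission rate comes out at roughly 65% and the male rate at roughly 57%, which matches the abstract's statement that female patients were admitted at a higher rate. The abstract does not report the underlying confusion counts, so the metric helpers are shown without reproducing the published values.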
- Is Part Of:
- Lancet. Volume 390(2017)Supplement 3
- Journal:
- Lancet
- Issue:
- Volume 390(2017)Supplement 3
- Issue Display:
- Volume 390, Issue 3 (2017)
- Year:
- 2017
- Volume:
- 390
- Issue:
- 3
- Issue Sort Value:
- 2017-0390-0003-0000
- Page Start:
- S99
- Page End:
- Publication Date:
- 2017-11
- Subjects:
- Medicine -- Periodicals
Medicine
Electronic journals
Periodicals
610.5
- Journal URLs:
- http://www.thelancet.com/
http://www.sciencedirect.com/science/journal/01406736
http://www.elsevier.com/journals
- DOI:
- 10.1016/S0140-6736(17)33034-9
- Languages:
- English
- ISSNs:
- 0140-6736
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 5146.000000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5388.xml