SemEHR: surfacing semantic data from clinical notes in electronic health records for tailored care, trial recruitment, and clinical research. (November 2017)
- Record Type:
- Journal Article
- Title:
- SemEHR: surfacing semantic data from clinical notes in electronic health records for tailored care, trial recruitment, and clinical research. (November 2017)
- Main Title:
- SemEHR: surfacing semantic data from clinical notes in electronic health records for tailored care, trial recruitment, and clinical research
- Authors:
- Wu, Honghan
Toti, Giulia
Morley, Katherine I
Ibrahim, Zina
Folarin, Amos
Kartoglu, Ismail
Jackson, Richard
Agrawal, Asha
Stringer, Clive
Gale, Darren
Gorrell, Genevieve M
Roberts, Angus
Broadbent, Matthew
Stewart, Robert
Dobson, Richard J B
- Abstract:
- Background: Deriving structured data from unstructured clinical notes in electronic health records (EHRs) requires natural language processing and clinical expertise, which is often costly and frequently a one-off investment. We implemented SemEHR, a semantic search system that reduces the expertise and effort required in this context. We aimed to use it to characterise and select patients for projects such as the UK Department of Health 100,000 Genomes Project.
Methods: Built upon the off-the-shelf toolkits Bio-YODIE and CogStack, SemEHR integrates heterogeneous EHR documents and identifies contextualised (negation, temporality, and experiencer) mentions of a wide range of biomedical concepts including SNOMED CT, ICD-10, LOINC, and Drug Ontology. Text mining and semantics techniques are incorporated to derive a longitudinal patient panorama, combining structured profiles and unstructured records, available through semantic search interfaces.
Findings: We deployed SemEHR in various UK hospital EHRs, including the South London and Maudsley NHS Foundation Trust, where 46 million concept mentions were identified from 18 million documents. In a liver disease study, SemEHR identified 94 of 100 manually annotated hepatitis C positive patients. In an HIV study, SemEHR identified 21 of 23 true positives in a 1000-patient cohort. At King's College Hospital, SemEHR is being used to recruit patients into the 100,000 Genomes Project, where ontological associations are integrated to match recruitment criteria and populate complex phenotype models. A preliminary evaluation suggests that the tool is able to validate previously submitted cases and is very fast in searching phenotypes.
Interpretation: Using SemEHR, a query such as "find patients with a family history of hepatitis C", which previously might have required the user to have natural language processing expertise, becomes a simple search, for which SemEHR retrieves a relevant patient cohort, populates patient-level summaries, and provides a link to each mention in the original source. Results and feedback from the multiple studies have proven its efficiency: work that previously took weeks or months can be done within minutes in some cases.
Funding: Medical Research Council (MC_PC_14089); NIHR Biomedical Research Centre for Mental Health, Biomedical Research Unit for Dementia; the European Union's Horizon 2020 (No 644753 KConnect); Wellcome Trust Seed Award in Science (109823/Z/15/Z); National Institute for Health Research University College London Hospital's Biomedical Research Centre; Arthritis Research UK; British Heart Foundation; Cancer Research UK; Chief Scientist Office; Economic and Social Research Council; Engineering and Physical Sciences Research Council; National Institute for Social Care and Health Research; Wellcome Trust (grant MR/K006584/1).
- Is Part Of:
- Lancet. Volume 390(2017)Supplement 3
- Journal:
- Lancet
- Issue:
- Volume 390(2017)Supplement 3
- Issue Display:
- Volume 390, Issue 3 (2017)
- Year:
- 2017
- Volume:
- 390
- Issue:
- 3
- Issue Sort Value:
- 2017-0390-0003-0000
- Page Start:
- S97
- Page End:
- Publication Date:
- 2017-11
- Subjects:
- Medicine -- Periodicals
Medicine
Electronic journals
Periodicals
610.5
- Journal URLs:
- http://www.thelancet.com/
http://www.sciencedirect.com/science/journal/01406736
http://www.elsevier.com/journals
- DOI:
- 10.1016/S0140-6736(17)33032-5
- Languages:
- English
- ISSNs:
- 0140-6736
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 5146.000000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5388.xml