Mining non-lattice subgraphs for detecting missing hierarchical relations and concepts in SNOMED CT. (19th February 2017)
- Record Type:
- Journal Article
- Title:
- Mining non-lattice subgraphs for detecting missing hierarchical relations and concepts in SNOMED CT. (19th February 2017)
- Main Title:
- Mining non-lattice subgraphs for detecting missing hierarchical relations and concepts in SNOMED CT
- Authors:
- Cui, Licong
Zhu, Wei
Tao, Shiqiang
Case, James T
Bodenreider, Olivier
Zhang, Guo-Qiang
- Abstract:
- Objective: Quality assurance of large ontological systems such as SNOMED CT is an indispensable part of the terminology management lifecycle. We introduce a hybrid structural-lexical method for scalable and systematic discovery of missing hierarchical relations and concepts in SNOMED CT. Material and Methods: All non-lattice subgraphs (the structural part) in SNOMED CT are exhaustively extracted using a scalable MapReduce algorithm. Four lexical patterns (the lexical part) are identified among the extracted non-lattice subgraphs. Non-lattice subgraphs exhibiting such lexical patterns are often indicative of missing hierarchical relations or concepts. Each lexical pattern is associated with a potential specific type of error. Results: Applying the structural-lexical method to SNOMED CT (September 2015 US edition), we found 6801 non-lattice subgraphs that matched these lexical patterns, of which 2046 were amenable to visual inspection. We evaluated a random sample of 100 small subgraphs, of which 59 were reviewed in detail by domain experts. All the subgraphs reviewed contained errors confirmed by the experts. The most frequent type of error was missing is-a relations due to incomplete or inconsistent modeling of the concepts. Conclusions: Our hybrid structural-lexical method is innovative and proved effective not only in detecting errors in SNOMED CT, but also in suggesting remediation for these errors.
- Is Part Of:
- Journal of the American Medical Informatics Association. Volume 24:Number 4(2017:Jul.)
- Journal:
- Journal of the American Medical Informatics Association
- Issue:
- Volume 24:Number 4(2017:Jul.)
- Issue Display:
- Volume 24, Issue 4 (2017)
- Year:
- 2017
- Volume:
- 24
- Issue:
- 4
- Issue Sort Value:
- 2017-0024-0004-0000
- Page Start:
- 788
- Page End:
- 798
- Publication Date:
- 2017-02-19
- Subjects:
- SNOMED CT -- ontology -- quality assurance -- non-lattice subgraph
Medical informatics -- Periodicals
Information Services -- Periodicals
Medical Informatics -- Periodicals
Médecine -- Informatique -- Périodiques
Informatica
Geneeskunde
Informatique médicale
Computer network resources
Electronic journals
610.285
- Journal URLs:
- http://jamia.bmj.com/
http://www.jamia.org
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=76
http://www.sciencedirect.com/science/journal/10675027
http://jamia.oxfordjournals.org/
http://www.oxfordjournals.org/en/
- DOI:
- 10.1093/jamia/ocw175
- Languages:
- English
- ISSNs:
- 1067-5027
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4689.025000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 15137.xml