Klarigi: Characteristic explanations for semantic biomedical data. (February 2023)
- Record Type:
- Journal Article
- Title:
- Klarigi: Characteristic explanations for semantic biomedical data. (February 2023)
- Main Title:
- Klarigi: Characteristic explanations for semantic biomedical data
- Authors:
- Slater, Luke T.
Williams, John A.
Schofield, Paul N.
Russell, Sophie
Pendleton, Samantha C.
Karwath, Andreas
Fanning, Hilary
Ball, Simon
Hoehndorf, Robert
Gkoutos, Georgios V.
- Abstract:
- Annotation of biomedical entities with ontology classes provides for formal semantic analysis and mobilisation of background knowledge in determining their relationships. To date, enrichment analysis has been routinely employed to identify classes that are over-represented in annotations across sets of groups, such as biosample gene expression profiles or patient phenotypes, and is useful for a range of tasks including differential diagnosis and causative variant prioritisation. These approaches, however, usually consider only univariate relationships, make limited use of the semantic features of ontologies, and provide limited information and evaluation of the explanatory power of both singular and grouped candidate classes. Moreover, they are not designed to solve the problem of deriving cohesive, characteristic, and discriminatory sets of classes for entity groups. We have developed a new tool, called Klarigi, which introduces multiple scoring heuristics for identification of classes that are both compositional and discriminatory for groups of entities annotated with ontology classes. The tool includes a novel algorithm for derivation of multivariable semantic explanations for entity groups, makes use of semantic inference through live use of an ontology reasoner, and includes a classification method for identifying the discriminatory power of candidate sets, in addition to significance testing apposite to traditional enrichment approaches. We describe the design and implementation of Klarigi, including its scoring and explanation determination methods, and evaluate its use in application to two test cases with clinical significance, comparing and contrasting methods and results with literature-based and enrichment analysis methods. We demonstrate that Klarigi produces characteristic and discriminatory explanations for groups of biomedical entities in two settings. We also show that these explanations recapitulate and extend the knowledge held in existing biomedical databases and literature for several diseases. We conclude that Klarigi provides a distinct and valuable perspective on biomedical datasets when compared with traditional enrichment methods, and therefore constitutes a new method by which biomedical datasets can be explored, contributing to improved insight into semantic data.
Highlights: Enrichment analysis is used to identify over-represented classes in semantic profiles. Enrichment does not identify discriminatory and characteristic explanations for groups. We solve these problems with multiple metrics and a modularisation approach. We describe and evaluate application to two use cases with clinical significance. We show that our approach provides a valuable perspective on biomedical data.
- Is Part Of:
- Computers in biology and medicine. Volume 153(2023)
- Journal:
- Computers in biology and medicine
- Issue:
- Volume 153(2023)
- Issue Display:
- Volume 153, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 153
- Issue:
- 2023
- Issue Sort Value:
- 2023-0153-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-02
- Subjects:
- Semantic explanation -- Ontology -- Semantic analysis -- Enrichment analysis -- Phenotypes -- Explicability -- Phenotype profiles
Medicine -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00104825/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiomed.2022.106425
- Languages:
- English
- ISSNs:
- 0010-4825
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.880000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 25099.xml