Annotation of phenotypes using ontologies: a gold standard for the training and evaluation of natural language processing systems. (14th December 2018)

Record Type:: Journal Article
Title:: Annotation of phenotypes using ontologies: a gold standard for the training and evaluation of natural language processing systems. (14th December 2018)
Main Title:: Annotation of phenotypes using ontologies: a gold standard for the training and evaluation of natural language processing systems
Authors:: Dahdul, Wasila
Manda, Prashanti
Cui, Hong
Balhoff, James P
Dececchi, T Alexander
Ibrahim, Nizar
Lapp, Hilmar
Vision, Todd
Mabee, Paula M
Abstract:: Natural language descriptions of organismal phenotypes, a principal object of study in biology, are abundant in the biological literature. Expressing these phenotypes as logical statements using ontologies would enable large-scale analysis on phenotypic information from diverse systems. However, considerable human effort is required to make these phenotype descriptions amenable to machine reasoning. Natural language processing tools have been developed to facilitate this task, and the training and evaluation of these tools depend on the availability of high quality, manually annotated gold standard data sets. We describe the development of an expert-curated gold standard data set of annotated phenotypes for evolutionary biology. The gold standard was developed for the curation of complex comparative phenotypes for the Phenoscape project. It was created by consensus among three curators and consists of entity–quality expressions of varying complexity. We use the gold standard to evaluate annotations created by human curators and those generated by the Semantic CharaParser tool. Using four annotation accuracy metrics that can account for any level of relationship between terms from two phenotype annotations, we found that machine–human consistency, or similarity, was significantly lower than inter-curator (human–human) consistency. Surprisingly, allowing curators access to external information did not significantly increase the similarity of their annotations to the …
Is Part Of:: Database. Volume 2018(2018)
Journal:: Database
Issue:: Volume 2018(2018)
Issue Display:: Volume 2018, Issue 2018 (2018)
Year:: 2018
Volume:: 2018
Issue:: 2018
Issue Sort Value:: 2018-2018-2018-0000
Page Start:
Page End:
Publication Date:: 2018-12-14
Subjects:: Biology -- Databases -- Periodicals
Bioinformatics -- Periodicals
570.285
Journal URLs:: http://database.oxfordjournals.org/
http://ukcatalogue.oup.com/
DOI:: 10.1093/database/bay110
Languages:: English
ISSNs:: 1758-0463
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 12284.xml