Automatic keyphrase annotation of scientific documents using Wikipedia and genetic algorithms. (June 2013)
- Record Type:
- Journal Article
- Title:
- Automatic keyphrase annotation of scientific documents using Wikipedia and genetic algorithms. (June 2013)
- Main Title:
- Automatic keyphrase annotation of scientific documents using Wikipedia and genetic algorithms
- Authors:
- Joorabchi, Arash
Mahdi, Abdulhussain E.
- Abstract:
- Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents to both human readers and information retrieval systems. This article describes a machine learning-based keyphrase annotation method for scientific documents that utilizes Wikipedia as a thesaurus for candidate selection from documents' content. We have devised a set of 20 statistical, positional and semantical features for candidate phrases to capture and reflect various properties of those candidates that have the highest keyphraseness probability. We first introduce a simple unsupervised method for ranking and filtering the most probable keyphrases, and then evolve it into a novel supervised method using genetic algorithms. We have evaluated the performance of both methods on a third-party dataset of research papers. Reported experimental results show that the performance of our proposed methods, measured in terms of consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised and unsupervised methods.
- Is Part Of:
- Journal of information science. Volume 39:Number 3(2013)
- Journal:
- Journal of information science
- Issue:
- Volume 39:Number 3(2013)
- Issue Display:
- Volume 39, Issue 3 (2013)
- Year:
- 2013
- Volume:
- 39
- Issue:
- 3
- Issue Sort Value:
- 2013-0039-0003-0000
- Page Start:
- 410
- Page End:
- 426
- Publication Date:
- 2013-06
- Subjects:
- genetic algorithms -- keyphrase annotation -- keyphrase indexing -- metadata generation -- scientific digital libraries -- subject metadata -- text mining -- Wikipedia
Information science -- Periodicals
Information science
Periodicals
020.5
- Journal URLs:
- http://jis.sagepub.com/archive/
http://www.ingenta.com/journals/browse/bks/jis?mode=direct
http://www.uk.sagepub.com/home.nav
http://firstsearch.oclc.org
http://firstsearch.oclc.org/journal=0165-5515;screen=info;ECOIP
- DOI:
- 10.1177/0165551512472138
- Languages:
- English
- ISSNs:
- 0165-5515
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 25785.xml