From French Wikipedia to Erudit: A test case for cross‐domain open information extraction. (30th June 2017)
- Record Type:
- Journal Article
- Title:
- From French Wikipedia to Erudit: A test case for cross‐domain open information extraction. (30th June 2017)
- Main Title:
- From French Wikipedia to Erudit: A test case for cross‐domain open information extraction
- Authors:
- Gotti, Fabrizio
Langlais, Philippe
- Abstract:
- In this paper, we describe an open information extraction pipeline based on ReVerb for extracting knowledge from French text. We put it to the test by using the information triples extracted to build an entity classifier, i.e., a system able to label a given instance with its type (for instance, Michel Foucault is a philosopher). The classifier requires little supervision. One novel aspect of this study is that we show how general domain information triples (extracted from French Wikipedia) can be used for deriving new knowledge from domain‐specific documents unrelated to Wikipedia, in our case scholarly articles focusing on the humanities. We believe that the present study is the first that focuses on such a cross‐domain, recall‐oriented approach in open information extraction. While our system's performance shows room for improvement, manual assessments show that the task is quite hard, even for a human, in part because of the cross‐domain aspect of the problem we tackle.
- Is Part Of:
- Computational intelligence. Volume 34:Number 2(2018)
- Journal:
- Computational intelligence
- Issue:
- Volume 34:Number 2(2018)
- Issue Display:
- Volume 34, Issue 2 (2018)
- Year:
- 2018
- Volume:
- 34
- Issue:
- 2
- Issue Sort Value:
- 2018-0034-0002-0000
- Page Start:
- 420
- Page End:
- 439
- Publication Date:
- 2017-06-30
- Subjects:
- entity classification -- named entities -- natural language processing -- open information extraction
Artificial intelligence -- Periodicals
Computational linguistics -- Periodicals
006.3
- Journal URLs:
- http://www.blackwellpublishing.com/journal.asp?ref=0824-7935&site=1
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/coin.12120
- Languages:
- English
- ISSNs:
- 0824-7935
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.595000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 6677.xml