A PDTB-styled end-to-end discourse parser. (April 2014)
- Record Type:
- Journal Article
- Title:
- A PDTB-styled end-to-end discourse parser. (April 2014)
- Main Title:
- A PDTB-styled end-to-end discourse parser
- Authors:
- LIN, ZIHENG
NG, HWEE TOU
KAN, MIN-YEN
- Abstract:
- Since the release of the large discourse-level annotation of the Penn Discourse Treebank (PDTB), research work has been carried out on certain subtasks of this annotation, such as disambiguating discourse connectives and classifying Explicit or Implicit relations. We see a need to construct a full parser on top of these subtasks and propose a way to evaluate the parser. In this work, we have designed and developed an end-to-end discourse parser to parse free texts in the PDTB style in a fully data-driven approach. The parser consists of multiple components joined in a sequential pipeline architecture, which includes a connective classifier, argument labeler, explicit classifier, non-explicit classifier, and attribution span labeler. Our trained parser first identifies all discourse and non-discourse relations, locates and labels their arguments, and then classifies the sense of the relation between each pair of arguments. For the identified relations, the parser also determines the attribution spans, if any, associated with them. We introduce novel approaches to locate and label arguments, and to identify attribution spans. We also significantly improve on the current state-of-the-art connective classifier. We propose and present a comprehensive evaluation from both component-wise and error-cascading perspectives, in which we illustrate how each component performs in isolation, as well as how the pipeline performs with errors propagated forward. The parser gives an overall system F1 score of 46.80 percent for partial matching utilizing gold standard parses, and 38.18 percent with full automation.
- Is Part Of:
- Natural language engineering. Volume 20:Part 2(2014)
- Journal:
- Natural language engineering
- Issue:
- Volume 20:Part 2(2014)
- Issue Display:
- Volume 20, Issue 2, Part 2 (2014)
- Year:
- 2014
- Volume:
- 20
- Issue:
- 2
- Part:
- 2
- Issue Sort Value:
- 2014-0020-0002-0002
- Page Start:
- 151
- Page End:
- 184
- Publication Date:
- 2014-04
- Subjects:
- Natural language processing (Computer science) -- Periodicals
Software engineering -- Periodicals
006.35
- Journal URLs:
- http://journals.cambridge.org/action/displayJournal?jid=NLE
- DOI:
- 10.1017/S1351324912000307
- Languages:
- English
- ISSNs:
- 1351-3249
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD Digital store
- Ingest File:
- 4229.xml