Automatic cohesive summarization with pronominal anaphora resolution. (November 2018)
- Record Type:
- Journal Article
- Title:
- Automatic cohesive summarization with pronominal anaphora resolution. (November 2018)
- Main Title:
- Automatic cohesive summarization with pronominal anaphora resolution
- Authors:
- Antunes, Jamilson
Lins, Rafael Dueire
Lima, Rinaldo
Oliveira, Hilário
Riss, Marcelo
Simske, Steven J.
- Abstract:
- Automatic Text Summarization is the process of creating a compressed representation of one or more related documents, keeping only the most valuable information. The extractive approach for summarization is the most studied and aims to generate a compressed version of a document by identifying, ranking, and selecting the most relevant sentences or phrases from a text. The selected sentences go verbatim into the summary. However, this strategy may yield incoherent summaries, as pronominal coreferences may appear unbound. To alleviate this problem, this paper proposes a method that solves unbound pronominal anaphoric expressions, automatically enabling the cohesiveness of the extractive summaries. The proposed method can be applied to two distinct scenarios. The first one aims to find and fix unbound anaphoric expressions present in the generated summaries at a post-processing stage; whereas the second one is performed at the preprocessing stage of the proposed pipeline and generates an intermediate version of the input document that resolves the unbound pronominal coreferences. The proposed solution was evaluated on the CNN news corpus using the seventeen summarization techniques most widely acknowledged in the literature and four state-of-the-art summarization systems. Moreover, it also provides a comparative evaluation concerning two distinct assessment scenarios which are compared to a baseline. The experiments performed achieved very encouraging quantitative and qualitative results.
- Is Part Of:
- Computer speech & language. Volume 52(2018)
- Journal:
- Computer speech & language
- Issue:
- Volume 52(2018)
- Issue Display:
- Volume 52, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 52
- Issue:
- 2018
- Issue Sort Value:
- 2018-0052-2018-0000
- Page Start:
- 141
- Page End:
- 164
- Publication Date:
- 2018-11
- Subjects:
- Automatic summarization -- Cohesive summarization -- Extractive summarization -- Anaphoric expressions
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2018.05.004
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 17055.xml