To what extent does content selection affect surface realization in the context of headline generation?. (May 2021)
- Record Type:
- Journal Article
- Title:
- To what extent does content selection affect surface realization in the context of headline generation?. (May 2021)
- Main Title:
- To what extent does content selection affect surface realization in the context of headline generation?
- Authors:
- Barros, Cristina
Vicente, Marta
Lloret, Elena
- Abstract:
- Highlights: An NLG approach is analyzed for the task of headline generation. Several content selection strategies are analyzed as the macroplanning stage. An adapted version of HanaNLG is used for the surface realization stage. Coherent and structured headlines not present in the source news are obtained. HanaNLG-PLM headlines were among the top preferred in the human evaluation.
Abstract: Headline generation is a task where the most important information of a news article is condensed and embodied into a single short sentence. This task is normally addressed by summarization techniques, ideally combining extractive and abstractive methods together with sentence compression or fusion techniques. Although Natural Language Generation (NLG) techniques have not been directly exploited for headline generation, they may provide better mechanisms than summarization techniques to paraphrase the information of a text. Therefore, this paper analyzes and evaluates the effectiveness of NLG techniques for generating headlines. In NLG, both content selection and surface realization are equally important: there is no point in generating text without knowing the topic. Considering this premise, we therefore take HanaNLG, a hybrid surface realization approach, as a basis, and we analyze the effect on the generated text when different content selection strategies are integrated at the macroplanning stage. The experiments conducted show that, despite not using any sophisticated summarization method, the proposed approach provided the following benefits: i) it generated a coherent, linguistically structured headline; ii) it obtained results on standard datasets (i.e., DUC 2003 and DUC 2004) that were comparable to several competitive systems, in terms of the content of the generated headline; and, iii) the headlines generated by the whole approach (PLM-HanaNLG) were preferred by human assessors compared to those generated by the best performing system in DUC 2003.
- Is Part Of:
- Computer speech & language. Volume 67(2021)
- Journal:
- Computer speech & language
- Issue:
- Volume 67(2021)
- Issue Display:
- Volume 67, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 67
- Issue:
- 2021
- Issue Sort Value:
- 2021-0067-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-05
- Subjects:
- Natural language generation -- Headline generation -- Positional language models -- Factored language models -- Content selection -- Abstractive summarization
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2020.101179
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21622.xml