ATSSC: Development of an approach based on soft computing for text summarization. (January 2017)
- Record Type:
- Journal Article
- Title:
- ATSSC: Development of an approach based on soft computing for text summarization. (January 2017)
- Main Title:
- ATSSC: Development of an approach based on soft computing for text summarization
- Authors:
- Tayal, Madhuri A.
Raghuwanshi, Mukesh M.
Malik, Latesh G.
- Abstract:
- Highlights: We developed the TIDA (Title Identification) Algorithm. It uses its own title identification mechanism using the web. The overall method has its own mechanism for sentence reduction and combination. We designed Semantic Word and Sentence Similarity mechanism for sentence reduction. We used Word disambiguation through parser mechanism. Tag based training as well as SVO rules were used for combining sentences.
Abstract: Natural Language Processing (NLP) is a field of computer science and linguistics concerned with the unique conversation between computers and human languages. It processes data through Lexical analysis, Syntax analysis, Semantic analysis, Discourse processing and Pragmatic analysis. An intelligent text summarization is one of the most challenging tasks in Natural language processing. It can be further used for applications like storytelling and question answering. This paper presents an automatic text summarizer for text documents using soft computing approach, consisting of SVO (Subject, Verb, and Object) Rules and Tag based training. This approach processes data through POS Tagger, NLP Parser, ambiguity removal, Semantic Representation, Sentence Reduction and Sentence Combination. At first, this paper defines the theme (title) of the document. After this operation, it preprocesses text document to perform pronominal reference resolution and text clustering. After these preprocessing operations, it identifies and removes ambiguity from the language using parser. And then, it calculates the score for the sentences using the title of the document, Semantic Sentence Similarity utility and n-gram Co-Occurrence relations of the words in a particular sentence. At last, sentences are combined with the SVO Rules after providing tag based training for simple and complex sentences. The summarizer was tested on the standard DUC 2007 dataset as well as a corpus of hundred text documents of different domains created by us. DUC 2007 Update Task produced accuracy F-scores of 0.13523 (ROUGE-2) and 0.112561 (ROUGE-SU4) for DUC 2007 documents and 0.4036 (ROUGE-2) and 0.3129 (ROUGE-SU4) for our corpus. Subjective evaluation was carried out by five language experts and twenty random individuals for system generated sample summaries. …
- Is Part Of:
- Computer speech & language. Volume 41(2016)
- Journal:
- Computer speech & language
- Issue:
- Volume 41(2016)
- Issue Display:
- Volume 41, Issue 2016 (2016)
- Year:
- 2016
- Volume:
- 41
- Issue:
- 2016
- Issue Sort Value:
- 2016-0041-2016-0000
- Page Start:
- 214
- Page End:
- 235
- Publication Date:
- 2017-01
- Subjects:
- Text document -- Summarization -- Semantic representation -- Clustering -- Reference resolution -- Evaluation
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2016.07.002
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 2481.xml