Breaking news detection from the web documents through text mining and seasonality. (2016)
- Record Type:
- Journal Article
- Title:
- Breaking news detection from the web documents through text mining and seasonality. (2016)
- Main Title:
- Breaking news detection from the web documents through text mining and seasonality
- Authors:
- Jishan, Syed Tanveer
Monsur, Md. Nuruddin
Rahman, Hafiz Abdur
- Abstract:
- In recent years, news distribution through the internet has increased significantly, and so has our dependency on online news sources. As vast numbers of web documents from different news websites are readily available, it is possible to extract information that can be used for various applications. One possible application is breaking news detection through text and property analysis of these web documents. In this paper, we present an approach to detect breaking news from web documents by using keyword extraction through Brill's tagger and HTML tag attributes. Once the keywords are extracted, the seasonality of each keyword is calculated as the ratio of the linear weighted moving averages (LWMA) at each point of the time series. Our approach has been validated, and its performance metrics have been evaluated, on two online newspapers.
- Is Part Of:
- International journal of knowledge and web intelligence. Volume 5:Number 3(2016)
- Journal:
- International journal of knowledge and web intelligence
- Issue:
- Volume 5:Number 3(2016)
- Issue Display:
- Volume 5, Issue 3 (2016)
- Year:
- 2016
- Volume:
- 5
- Issue:
- 3
- Issue Sort Value:
- 2016-0005-0003-0000
- Page Start:
- 190
- Page End:
- 207
- Publication Date:
- 2016
- Subjects:
- information extraction -- web mining -- data cleaning -- time series analysis -- Brill tagger -- breaking news -- news detection -- web documents -- text mining -- seasonality -- data mining -- news distribution -- online news sources -- HTML tags -- keywords -- keyword extraction -- online newspapers
Information retrieval -- Periodicals
Web site development -- Periodicals
Internet programming -- Periodicals
004.67805
- Journal URLs:
- http://www.inderscience.com/browse/index.php?journalCODE=ijkwi
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 1755-8255
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 7828.xml
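
Note on the abstract: the seasonality step it describes can be illustrated with a minimal sketch. The abstract does not say which linear weighted moving averages are divided, so this sketch assumes a short-window LWMA divided by a long-window LWMA of a keyword's per-interval counts. The function names, window sizes and sample counts are hypothetical and are not taken from the paper.

```python
from typing import List, Optional


def lwma(series: List[float], window: int) -> List[Optional[float]]:
    """Linear weighted moving average.

    Within each window the oldest value gets weight 1 and the newest
    gets weight `window`. Positions without a full window yield None.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    denom = window * (window + 1) / 2  # sum of the weights 1..window
    out: List[Optional[float]] = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
            continue
        chunk = series[i + 1 - window : i + 1]
        weighted = sum(w * x for w, x in zip(range(1, window + 1), chunk))
        out.append(weighted / denom)
    return out


def lwma_ratio(series: List[float], short: int, long: int) -> List[Optional[float]]:
    """Assumed seasonality measure: short-window LWMA / long-window LWMA.

    Values well above 1 suggest the keyword is currently more frequent
    than its recent baseline. None marks points where the ratio is
    undefined (incomplete window or zero baseline).
    """
    short_avg = lwma(series, short)
    long_avg = lwma(series, long)
    ratios: List[Optional[float]] = []
    for s, l in zip(short_avg, long_avg):
        if s is None or l is None or l == 0:
            ratios.append(None)
        else:
            ratios.append(s / l)
    return ratios


# Hypothetical hourly counts for one keyword, with a spike in the last hour.
counts = [1, 1, 1, 1, 10]
ratios = lwma_ratio(counts, short=2, long=4)
```

In this sketch a flat series gives a ratio of 1, and a sudden rise in the latest interval pushes the ratio above 1. Any threshold for flagging a keyword as breaking news would have to come from the paper itself.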