A distributed architecture for large scale news and social media processing. (22nd March 2021)
- Record Type:
- Journal Article
- Title:
- A distributed architecture for large scale news and social media processing. (22nd March 2021)
- Main Title:
- A distributed architecture for large scale news and social media processing
- Authors:
- Varlamis, Iraklis
Michail, Dimitrios
Polydoras, Pavlos
Tsantilas, Panagiotis
- Abstract:
- When designing a data processing and analytics pipeline for data streams, it is important to provide the data load and be able to successfully balance it over the available resources. This can be achieved more easily if small processing modules, which require limited resources, replace large monolithic processing software. In this work, we present the case of a social media and news analytics platform, called PaloAnalytics, which performs a series of content aggregation, information extraction (e.g., NER, sentiment tagging, etc.) and visualisation tasks in a large amount of data, on a daily basis. We demonstrate the architecture of the platform that relies on micro-modules and message-oriented middleware for delivering distributed content processing. Early results show that the proposed architecture can easily stand the increased content load that occasionally occurs in social media (e.g., when a major event takes place) and quickly release unused resources when the content load reaches its normal flow.
- Is Part Of:
- International journal of Web engineering and technology. Volume 15:Number 4(2020)
- Journal:
- International journal of Web engineering and technology
- Issue:
- Volume 15:Number 4(2020)
- Issue Display:
- Volume 15, Issue 4 (2020)
- Year:
- 2020
- Volume:
- 15
- Issue:
- 4
- Issue Sort Value:
- 2020-0015-0004-0000
- Page Start:
- 383
- Page End:
- 406
- Publication Date:
- 2021-03-22
- Subjects:
- web crawler -- text processing -- distributed data processing -- message-oriented middleware
World Wide Web -- Periodicals
Web site development -- Periodicals
Application software -- Development -- Periodicals
006.7
- Journal URLs:
- http://www.inderscience.com/jhome.php?jcode=ijwet
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 1476-1289
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 15236.xml