Watershed‐ng: an extensible distributed stream processing framework. (28th January 2016)
- Record Type:
- Journal Article
- Title:
- Watershed‐ng: an extensible distributed stream processing framework. (28th January 2016)
- Main Title:
- Watershed‐ng: an extensible distributed stream processing framework
- Authors:
- Rocha, Rodrigo
Hott, Bruno
Dias, Vinícius
Ferreira, Renato
Meira, Wagner
Guedes, Dorgival
- Other Names:
- Silla, Federico (guest editor)
Fröning, Holger (guest editor)
Senger, Hermes (guest editor)
Geyer, Claudio (guest editor)
- Abstract:
- Summary: Most high‐performance data processing (a.k.a. big data) systems allow users to express their computation using abstractions (like MapReduce), which simplify the extraction of parallelism from applications. Most frameworks, however, do not allow users to specify how communication must take place: that element is deeply embedded into the run‐time system abstractions, making changes hard to implement. In this work, we describe Watershed‐ng, our re‐engineering of the Watershed system, a framework based on the filter–stream paradigm and originally focused on continuous stream processing. Like other big‐data environments, Watershed provided object‐oriented abstractions to express computation (filters), but the implementation of streams was a run‐time system element. By isolating stream functionality into appropriate classes, combination of communication patterns and reuse of common message handling functions (like compression and blocking) become possible. The new architecture even allows the design of new communication patterns, for example, allowing users to choose MPI, TCP, or shared memory implementations of communication channels as their problem demands. Applications designed for the new interface showed reductions in code size on the order of 50% and above in some cases. The performance results also showed significant improvements, because some implementation bottlenecks were removed in the re‐engineering process. Copyright © 2016 John Wiley & Sons, Ltd.
- Is Part Of:
- Concurrency and computation. Volume 28:Number 8(2016)
- Journal:
- Concurrency and computation
- Issue:
- Volume 28:Number 8(2016)
- Issue Display:
- Volume 28, Issue 8 (2016)
- Year:
- 2016
- Volume:
- 28
- Issue:
- 8
- Issue Sort Value:
- 2016-0028-0008-0000
- Page Start:
- 2487
- Page End:
- 2502
- Publication Date:
- 2016-01-28
- Subjects:
- distributed systems -- watershed -- big data -- frameworks
Parallel processing (Electronic computers) -- Periodicals
Parallel computers -- Periodicals
004.35
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/cpe.3779
- Languages:
- English
- ISSNs:
- 1532-0626
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3405.622000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 954.xml