HAMR: A dataflow-based real-time in-memory cluster computing engine. (September 2017)

Record Type:: Journal Article
Title:: HAMR: A dataflow-based real-time in-memory cluster computing engine. (September 2017)
Main Title:: HAMR: A dataflow-based real-time in-memory cluster computing engine
Authors:: Wu, Yao
Zheng, Long
Heilig, Brian
Gao, Guang R
Other Names:: Balaji Pavan guest-editor.
Huang Zhiyi guest-editor.
Abstract:: As attention to big data grows, cluster computing systems for distributed processing of large data sets have become a mainstream and critical requirement in high performance distributed systems research. One of the most successful systems is Hadoop, which uses MapReduce as its programming and execution model and uses disks as intermediate storage to process huge volumes of data. Spark, an in-memory computing engine, can solve iterative and interactive problems more efficiently. However, the current consensus is that neither is the final solution to big data, owing to the MapReduce-like programming model, the synchronous execution model, the restriction to batch processing, and so on. A new solution, specifically a fundamental evolution, is needed to bring big data solutions into a new era. In this paper, we introduce a new cluster computing system called HAMR, which supports both batch and streaming processing. To achieve better performance, HAMR integrates high performance computing approaches, i.e. dataflow fundamentals, into a big data solution. More specifically, HAMR is designed entirely around in-memory computing to reduce unnecessary disk access overhead; task scheduling and memory management are performed in a fine-grain manner to exploit more parallelism; and asynchronous execution improves the efficiency of computation resource usage and improves workload balance across the whole cluster. The experimental results show that HAMR can outperform Hadoop …
Is Part Of:: International journal of high performance computing applications. Volume 31:Number 5(2017)
Journal:: International journal of high performance computing applications
Issue:: Volume 31:Number 5(2017)
Issue Display:: Volume 31, Issue 5 (2017)
Year:: 2017
Volume:: 31
Issue:: 5
Issue Sort Value:: 2017-0031-0005-0000
Page Start:: 361
Page End:: 374
Publication Date:: 2017-09
Subjects:: Dataflow -- in-memory computing -- fine-grain -- big data -- runtime
High performance computing -- Periodicals
Supercomputers -- Periodicals
004.1105
Journal URLs:: http://hpc.sagepub.com
http://www.uk.sagepub.com/home.nav
http://firstsearch.oclc.org
DOI:: 10.1177/1094342016672080
Languages:: English
ISSNs:: 1094-3420
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 7705.xml