Predicting Hadoop misconfigurations using machine learning. (24th January 2020)
- Record Type:
- Journal Article
- Title:
- Predicting Hadoop misconfigurations using machine learning. (24th January 2020)
- Main Title:
- Predicting Hadoop misconfigurations using machine learning
- Authors:
- Robert, Andrew
Gupta, Apaar
Shenoy, Vinayak
Sitaram, Dinkar
Kalambur, Subramaniam
- Abstract:
- Summary: Distributed applications are popular for heavy workloads where the resources of a single machine are not sufficient. These distributed applications come with many parameters to tune so that cluster resources can be effectively utilized. However, any misconfiguration of the available parameters may result in suboptimal performance of one or more machines in the cluster. These events may go unnoticed or can result in crashes. This problem of misconfigured parameters has no straightforward solution due to the variety of parameters and vastly different workloads being processed. In this article, we propose a methodology for machine learning‐based detection of misconfigurations. We collect data mined from system resource utilization, Hadoop logs, and job‐level metrics to train a model using decision tree and support vector machine. The models are used to identify whether a set of configuration parameters could result in a crash or a slowdown for a specific workload. The approach explained in this article can be extended to other distributed big data applications, such as Spark, Hive, Pig, and so on.
- Is Part Of:
- Software, practice & experience. Volume 50:Number 7(2020)
- Journal:
- Software, practice & experience
- Issue:
- Volume 50:Number 7(2020)
- Issue Display:
- Volume 50, Issue 7 (2020)
- Year:
- 2020
- Volume:
- 50
- Issue:
- 7
- Issue Sort Value:
- 2020-0050-0007-0000
- Page Start:
- 1168
- Page End:
- 1183
- Publication Date:
- 2020-01-24
- Subjects:
- big data -- Hadoop -- machine learning -- misconfiguration -- prediction
Computer software -- Periodicals
Computer programming -- Periodicals
Computer programs -- Periodicals
005.3
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/spe.2790
- Languages:
- English
- ISSNs:
- 0038-0644
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8321.453000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 13194.xml