Hadoop MapReduce v2 cookbook : explore the Hadoop MapReduce v2 ecosystem to gain insights from very large datasets. (2015)
- Record Type:
- Book
- Title:
- Hadoop MapReduce v2 cookbook : explore the Hadoop MapReduce v2 ecosystem to gain insights from very large datasets. (2015)
- Main Title:
- Hadoop MapReduce v2 cookbook : explore the Hadoop MapReduce v2 ecosystem to gain insights from very large datasets
- Other Titles:
- Explore the Hadoop MapReduce v2 ecosystem to gain insights from very large datasets
- Further Information:
- Note: Thilina Gunarathne.
- Authors:
- Gunarathne, Thilina
- Contents:
- Cover; Copyright; Credits; About the Author; Acknowledgments; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Getting Started with Hadoop v2; Introduction; Setting up Hadoop v2 on your local machine; Writing a WordCount MapReduce application, bundling it, and running it using Hadoop local mode; Adding a combiner step to the WordCount MapReduce program; Setting up HDFS; Setting up Hadoop YARN in a distributed cluster environment using Hadoop v2; Setting up Hadoop ecosystem in a distributed cluster environment using a Hadoop distribution; HDFS command-line file operations; Running the WordCount program in a distributed cluster environment; Benchmarking HDFS using DFSIO; Benchmarking Hadoop MapReduce using TeraSort; Chapter 2: Cloud Deployments -- Using Hadoop YARN on Cloud Environments; Introduction; Running Hadoop MapReduce v2 computations using Amazon Elastic MapReduce; Saving money using Amazon EC2 Spot Instances to execute EMR job flows; Executing a Pig script using EMR; Executing a Hive script using EMR; Creating an Amazon EMR job flow using the AWS Command Line Interface; Deploying an Apache HBase cluster on Amazon EC2 using EMR; Using EMR bootstrap actions to configure VMs for the Amazon EMR jobs; Using Apache Whirr to deploy an Apache Hadoop cluster in a cloud environment; Chapter 3: Hadoop Essentials -- Configurations, Unit Tests, and Other APIs; Introduction; Optimizing Hadoop YARN and MapReduce configurations for cluster deployments; Shared user Hadoop clusters -- using Fair and Capacity schedulers; Setting classpath precedence to user-provided JARs; Speculative execution of straggling tasks; Unit testing Hadoop MapReduce applications using MRUnit; Integration testing Hadoop MapReduce applications using MiniYarnCluster; Adding a new DataNode; Decommissioning DataNodes; Using multiple disks/volumes and limiting HDFS disk usage; Setting the HDFS block size; Setting the file replication factor; Using the HDFS Java API; Chapter 4: Developing Complex Hadoop MapReduce Applications; Introduction; Choosing appropriate Hadoop data types; Implementing a custom Hadoop Writable data type; Implementing a custom Hadoop key type; Emitting data of different value types from a Mapper; Choosing a suitable Hadoop InputFormat for your input data format; Adding support for new input data formats -- implementing a custom InputFormat; Formatting the results of MapReduce computations -- using Hadoop OutputFormats; Writing multiple outputs from a MapReduce computation; Hadoop intermediate data partitioning; Secondary sorting -- sorting Reduce input values; Broadcasting and distributing shared resources to tasks in a MapReduce job -- Hadoop DistributedCache; Using Hadoop with legacy applications -- Hadoop Streaming; Adding dependencies between MapReduce jobs; Hadoop counters for reporting custom metrics; Chapter 5: Analytics; Introduction …
- Publisher Details:
- Birmingham, UK : Packt Publishing
- Publication Date:
- 2015
- Extent:
- 1 online resource (1 volume), illustrations
- Subjects:
- 005.74
COMPUTERS -- Data Processing
Cluster analysis -- Data processing
Electronic data processing -- Distributed processing
Apache Hadoop (Computer file)
File organization (Computer science)
Open source software
COMPUTERS -- Databases -- General
Electronic books
- Languages:
- English
- ISBNs:
- 9781783285488
1783285486
1783285478
9781783285471
- Related ISBNs:
- 9781783285471
- Notes:
- Note: Description based on print version record.
- Access Rights:
- Legal Deposit; Only available on premises controlled by the deposit library and to one user at any one time; The Legal Deposit Libraries (Non-Print Works) Regulations (UK).
- Access Usage:
- Restricted: Printing from this resource is governed by The Legal Deposit Libraries (Non-Print Works) Regulations (UK) and UK copyright law currently in force.
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD.DS.87458
- Ingest File:
- 01_086.xml