Warden determines the percentage of resources available for MapReduce v1 jobs and applications based on the warden.conf file. Applications include MapReduce v2 and non-MapReduce applications such as Spark.
Note: If you modify the values in warden.conf, you must restart Warden.

The percent of resources allocated for YARN and MapReduce v1 jobs is based on the values of the following parameters in warden.conf:

Parameter | Default | Description
mr1.memory.percent | 50 | The percentage of memory allocated to MapReduce v1 jobs. The remaining memory is allocated to applications.
mr1.cpu.percent | 50 | The percentage of CPUs allocated to MapReduce v1 jobs. The remaining CPUs are allocated to applications.
mr1.disk.percent | 50 | The percentage of disks allocated to MapReduce v1 jobs. The remaining disks are allocated to applications.

These values apply only when both the TaskTracker and NodeManager roles are installed on a node. For example, if TaskTracker is not installed on the node, NodeManager gets 100% of the resources available to process applications, regardless of the warden.conf settings. Similarly, if NodeManager is not installed on the node, TaskTracker gets 100% of the resources available to process MapReduce v1 jobs, regardless of the warden.conf settings.
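
The rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not Warden's implementation: the function name `split_resources` and its arguments are hypothetical, and it assumes the percentage is applied directly to whatever resource total is left for MapReduce v1 and YARN.

```python
def split_resources(total, mr1_percent=50, has_tasktracker=True, has_nodemanager=True):
    """Return (mr1_share, yarn_share) for one resource type (hypothetical helper).

    `total` is the amount of memory, CPUs, or disks left for MapReduce v1
    and YARN after other services are accounted for (an assumption).
    """
    if has_tasktracker and has_nodemanager:
        # Both roles installed: the mr1.*.percent setting decides the split.
        mr1_share = total * mr1_percent / 100
        return mr1_share, total - mr1_share
    if has_tasktracker:
        # No NodeManager: TaskTracker gets everything, settings are ignored.
        return float(total), 0.0
    if has_nodemanager:
        # No TaskTracker: NodeManager gets everything, settings are ignored.
        return 0.0, float(total)
    return 0.0, 0.0


print(split_resources(40000))                              # both roles, default 50%
print(split_resources(40000, 30, has_tasktracker=False))   # NodeManager only
```

In the second call the 30% setting has no effect, because only NodeManager is installed.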

YARN Container Resources

A YARN application can be a MapReduce v2 application or a non-MapReduce application. The Warden on each node calculates the resources that can be allocated to process YARN applications. Each application has an Application Master that negotiates YARN container resources. For MapReduce applications, YARN processes each map or reduce task in a container.
The Application Master requests resources from the Resource Manager based on memory, CPU, and disk requirements for the YARN containers. For YARN containers that process MapReduce v2 tasks, there are additional considerations. See YARN Container Resource Allocation for MapReduce v2 Applications for details.
The Application Master requests YARN container resources based on the values of the following parameters:

Parameter | Default | Description
yarn.scheduler.minimum-allocation-mb | 1024 | Defines the minimum memory allocation available for a container, in MB. To change the value, edit the yarn-site.xml file on the node that runs the ResourceManager, assign the new value to this property, then restart the ResourceManager.
yarn.scheduler.maximum-allocation-mb | 8192 | Defines the maximum memory allocation available for a container, in MB. To change the value, edit the yarn-site.xml file on the node that runs the ResourceManager, assign the new value to this property, then restart the ResourceManager.
yarn.nodemanager.resource.memory-mb | Variable. This value is calculated by Warden. | Defines the memory available to process YARN containers on the node, in MB. Warden uses the following formula to calculate this value: [total physical memory on node] - [memory required by the operating system, MapR-FS, and MapR services installed on the node] - [memory allocated to MapReduce v1 jobs, if TaskTracker is installed on the node]. To determine the value, go to the ResourceManager UI and view the memory available for that node.
yarn.nodemanager.resource.cpu-vcores | Variable. This value is calculated by Warden. | Defines the number of CPUs available to process YARN containers on this node. Warden uses the following formula to calculate this value: [# of CPU cores on node] - [# of CPU cores assigned to MapR-FS] - [# of CPU cores assigned to MapReduce v1 jobs, if TaskTracker is installed on the node]. To determine the value, go to the ResourceManager UI or the YARN pane in the MCS and view the number of CPUs available for that node. To change the value, edit the yarn-site.xml file for the node, assign the new value to this property, then restart the NodeManager.
yarn.nodemanager.resource.io-spindles | Variable. This value is calculated by Warden. | Defines the number of disks available to process YARN containers. Warden uses the following formula to calculate this value: [# of disks on the node] - [# of disks assigned to process MapReduce v1 jobs]. To determine the value, go to the ResourceManager UI or the YARN pane in the MCS and view the disk information for this node.
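
The three Warden formulas above are simple subtractions, sketched below in Python. The function name, argument names, and the example node figures are all hypothetical. The sketch also assumes Warden applies no extra rounding or limits, which the text does not confirm.

```python
def yarn_node_resources(total_mem_mb, reserved_mem_mb, mr1_mem_mb,
                        cpu_cores, mfs_cores, mr1_cores,
                        disks, mr1_disks):
    """Apply the documented formulas for the Warden-calculated YARN values.

    reserved_mem_mb: memory for the OS, MapR-FS, and installed MapR services.
    mr1_*: resources given to MapReduce v1 (0 if TaskTracker is not installed).
    """
    return {
        "yarn.nodemanager.resource.memory-mb":
            total_mem_mb - reserved_mem_mb - mr1_mem_mb,
        "yarn.nodemanager.resource.cpu-vcores":
            cpu_cores - mfs_cores - mr1_cores,
        "yarn.nodemanager.resource.io-spindles":
            disks - mr1_disks,
    }


# Illustrative figures only; they are not taken from a real node.
values = yarn_node_resources(
    total_mem_mb=48000, reserved_mem_mb=12000, mr1_mem_mb=18000,
    cpu_cores=24, mfs_cores=4, mr1_cores=10,
    disks=6, mr1_disks=3,
)
for name, value in values.items():
    print(name, value)
```

Compare the result with what the ResourceManager UI reports for the node; the UI remains the authoritative source for the actual values.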

YARN Container Resources for MapReduce v2 Applications

In addition to the YARN container resource allocation parameters, the MapReduce ApplicationMaster also considers the following container requirements when it sends requests to the ResourceManager for containers to run MapReduce jobs:

Parameter | Default | Description
mapreduce.map.memory.mb | 1024 | Defines the container size for map tasks, in MB.
mapreduce.reduce.memory.mb | 3072 | Defines the container size for reduce tasks, in MB.
mapreduce.reduce.java.opts | -Xmx2560m | Java options for reduce tasks.
mapreduce.map.java.opts | -Xmx900m | Java options for map tasks.
mapreduce.map.disk | 0.5 | Defines the number of disks that a map task requires. For example, a node with 4 disks can run 8 map tasks at a time. Note: If I/O-intensive tasks do not run on the node, you might want to change this value.
mapreduce.reduce.disk | 1.33 | Defines the number of disks that a reduce task requires. For example, a node with 4 disks can run 3 reduce tasks at a time. Note: If I/O-intensive tasks do not run on the node, you might want to change this value.
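
The disk examples in the table can be reproduced with a short Python sketch. It assumes that the number of concurrent tasks is the disk count divided by the per-task disk requirement, rounded down, which matches both examples but is not stated explicitly. The helper names are hypothetical. The second helper reads the heap size from the `-Xmx` option, which shows that the listed default heap sizes are smaller than the corresponding container sizes.

```python
import math
import re


def tasks_by_disk(node_disks, disks_per_task):
    """Tasks that fit on a node when only disks are considered (assumed floor)."""
    return math.floor(node_disks / disks_per_task)


def heap_mb(java_opts):
    """Extract the -Xmx heap size in MB from a java.opts string, or None."""
    match = re.search(r"-Xmx(\d+)m", java_opts)
    return int(match.group(1)) if match else None


print(tasks_by_disk(4, 0.5))    # map tasks on a 4-disk node
print(tasks_by_disk(4, 1.33))   # reduce tasks on a 4-disk node

# Default heap sizes leave headroom inside the default container sizes.
print(heap_mb("-Xmx900m") < 1024)    # map: heap vs. mapreduce.map.memory.mb
print(heap_mb("-Xmx2560m") < 3072)   # reduce: heap vs. mapreduce.reduce.memory.mb
```

If you raise a container size, consider adjusting the matching java.opts value as well so that the JVM heap stays below the container limit.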

You can use one of the following methods to change the default configuration:

MapReduce v1 Job Resource Allocation

When a MapReduce v1 job is submitted to JobTracker, JobTracker determines which TaskTracker nodes can process the map and reduce tasks based on the available map and reduce slots. Map and reduce slots are allocated based on the memory available to process MapReduce v1 jobs and on the number of CPUs and disks available to MapR-FS.
In general, you should not need to customize the number of map and reduce slots. However, you can configure the parameters that are used to calculate the values. For more information, see Customizing the MapReduce v1 Slot Calculation Parameters.

Criteria for Map Slot Calculation

MapR Hadoop sets the number of map slots to the lowest value that results from the following memory, CPU, and disk calculations:

Criteria for Reduce Slot Calculation

MapR Hadoop sets the number of reduce slots to the lowest value that results from the following memory, CPU, and disk calculations:

Example Map and Reduce Slot Calculation

In the following example, the node has the following configuration:

Node Resources or Settings | Values
Services and Options | TaskTracker, MapR-FS, MapR-DB
CPUs/Cores | 24
Disks Available to MapR-FS | 5
RAM | 48G
Chunk Size | 256MB

Based on this configuration, MapR Hadoop performs the following calculations to determine the number of map and reduce slots: 

Calculation | Value | Description
Number of CPUs | 4 | Since MapR-DB is running, 4 CPUs are used as the MapRFSCPU value in the slot calculation.
Memory for Map Slots | 1G | Since the chunk size is 256MB, 1G of memory is allocated to each map slot. Warden sets mapred.job.map.memory.physical.mb to 1000MB.
Memory for Reduce Slots | 3G | Since the chunk size is 256MB, 3G of memory is allocated to each reduce slot. Warden sets mapred.job.reduce.memory.physical.mb to 3000MB.
Memory available to process MapReduce v1 tasks | 26G | Based on the services running on the node, Warden calculates the memory available to process MapReduce v1 tasks. In this example, Warden sets mapreduce.tasktracker.reserved.physicalmemory.mb to 26000MB. For more information, see Memory Allocation for Nodes.

Map Slots: 10

This value is the lowest result of the following memory, CPU, and disk calculations:

= Min [ (0.4 * mapreduce.tasktracker.reserved.physicalmemory.mb) / mapred.job.map.memory.physical.mb, (CPU > 2 ? 2 * (CPU - MapRFSCPU) : 1), 2 * MapRFSDISKS ]
= Min [ (0.4 * 26) / 1, 2 * (23 - 4), 2 * 5 ]
= Min [ 10.4, 38, 10 ]
= 10

In this example, Warden sets mapred.tasktracker.map.tasks.maximum to 10.
Reduce Slots: 3

This value is the lowest result of the following memory, CPU, and disk calculations, rounded down to a whole number:

= Min [ (0.6 * mapreduce.tasktracker.reserved.physicalmemory.mb) / mapred.job.reduce.memory.physical.mb, (CPU > 2 ? CPU - MapRFSCPU : 1), (MapRFSDISKS > 2 ? 0.75 * MapRFSDISKS : 1) ]
= Min [ (0.6 * 26) / 3, 23 - 4, 0.75 * 5 ]
= Min [ 5.2, 19, 3.75 ]
= 3

In this example, Warden sets mapred.tasktracker.reduce.tasks.maximum to 3.
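
The two slot formulas from this example can be checked with a short Python sketch. The function and argument names are hypothetical, and rounding the result down is an assumption that matches the example (3.75 becomes 3). The CPU argument is set to 23 because that is the figure used in the worked arithmetic, even though the node table lists 24; either figure gives the same slot counts here, since the disk terms are the lowest.

```python
import math


def map_slots(mr1_mem_mb, map_mem_mb, cpus, mfs_cpus, mfs_disks):
    """Lowest of the memory, CPU, and disk limits for map slots."""
    by_memory = (0.4 * mr1_mem_mb) / map_mem_mb
    by_cpu = 2 * (cpus - mfs_cpus) if cpus > 2 else 1
    by_disk = 2 * mfs_disks
    return math.floor(min(by_memory, by_cpu, by_disk))


def reduce_slots(mr1_mem_mb, reduce_mem_mb, cpus, mfs_cpus, mfs_disks):
    """Lowest of the memory, CPU, and disk limits for reduce slots."""
    by_memory = (0.6 * mr1_mem_mb) / reduce_mem_mb
    by_cpu = cpus - mfs_cpus if cpus > 2 else 1
    by_disk = 0.75 * mfs_disks if mfs_disks > 2 else 1
    return math.floor(min(by_memory, by_cpu, by_disk))


# Values from the example: 26000 MB for MapReduce v1, 1000 MB per map task,
# 3000 MB per reduce task, 4 CPUs for MapR-FS, 5 disks available to MapR-FS.
print(map_slots(26000, 1000, cpus=23, mfs_cpus=4, mfs_disks=5))     # 10
print(reduce_slots(26000, 3000, cpus=23, mfs_cpus=4, mfs_disks=5))  # 3
```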

Customizing the MapReduce v1 Slot Calculation Parameters

In general, you should not need to customize the number of map and reduce slots because Warden determines these values based on the resources available on the node.
However, you can override the number of slots by adding one or more of these parameters to mapred-site.xml. The mapred-site.xml file for MapReduce v1 jobs is in the following location: /opt/mapr/hadoop/hadoop-0.20.2/conf/mapred-site.xml.

Note: If you make changes to mapred-site.xml, you must restart TaskTracker.
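
As an illustration of what an override looks like, the following Python sketch renders a property entry in the standard Hadoop configuration format (a `property` element containing `name` and `value`). The helper name `property_element` and the value 8 are hypothetical and chosen only for the example. The resulting element belongs inside the `configuration` element of mapred-site.xml.

```python
import xml.etree.ElementTree as ET


def property_element(name, value):
    """Render one Hadoop-style <property> entry as an XML string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(prop, encoding="unicode")


# Example override: cap the number of map slots at 8 (illustrative value).
print(property_element("mapred.tasktracker.map.tasks.maximum", 8))
```

This prints `<property><name>mapred.tasktracker.map.tasks.maximum</name><value>8</value></property>`.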
Warden uses the following parameters to calculate and assign values to map slots and reduce slots on each node:

Parameter | Default Value | Description
mapreduce.tasktracker.reserved.physicalmemory.mb | Warden uses the following formula to calculate this value: [total physical memory on node] - [memory required by the operating system, MapR-FS, and MapR services installed on the node] - [memory allocated to YARN applications, if NodeManager is installed on the node]. For more information, see Memory Allocation for Nodes. To determine the value, go to the TaskTracker UI and view the memory available for that node. | Defines the memory available to process MapReduce v1 tasks, in MB.
mapred.tasktracker.map.tasks.maximum | Warden uses a formula to calculate this value. For more information, see Criteria for Map Slot Calculation. | Defines the maximum number of MapReduce v1 map slots.
mapred.tasktracker.reduce.tasks.maximum | Warden uses a formula to calculate this value. For more information, see Criteria for Reduce Slot Calculation. | Defines the maximum number of MapReduce v1 reduce slots.
mapred.job.map.memory.physical.mb | If the chunk size is greater than or equal to 256M, this value is set to 1G. Otherwise, this value is set to 0.5G. For information about configuring the chunk size, see Chunk Size. | Defines the amount of memory allocated to map tasks, in MB.
mapred.job.reduce.memory.physical.mb | If the chunk size is greater than or equal to 256M, this value is set to 3G. Otherwise, this value is set to 1.5G. For information about configuring the chunk size, see Chunk Size. | Defines the amount of memory allocated to reduce tasks, in MB.
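
The chunk-size rule for the two task memory defaults can be sketched as follows. The function name is hypothetical. The MB figures assume that 1G corresponds to 1000MB, as in the slot calculation example, so 0.5G and 1.5G are taken to be 500MB and 1500MB; the text does not state those two values explicitly.

```python
def default_task_memory_mb(chunk_size_mb):
    """Default map and reduce task memory for a given chunk size (in MB)."""
    if chunk_size_mb >= 256:
        map_mb, reduce_mb = 1000, 3000   # 1G and 3G
    else:
        map_mb, reduce_mb = 500, 1500    # 0.5G and 1.5G (assumed MB values)
    return {
        "mapred.job.map.memory.physical.mb": map_mb,
        "mapred.job.reduce.memory.physical.mb": reduce_mb,
    }


print(default_task_memory_mb(256))
print(default_task_memory_mb(64))
```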