

The FairScheduler is a pluggable scheduler for Hadoop that allows YARN applications to share resources in a large cluster fairly.

What is Fair Scheduling?

Fair scheduling is a method of assigning resources to applications such that all applications get, on average, an equal share of resources over time. Hadoop 2.x is capable of scheduling multiple resource types.

By default, the Fair Scheduler bases scheduling fairness decisions only on memory. It can be configured to schedule resources based on memory and CPU. When only one application is running, that application uses the entire cluster. When other applications are submitted, resources that free up are assigned to the new applications, so that each application eventually gets approximately the same amount of resources. Unlike the default Hadoop scheduler, which forms a queue of applications, this lets short applications finish in reasonable time while not starving long-lived applications. It is also a reasonable way to share a cluster between a number of users. Finally, fair sharing also uses priorities applied as weights to determine the fraction of total resources that each application should get.

Scheduling Queues

The scheduler further organizes applications into queues and shares resources fairly between these queues. By default, all users share a single queue, named default. If an application specifically lists a queue in a container resource request, the request is submitted to that queue. You can also configure the scheduler to assign queues based on the user name included with the request. Within each queue, a scheduling policy is used to share resources between the running applications. The default is memory-based fair sharing, but FIFO and multi-resource with Dominant Resource Fairness can also be configured.

Queues can be arranged in a hierarchy to divide resources, and they can be configured with weights to share the cluster in specific proportions. The Fair Scheduler uses a concept called a queue path to configure a hierarchy of queues. The queue path is the full path of the queue's hierarchy, starting at root. The following sketch (the queue and sub-queue names are illustrative) has three top-level child queues a, b, and c, and some sub-queues for a and b:
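
    <!-- illustrative queue hierarchy -->
    <allocations>
      <queue name="a">
        <queue name="a1"/>
        <queue name="a2"/>
      </queue>
      <queue name="b">
        <queue name="b1"/>
      </queue>
      <queue name="c"/>
    </allocations>

In queue-path form, these queues are root.a, root.a.a1, root.a.a2, root.b, root.b.b1, and root.c.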

In addition to providing fair sharing, the Fair Scheduler allows assigning guaranteed minimum shares to queues, which is useful for ensuring that certain users, groups or production applications always get sufficient resources. When a queue contains apps, it gets at least its minimum share, but when the queue does not need its full guaranteed share, the excess is split between other running apps. This lets the scheduler guarantee capacity for queues while utilizing resources efficiently when these queues do not contain applications.

Configuring the Fair Scheduler

The Fair Scheduler lets all applications run by default, but you can also limit the number of running applications per user and per queue through the configuration file. This can be useful when a user must submit hundreds of applications at once, or in general to improve performance if running too many applications at once would cause too much intermediate data to be created or too much context-switching. Limiting the applications does not cause any subsequently submitted applications to fail; it only causes them to wait in the scheduler's queue until earlier applications finish.

To customize the Fair Scheduler, set configuration properties in yarn-site.xml and update the allocation file to list existing queues and their respective weights and capacities. The allocation file is created automatically during MapR installation. The default location for this file is:

The file is reloaded every 10 seconds to refresh the scheduler with any modified settings that are specified in the file.

Specifying Fair Scheduler Configuration Properties in yarn-site.xml

The yarn-site.xml file contains parameters that determine scheduler-wide options. These properties include:

Parameter: yarn.scheduler.fair.allocation.file
Default Value: fair-scheduler.xml
Description: Specifies the path to the allocation file. If a relative path is given, the file is searched for on the classpath.

Parameter: yarn.scheduler.fair.user-as-default-queue
Default Value: true
Description: Indicates whether to use the username associated with the allocation as the default queue name when a queue name is not specified.
Note: If a queue placement policy is given in the allocation file, this property is ignored.

Parameter: yarn.scheduler.fair.preemption
Default Value: false
Description: Indicates whether to use preemption.

Parameter: yarn.scheduler.fair.sizebasedweight
Default Value: false
Description: Indicates whether to assign shares to individual applications based on their size, rather than providing an equal share to all applications regardless of size. When set to true, an application's weight is ln(1 + <application's total requested memory>) / ln(2).

Parameter: yarn.scheduler.fair.assignmultiple
Default Value: false
Description: Indicates whether to allow multiple container assignments in one heartbeat.
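
For example, a minimal yarn-site.xml sketch that sets an explicit allocation file, enables preemption, and allows multiple assignments per heartbeat (the values are illustrative):

    <!-- Fair Scheduler options in yarn-site.xml (values are illustrative) -->
    <property>
      <name>yarn.scheduler.fair.allocation.file</name>
      <value>fair-scheduler.xml</value>
    </property>
    <property>
      <name>yarn.scheduler.fair.preemption</name>
      <value>true</value>
    </property>
    <property>
      <name>yarn.scheduler.fair.assignmultiple</name>
      <value>true</value>
    </property>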

 

Assigning Weights in an Allocation File

An allocation file is an XML manifest that describes queues and their properties, as well as certain policy defaults. The format contains these types of elements:

Queue Elements

Queue elements can contain the following properties:

Property Name: minResources
Description: Minimum resources the queue is entitled to, in the form "X mb, Y vcores, Z disks". For the single-resource fairness policy, the vcores and disks values are ignored.
If a queue's minimum share is not satisfied, it is offered available resources before any other queue under the same parent.
Under the single-resource fairness policy, a queue is considered unsatisfied if its memory usage is below its minimum memory share.
Under dominant resource fairness, a queue is considered unsatisfied if its usage of its dominant resource, relative to the cluster capacity, is below its minimum share for that resource. If multiple queues are unsatisfied in this situation, resources go to the queue with the smallest ratio between relevant resource usage and minimum. Note that a queue that is below its minimum may not immediately reach its minimum when it submits an application, because jobs that are already running may be using those resources.

Property Name: maxResources
Description: Maximum resources a queue is allowed, in the form "X mb, Y vcores, Z disks". For the single-resource fairness policy, the vcores and disks values are ignored. A queue is never assigned a container that would put its aggregate usage over this limit.

Property Name: maxRunningApps
Description: The limit for the number of applications that can run at once for the queue and any of its child queues. If the value of maxRunningApps for any parent queue in the queue path is lower than the value that you set for a queue, the lowest value sets the application limit.
Note: The queueMaxAppsDefault value is used for any parent queue that does not set a value for maxRunningApps.

Property Name: weight
Description: Applies a weight to share the cluster non-proportionally with other queues. The default is 1. A queue with weight 2 should receive approximately twice as many resources as a queue with the default weight.

Property Name: schedulingPolicy
Description: Sets the scheduling policy of the queue. Allowed values are fifo, fair, drf, or any class that extends org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.SchedulingPolicy. The default is fair. If set to fifo, applications with earlier submit times are given preference for containers, but applications submitted later may run concurrently if the cluster has leftover space after satisfying the earlier applications' requests.

Property Name: aclSubmitApps
Description: A list of users and/or groups that can submit applications to the queue. Refer to the Queue Access Control Lists section below for more information on the format of this list and how queue ACLs work.

Property Name: aclAdministerApps
Description: A list of users and/or groups that can administer the queue. Currently, the only administrative action is killing an application. Refer to the Queue Access Control Lists section below for more information on the format of this list and how queue ACLs work.

Property Name: minSharePreemptionTimeout
Description: The number of seconds the queue is under its minimum share before it tries to preempt containers to take resources from other queues.
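
A sketch of a single queue element combining these properties (the queue name, users, and values are illustrative):

    <!-- illustrative queue element for an allocation file -->
    <queue name="analytics">
      <minResources>10000 mb, 10 vcores, 2 disks</minResources>
      <maxResources>90000 mb, 100 vcores, 20 disks</maxResources>
      <maxRunningApps>50</maxRunningApps>
      <weight>2.0</weight>
      <schedulingPolicy>fair</schedulingPolicy>
      <aclSubmitApps>user1,user2 group1</aclSubmitApps>
      <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
    </queue>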

User Elements

User elements represent settings that govern the behavior of individual users. They can contain a single property: maxRunningApps, which limits the number of running applications for a particular user.
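
For example (the user name and limit are illustrative):

    <!-- illustrative user element -->
    <user name="sample_user">
      <maxRunningApps>10</maxRunningApps>
    </user>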

userMaxAppsDefault Element

The userMaxAppsDefault element sets the default running application limit for any users whose limit is not otherwise specified.

queueMaxAppsDefault Element

The queueMaxAppsDefault element sets the default running application limit for queues whenever maxRunningApps is not set for that queue. 

Note: If you set a value for queueMaxAppsDefault and do not set a value for maxRunningApps for the root queue, the value of queueMaxAppsDefault sets the application limit for all queues under the root queue hierarchy.
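
For example, these top-level elements in the allocation file (the values are illustrative) limit each user to 5 running applications and each queue to 20, unless overridden:

    <!-- illustrative default limits -->
    <userMaxAppsDefault>5</userMaxAppsDefault>
    <queueMaxAppsDefault>20</queueMaxAppsDefault>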

fairSharePreemptionTimeout Element

The fairSharePreemptionTimeout element sets the number of seconds a queue is under its fair share before it tries to preempt containers to take resources from other queues.

defaultQueueSchedulingPolicy Element

The defaultQueueSchedulingPolicy element sets the default scheduling policy for queues. This element is overridden by the schedulingPolicy element in each queue if it is specified. The default is fair.

queuePlacementPolicy Element

The queuePlacementPolicy element contains a list of rule elements that tell the scheduler how to place incoming applications into queues. Rules are applied in the order that they are listed. Rules can take arguments. All rules accept the create argument, which indicates whether the rule can create a new queue; it defaults to true. If create is set to false and the rule would place the application in a queue that is not configured in the allocation file, the scheduler continues on to the next rule. The last rule must be one that can never issue a continue. Valid rules, illustrated in the sketch after this list, are:

  • specified
    The application is placed into the queue it requested. If the application did not request a queue (it specified default), continue to the next rule.
  • user
    The application is placed into a queue with the name of the user who submitted it.
  • primaryGroup
    The application is placed into a queue with the name of the primary group of the user who submitted it.
  • secondaryGroupExistingQueue
    The application is placed into a queue with a name that matches a secondary group of the user who submitted it. The first secondary group that matches a configured queue will be selected.
  • default
    The application is placed into the queue named default.
  • reject
    The application is rejected.
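
A minimal sketch of a queuePlacementPolicy element (the rule order is illustrative):

    <queuePlacementPolicy>
      <!-- use the requested queue if one was specified -->
      <rule name="specified"/>
      <!-- otherwise try a queue named after the user's primary group,
           but do not create it if it does not already exist -->
      <rule name="primaryGroup" create="false"/>
      <!-- fall back to the default queue; this rule never continues -->
      <rule name="default"/>
    </queuePlacementPolicy>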

Example Allocation File
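
A minimal sketch of an allocation file (the queue names, user names, and values are illustrative):

    <?xml version="1.0"?>
    <!-- fair-scheduler.xml: illustrative queues, users, and defaults -->
    <allocations>
      <queue name="sample_queue">
        <minResources>10000 mb, 1 vcores, 1 disks</minResources>
        <maxResources>90000 mb, 10 vcores, 5 disks</maxResources>
        <maxRunningApps>50</maxRunningApps>
        <weight>2.0</weight>
        <schedulingPolicy>fair</schedulingPolicy>
        <queue name="sample_sub_queue">
          <minResources>5000 mb, 1 vcores, 1 disks</minResources>
        </queue>
      </queue>
      <user name="sample_user">
        <maxRunningApps>30</maxRunningApps>
      </user>
      <userMaxAppsDefault>5</userMaxAppsDefault>
      <queueMaxAppsDefault>25</queueMaxAppsDefault>
      <fairSharePreemptionTimeout>30</fairSharePreemptionTimeout>
      <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
    </allocations>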


Note: For backward compatibility with the original FairScheduler, queue elements can instead be named as pool elements.

Queue Access Control Lists

Queue Access Control Lists (ACLs) allow administrators to control who may take actions on particular queues. They are configured with the aclSubmitApps and aclAdministerApps properties, which can be set per queue. Currently the only supported administrative action is killing an application. Anyone who has permission to administer a queue may also submit applications to it. These properties take values in a format like user1,user2 group1,group2 or group1,group2. An action on a queue will be permitted if its user or group is in the ACL of that queue or in the ACL of any of that queue's ancestors. So if queue2 is inside queue1, and user1 is in queue1's ACL, and user2 is in queue2's ACL, then both users may submit to queue2.

The root queue's ACLs are "*" by default which, because ACLs are passed down, means that everyone may submit to and kill applications from every queue. To restrict access, change the root queue's ACLs to something other than "*".

By default, the yarn.admin.acl property in yarn-site.xml is also set to "*", which means any user can be the administrator. If queue ACLs are enabled, you also need to set the yarn.admin.acl property to the correct admin user for the YARN cluster. For example (the admin user name here is illustrative):
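
    <!-- yarn-site.xml: restrict YARN administration to one user (name is illustrative) -->
    <property>
      <name>yarn.admin.acl</name>
      <value>mapradmin</value>
    </property>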

If you do not set this property correctly, users will be able to kill YARN jobs even when they do not have access to the queues for those jobs. 
