This is documentation for MapR Version 5.0. You can also refer to MapR documentation for the latest release.

Here are a few examples that show how to configure nodes based on different hardware profiles. At install time, MapR sets up an initial number of map and reduce slots based on a slot calculation algorithm that takes into account the amount of memory and the number of cores present on each node. For more information, see the mapred-site.xml documentation.

Example 1

In this example 10-node Enterprise Edition cluster, each node has the following characteristics:

  • 16 cores
  • 32 GB memory
  • 12 drives

Although 16 cores could theoretically run up to 28 map slots (2 per core on 14 cores, with the remaining 2 cores reserved for the operating system and MapR-FS), there would be no memory left for reduce slots. After the operating system and MapR-FS take their share of the memory, approximately 24 GB remains for MapReduce.

  • Set the chunk size to 256 MB (unless you are using application-level compression).
  • For that chunk size, set io.sort.mb to 380 MB.
  • Set map task memory to 800 MB by adding -Xmx800m to mapred.map.child.java.opts.
  • Configure 14 map slots (14 x 800 MB = 11.2 GB of memory required).
  • Use the rest of the memory for reducers:
    • 24 GB - 11.2 GB = 12.8 GB
    • 12.8 GB / 3.5 GB per reducer = approximately 4 reducers

To improve the ratio of mappers to reducers, consider 10 mappers and 5 reducers instead.
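As a sketch, the recommended configuration for Example 1 (10 map slots, 5 reduce slots, a 380 MB sort buffer, and an 800 MB map-task heap) could be expressed in mapred-site.xml roughly as follows. These are standard MRv1/MapR property names, but verify them against your release before applying; note that the chunk size is a MapR-FS attribute set per directory (for example, with the hadoop mfs command), not a mapred-site.xml property.

```xml
<!-- Sketch of per-node settings for Example 1; verify property names
     and values against your MapR release before applying. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>10</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>5</value>
</property>
<property>
  <name>io.sort.mb</name>
  <value>380</value>
</property>
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx800m</value>
</property>
```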

Example 2

In this example 10-node cluster, each node has the following characteristics:

  • 16 cores
  • 32 GB memory
  • 4 drives

This hardware has a small number of drives, which limits disk I/O capacity, so run a conservative number of map and reduce slots:

  • Not more than 8 map slots
  • Not more than 2 or 3 reduce slots
  • Give the leftover 6 GB of memory to MapR-FS
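Assuming the upper end of the ranges above (8 map slots, 3 reduce slots), a hypothetical mapred-site.xml fragment for Example 2 might look like this; verify the property names against your MapR release:

```xml
<!-- Sketch for Example 2: conservative slot counts for I/O-limited
     nodes. Verify against your MapR release before applying. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>3</value>
</property>
```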

Example 3

In this example 10-node cluster, each node has the following characteristics:

  • 16 cores
  • 128 GB memory
  • 12 drives

This cluster is similar to Example 1, but less memory-constrained. It is important to be conservative with the number of map and reduce slots to avoid flooding the cluster with random disk I/O:

  • 24 map slots
  • 9 reduce slots
  • Give the leftover 41 GB of memory to MapR-FS

Example 4

In this example 10-node cluster, each node has the following characteristics:

  • 8 cores
  • 12 GB memory
  • 3 drives

This hardware is resource-constrained across the board. Reduce the chunk size to improve parallelism, use less memory for the operating system, and set a small number of mappers and reducers:

  • Set chunk size to 128 MB
  • Set io.sort.mb to 190 MB
  • 4 map slots
  • 2 reduce slots
  • Give any leftover memory to MapR-FS

Example 5

In this example 1000-node cluster, each node has the following characteristics:

  • 4 cores
  • 32 GB memory
  • 2 drives

This hardware has too few cores to accomplish much parallelism, but has a lot of memory. Because the hardware is memory-heavy and core-light, give as much memory as possible to MapR-FS; the map output is likely to fit in memory, reducing disk I/O.

  • 4 map slots
  • 1 reduce slot
  • Set the chunk size to 256 MB
  • Give the leftover 16 GB of memory to MapR-FS
  • Set mapred.reduce.slowstart.completed.maps to 0
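
The Example 5 settings could be sketched in mapred-site.xml as follows (standard MRv1 property names; verify against your release). Setting mapred.reduce.slowstart.completed.maps to 0 lets reducers launch as soon as the job starts, rather than waiting for a fraction of the maps to complete, which suits these memory-heavy nodes where map output is likely to fit in memory.

```xml
<!-- Sketch for Example 5: few cores, ample memory, reducers start
     immediately. Verify against your MapR release before applying. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>1</value>
</property>
<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>0</value>
</property>
```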