A MapR Hadoop installation is usually a large-scale set of individual servers, called nodes, collectively called a cluster. In a typical cluster, most nodes are dedicated to data processing and storage, and a smaller number of nodes run other services that provide cluster coordination and management.
The first step in deploying MapR is planning which servers will form the cluster, and selecting the services that will run on each node. To determine whether a server is capable of contributing to the cluster, it may be necessary to check the requirements found in Step 2, Preparing Each Node. Each node in the cluster must be carefully checked against these requirements; unsuitability of a node is one of the most common reasons for installation failure.
The objective of Step 1 is a Cluster Plan that details each node's set of services. The following sections help you create this plan:
- Unique Features of the MapR Distribution
- Select Services
- Cluster Design Objectives
- Cluster Hardware
- Service Layout in a Cluster
- User Accounts
- Next Step
Unique Features of the MapR Distribution
Administrators who are familiar with ordinary Apache Hadoop will appreciate the MapR distribution's real-time read/write storage layer. MapR APIs are 100% compliant with HDFS while eliminating the Namenodes, which is a single point of failure. Furthermore, MapR utilizes raw disks and partitions without RAID or Logical Volume Manager, greatly improving performance. Many Hadoop installation documents discuss considerations around HDFS and Namenodes, while MapR Hadoop's solution eliminates the guesswork, making it simpler to install.
The MapR Filesystem (MapR-FS) stores data in volumes, which are logical partitions of the filesystem. Each volume is made up of one or more data containers, which hold the files associated with a volume, and a metadata container that stores information about those files. By holding metadata in a volume's container, the metadata distributes itself among all nodes in the cluster, making MapR-FS extremely scalable and resilient. The Container Location Database (CLDB) service runs across mutiple cluster nodes and provides a directory of container locations.
A process called Warden runs on all nodes to manage, monitor, and report on the services on each node. The MapR cluster uses Apache ZooKeeper to coordinate between services running across multiple nodes. ZooKeeper prevents service conflicts by enforcing a set of rules and conditions that determine which instance of each service is the master. Warden will not start any services unless ZooKeeper is reachable and more than half of the configured ZooKeeper nodes (a quorum) are live.
MapR also provides native table storage, called MapR-DB. The MapR HBase Client is used to access table data via the open-standard Apache HBase API. MapR-DB simplifies and unifies administration for both structured table data and unstructured file data on a single cluster. If you plan to use MapR-DB exclusively for structured data, then you do not need to install the Apache HBase Master or RegionServer. However, Master and RegionServer services can be deployed on an MapR cluster if your applications require them, for example, during the migration period from Apache HBase to MapR-DB. The MapR HBase Client provides access to both Apache HBase tables and MapR-DB. MapR-DB is available in MapR's Community Edition (M3 license) and Enterprise Database Edition (M7 license).
The MapR Hadoop distribution is licensed in tiers.
The free Community Edition (M3 license) includes MapR innovations, such as the read/write MapR-FS, NFS access to the filesystem, and MapR-DB, but does not include the level of technical support offered with the Enterprise editions.
The Enterprise Edition (M5 license) enables enterprise-class storage features, such as snapshots and mirrors of individual volumes, and high-availability features, such as the ability to run NFS servers on multiple nodes which also improves bandwidth and performance.
If you need to store data in MapR-DB, choose the Enterprise Database Edition (M7 license). The Enterprise Database Edition includes all features of the Enterprise Edition (M5 license), and adds support for MapR-DB, a flexible NoSQL database that exposes the Apache HBase API.
Register online to obtain a Community Edition (M3 license) or a trial of the Enterprise Edition (M5 license). To obtain an Enterprise Database Edition (M7 license), you will need to contact a MapR representative.
In a typical cluster, most nodes are dedicated to data processing and storage, and a smaller number of nodes run services that provide cluster coordination and management. Some applications run on cluster nodes and others run on client nodes that can communicate with the cluster.
The services that you choose to run on each node will likely evolve over the life of the cluster. Services can be added and removed over time.
The following table shows some of the services that can be run on a node.
Warden runs on every node, coordinating the node's contribution to the cluster.
Hadoop TaskTracker starts and tracks MapReduce tasks on a node. The TaskTracker service receives task assignments from the JobTracker service and manages task execution.
|Hadoop YARN NodeManager service. The NodeManager manages node resources and monitors the health of the node. It works with the ResourceManager to manage YARN containers that run on the node.|
FileServer is the MapR service that manages disk storage for MapR-FS and MapR-DB on each node.
Maintains the container location database (CLDB) service. The CLDB service coordinates data storage services among MapR-FS FileServer nodes, MapR NFS gateways, and MapR clients.
Provides read-write MapR Direct Access NFS™ access to the cluster, with full support for concurrent read and write access.
Provides access to MapR-DB tables via HBase APIs. Required on all nodes that will access table data in MapR-FS, typically all TaskTracker nodes and edge nodes for accessing table data.
Hadoop JobTracker service. The JobTracker service coordinates the execution of MapReduce jobs by assigning tasks to TaskTracker nodes and monitoring task execution.
|Hadoop YARN ResourceManager service. The ResourceManager manages cluster resources, and tracks resource usage and node health.|
Enables high availability (HA) and fault tolerance for MapR clusters by providing coordination.
|Archives MapReduce job metrics and metadata.|
The HBase master service manages the region servers that make up HBase table storage.
Runs the MapR Control System.
Provides optional real-time analytics data on cluster and job performance through the Analyzing Job Metrics interface. If used, the Metrics service is required on all JobTracker and Web Server nodes.
|Hue is Hadoop user interface that interacts with Apache Hadoop and its ecosystem components, such as Hive, Pig, and Oozie.|
HBase region server is used with the HBase Master service and provides storage for an individual HBase region.
Pig is a high-level data-flow language and execution framework.
Hive is a data warehouse that supports SQL-like ad hoc querying and data summarization.
Flume is a service for aggregating large amounts of log data
Oozie is a workflow scheduler system for managing Hadoop jobs.
HCatalog aggregates HBase data.
Cascading is an application framework for analyzing and managing big data.
Mahout is a set of scalable machine-learning libraries that analyze user behavior.
|Spark is an processing engine for large datasets.|
Sqoop is a tool for transferring bulk data between Hadoop and relational databases.
MapR is a complete Hadoop distribution, but not all services are required. Every Hadoop installation requires services to manage jobs and applications. JobTracker and TaskTracker manage MapReduce v1 jobs. ResourceManager and NodeManager manage MapReduce v2 and other applications that can run on YARN. In addition, MapR requires the ZooKeeper service to coordinate the cluster, and at least one node must run the CLDB service. The WebServer service is required if the browser-based MapR Control System will be used.
MapR Hadoop includes tested versions of the services listed here. MapR provides a more robust, read-write storage system based on volumes and containers. MapR data nodes typically run FileServer, TaskTracker, and NodeManager. Do not plan to use packages from other sources in place of the MapR distribution.
Cluster Design Objectives
Begin by understanding the work that the cluster will perform. Establish metrics for data storage capacity, throughput, and characterize the data processing that will typically be performed.
While MapR is relatively easy to install and administer, designing and tuning a large production MapReduce cluster is a complex task that begins with understanding your data needs. Consider the kind of data processing that will occur and estimate the storage capacity and throughput speed required. Data movement, independent of MapReduce operations, is also a consideration. Plan for how data will arrive at the cluster, and how it will be made useful elsewhere.
Network bandwidth and disk I/O speeds are related; either can become a bottleneck. CPU-intensive workloads reduce the relative importance of disk or network speed. If the cluster will be performing a large number of big reduces, network bandwidth is important, suggesting that the hardware plan include multiple NICs per node. In general, the more network bandwidth, the faster things will run.
Running NFS on multiple data nodes can improve data transfer performance and make direct loading and unloading of data possible, but multiple NFS instances requires an Enterprise Edition license. For more information about NFS, see Setting Up MapR NFS.
Plan which nodes will provide NFS access according to your anticipated traffic. For instance, if you need 5Gb/s of write throughput and 5Gb/s of read throughput, the following node configurations would be suitable:
- 12 NFS nodes with a single 1GbE connection each
- 6 NFS nodes with dual 1GbE connections each
- 4 NFS nodes with quadruple 1GbE connections each
When you set up NFS on all of the file server nodes, you enable a self-mounted NFS point for each node. A cluster made up of nodes with self-mounted NFS points enable you to run native applications as tasks. You can use round-robin DNS or a hardware load balancer to mount NFS on one or more dedicated gateways outside the cluster to allow controlled access.
A properly licensed and configured MapR cluster provides automatic failover for continuity throughout the stack. Configuring a cluster for HA involves redundant instances of specific services, as well as a correct configuration of the MapR NFS service. HA features are not available with the Community Edition (M3 license).
The following describes redundant services used for HA:
Master/slave--two instances in case one fails
A majority of ZK nodes (a quorum) must be up
Active/standby--if the first JT fails, the backup is started
|ResourceManager||One active and one or more standby instances. If the active one fails, one standby instance takes over.||2|
Active/standby--if the first HBase Master fails, the backup is started. This is only a consideration when deploying Apache HBase on the cluster.
The more redundant NFS services, the better
On a large cluster, you may choose to have extra nodes available in preparation for failover events. In this case, you keep spare, unused nodes ready to replace nodes running control services--such as CLDB, JobTracker, ZooKeeper, or HBase Master--in case of a hardware failure.
Virtual IP Addresses
You can set up virtual IP addresses (VIPs) for NFS nodes in an Enterprise Edition-licensed MapR cluster, for load balancing or failover. VIPs provide multiple addresses that can be leveraged for round-robin DNS, allowing client connections to be distributed among a pool of NFS nodes. VIPs also enable high availability (HA) NFS. In a HA NFS system, when an NFS node fails, data requests are satisfied by other NFS nodes in the pool. Use a minimum of one VIP per NFS node per NIC that clients will use to connect to the NFS server. If you have four nodes with four NICs each, with each NIC connected to an individual IP subnet, use a minimum of 16 VIPs and direct clients to the VIPs in round-robin fashion. The VIPs should be in the same IP subnet as the interfaces to which they will be assigned. See Setting Up VIPs for NFS for details on enabling VIPs for your cluster.
If you plan to use VIPs on your Enterprise Edition cluster's NFS nodes, consider the following tips:
- Set up NFS on at least three nodes if possible.
- All NFS nodes must be accessible over the network from the machines where you want to mount them.
- To serve a large number of clients, set up dedicated NFS nodes and load-balance between them. If the cluster is behind a firewall, you can provide access through the firewall via a load balancer instead of direct access to each NFS node. You can run NFS on all nodes in the cluster, if needed.
- To provide maximum bandwidth to a specific client, install the NFS service directly on the client machine. The NFS gateway on the client manages how data is sent in or read back from the cluster, using all its network interfaces (that are on the same subnet as the cluster nodes) to transfer data via MapR APIs, balancing operations among nodes as needed.
- Use VIPs to provide High Availability (HA) and failover.
When planning the hardware architecture for the cluster, make sure all hardware meets the node requirements listed in Preparing Each Node.
The architecture of the cluster hardware is an important consideration when planning a deployment. Among the considerations are anticipated data storage and network bandwidth needs, including intermediate data generated during MapReduce job execution. The type of workload is important: consider whether the planned cluster usage will be CPU-intensive, I/O-intensive, or memory-intensive. Think about how data will be loaded into and out of the cluster, and how much data is likely to be transmitted over the network.
Planning a cluster often involves tuning key ratios, such as: disk I/O speed to CPU processing power; storage capacity to network speed; or number of nodes to network speed.
Typically, the CPU is less of a bottleneck than network bandwidth and disk I/O. To the extent possible, network and disk transfer rates should be balanced to meet the anticipated data rates using multiple NICs per node. It is not necessary to bond or trunk the NICs together; MapR is able to take advantage of multiple NICs transparently. Each node should provide raw disks and partitions to MapR, with no RAID or logical volume manager, as MapR takes care of formatting and data protection.
The following example architecture provides specifications for a standard compute/storage node for general purposes, and two sample rack configurations made up of the standard nodes. MapR is able to make effective use of more drives per node than standard Hadoop, so each node should present enough face plate area to allow a large number of drives. The standard node specification allows for either 2 or 4 1Gb/s ethernet network interfaces. MapR recommends 10Gb/s network interfaces for high-performance clusters.
Standard 50TB Rack Configuration
- 10 standard compute/storage nodes
(10 x 12 x 2 TB storage; 3x replication, 25% margin)
- 24-port 1 Gb/s rack-top switch with 2 x 10Gb/s uplink
- Add second switch if each node uses 4 network interfaces
Standard 100TB Rack Configuration
- 20 standard nodes
(20 x 12 x 2 TB storage; 3x replication, 25% margin)
- 48-port 1 Gb/s rack-top switch with 4 x 10Gb/s uplink
- Add second switch if each node uses 4 network interfaces
To grow the cluster, just add more nodes and racks, adding additional service instances as needed. MapR rebalances the cluster automatically.
Service Layout in a Cluster
How you assign services to nodes depends on the scale of your cluster and the MapR license level. For a single-node cluster, no decisions are involved. All of the services you are using run on the single node. On medium clusters, the performance demands of the CLDB and ZooKeeper services requires them to be assigned to separate nodes to optimize performance. On large clusters, good cluster performance requires that these services run on separate nodes.
The cluster is flexible and elastic---nodes play different roles over the lifecycle of a cluster. The basic requirements of a node are not different for management or for data nodes.
As the cluster size grows, it becomes advantageous to locate control services (such as ZooKeeper and CLDB) on nodes that do not run compute services (such as TaskTracker). The MapR Community Edition (M3 license) does not include HA capabilities, which restricts how many instances of certain services can run. The number of nodes and the services they run will evolve over the life cycle of the cluster.
The architecture of MapR software allows virtually any service to run on any node, or nodes, to provide a high-availability, high-performance cluster. Below are some guidelines to help plan your cluster's service layout.
In a production MapR cluster, some nodes are typically dedicated to cluster coordination and management, and other nodes are tasked with data storage and processing duties. An edge node provides user access to the cluster, concentrating open user privileges on a single host. In smaller clusters, the work is not so specialized and a single node may perform data processing as well as management.
Nodes Running ZooKeeper and CLDB
High latency on a ZooKeeper node can lead to an increased incidence of ZooKeeper quorum failures. A ZooKeeper quorum failure occurs when the cluster finds too few copies of the ZooKeeper service running. If the ZooKeeper node is also running other services, competition for computing resources can lead to increased latency for that node. If your cluster experiences issues relating to ZooKeeper quorum failures, consider reducing or eliminating the number of other services running on the ZooKeeper node.
Nodes for Data Storage and Processing
Most nodes in a production cluster are data nodes. Data nodes can be added or removed from the cluster as requirements change over time.
FileServer, TaskTracker and NodeManager run on data nodes. You may want to tune TaskTracker for fewer slots on nodes that include both management and data services. For more information, see Resource Allocation for Jobs and Applications.
So-called Edge nodes provide a common user access point for the MapR webserver and other client tools. Edge nodes may or may not be part of the cluster, as long as the edge node can reach cluster nodes. Nodes on the same network can run client services, MySQL for Metrics, and so on.
Node Requirements for Table Replication
If you are using table replication, you need to designate nodes on the destination cluster as gateway nodes. You have to install the
mapr-gateway package on these nodes. See MapR Gateways.
You also need to install the HBase 0.98.9 client on your web server nodes (
mapr-hbase); otherwise, table replication features will not work in the MCS.
Service Layout Guidelines for Large Clusters
The following are guidelines about which services to separate on large clusters:
- JobTracker and ResourceManager on ZooKeeper nodes: Avoid running the JobTracker and ResourceManager service on nodes that are running the ZooKeeper service. On large clusters, the JobTracker and ResourceManager services can consume significant resources.
- MySQL on CLDB nodes: Avoid running the MySQL server that supports the MapR Metrics service on a CLDB node. Consider running the MySQL server on a machine external to the cluster to prevent the MySQL server’s resource needs from affecting services on the cluster.
- TaskTracker on CLDB or ZooKeeper nodes: When the TaskTracker service is running on a node that is also running the CLDB or ZooKeeper services, consider reducing the number of task slots that this node's instance of the TaskTracker service provides.
- Webserver on CLDB nodes: Avoid running the webserver on CLDB nodes. Queries to the MapR Metrics service can impose a bandwidth load that reduces CLDB performance.
- JobTracker: Run the JobTracker services on dedicated nodes for clusters with over 250 nodes.
- ResourceManager: Run the ResourceManager services on dedicated nodes for clusters with over 250 nodes.
Example Cluster Designs
You can design your cluster in one of the following modes:
- MapReduce Classic: All nodes in the cluster run MapReduce v1
- YARN: All nodes in the cluster run YARN (MapReduce v2 and other applications that can run on YARN) .
- Mixed-Mode: Nodes in the cluster can run YARN or MapReduce v1
Small Community Edition Cluster
For a small cluster using the Community Edition (M3 license), assign the CLDB, JobTracker, NFS, and WebServer services to one node each. A hardware failure on any of these nodes would result in a service interruption, but the cluster can be recovered. Assign the ZooKeeper service to the CLDB node and two other nodes. Assign the FileServer and TaskTracker services to every node in the cluster.
Example Service Configuration for a 5-Node Community Edition Cluster- Runs MapReduce Classic (MapReduce v1)
This cluster has several single points of failure, at the nodes with CLDB, JobTracker and NFS.
Small High-Availability Enterprise Edition Cluster
A small Enterprise Edition cluster can ensure high availability (HA) for all services by providing at least two instances of each service, eliminating single points of failure.
Example Service Configuration for a 5-Node Enterprise Edition Cluster- Runs MapReduce Classic (MapReduce v1)
The example below depicts a 5-node, high-availabilty Enterprise Edition (M5 license) cluster with HBase installed. ZooKeeper is installed on three nodes. CLDB, JobTracker, and HBase Master services are installed on two nodes each, spread out as much as possible across the nodes:
This example put CLDB and ZooKeeper services on the same nodes and places JobTracker services on other nodes, but this is somewhat arbitrary. The JobTracker service can coexist on the same node as ZooKeeper or CLDB services.
Example Service Configuration for a 10-Node Enterprise Edition Cluster- Runs in Mixed-Mode (both MapReduce v1 and YARN)
The example below depicts a 10-node, high-availability Enterprise Edition (M5 license) cluster that can run both MapReduce v1 jobs and YARN applications. ZooKeeper, CLDB, JobTracker, and ResourceManager is installed on three nodes:
Large High-Availability Enterprise Edition Cluster
On a large cluster designed for high availability (HA), assign services according to the examples below, which depict 150-node Enterprise Edition (M5 license) clusters.
Example Service Configuration for a 100+ Node Enterprise Edition Cluster - Runs MapReduce Classic (MapReduce v1)
In this example, the majority of nodes are dedicated to the TaskTracker service. ZooKeeper, CLDB, and JobTracker are installed on three nodes each. The NFS server is installed on most machines, providing high network bandwidth to the cluster.
Example Service Configuration for a 100+ Node Enterprise Edition Cluster - Runs in Mixed-Mode (both MapReduce v1 and YARN)
In this example, the majority of nodes are dedicated to the TaskTracker and NodeManager services. ZooKeeper, CLDB, JobTracker, and ResourceManager are installed on three nodes each. The NFS server is installed on most machines, providing high network bandwidth to the cluster.
Plan Initial Volumes
MapR manages the data in a cluster in a set of volumes. Volumes can be mounted in the Linux filesystem in a hierarchical directory structure, but volumes do not contain other volumes. Each volume has its own policies and other settings, so it is important to define a number of volumes in order to segregate and classify your data.
Plan to define volumes for each user, for each project, and so on. For streaming data, you might plan to create a new volume to store new data every day or week or month. The more volume granularity, the easier it is to specify backup or other policies for subsets of the data. For more information on volumes, see Managing Data with Volumes.
Part of the cluster plan is a list of authorized users of the cluster. It is preferable to give each user an account, because account-sharing makes administration more difficult. Any user of the cluster must be established with the same Linux UID and GID on every node in the cluster. Central directory services, such as LDAP, are often used to simplify user maintenance.
It is important to begin installation with a complete Cluster Plan, but plans should not be immutable. Cluster services often change over time, particularly as clusters scale up by adding nodes. Balancing resources to maximize utilization is the goal, and it will require flexibility.
The next step is to prepare each node. Most installation difficulties are traced back to nodes that are not qualified to contribute to the cluster, or which have not been properly prepared. For large clusters, it can save time and trouble to use a configuration management tool such as Puppet or Chef.
Proceed to Preparing Each Node and assess each node.