The MapR NFS service lets you access data on a licensed MapR cluster via the NFS protocol:
- Community Edition license: one NFS node allows you to access your cluster as a standard POSIX-compliant filesystem.
- Enterprise Edition or Enterprise Database Edition license: multiple NFS servers allow each node to mount its own MapR-FS via NFS, with VIPs available for high availability (HA) and load balancing.
You can mount the MapR cluster via NFS and use standard shell scripting to read and write live data in the cluster. NFS access to cluster data can be faster than accessing the same data with the
hadoop commands. To mount the cluster via NFS from a client machine, see Accessing Data with NFS.
Before You Start: NFS Setup Requirements
Make sure the following conditions are met before using the MapR NFS gateway:
- The stock Linux NFS service must not be running. Linux NFS and MapR NFS cannot run concurrently.
- The lock manager (nlockmgr) must be disabled.
- On Red Hat and CentOS v6.0 and higher, the rpcbind service must be running. You can use the command ps ax | grep rpcbind to check.
- On Red Hat and CentOS v5.x and lower, and on Ubuntu and SUSE, the portmapper service must be running. You can use the command ps ax | grep portmap to check.
- The mapr-nfs package must be present and installed. You can list the contents of the /opt/mapr/roles directory and check for nfs in the list.
- Make sure you have applied a Community Edition (M3) license or an Enterprise Edition (M5) license (paid or trial) to the cluster. See Adding a License.
- Make sure the MapR NFS service is started (see Managing Roles and Services).
- Verify that the primary group of the user listed for
mapr.daemon.group. Restart Warden after any changes to
For information about mounting the cluster via NFS, see Accessing Data with NFS.
For information on upgrading your cluster, see Upgrade Guide.
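The process and package checks from the requirements list above can be scripted. The following is a minimal sketch, not an official MapR tool: the helper names service_running and has_nfs_role are hypothetical, and the script only wraps the ps ax | grep and /opt/mapr/roles checks described above.

```shell
#!/bin/sh
# Hypothetical helpers that wrap the manual checks from the requirements list.

# Succeeds if a running process mentions the given name.
# "grep -v grep" drops the grep processes of this pipeline from the listing.
service_running() {
    ps ax | grep -v grep | grep -q "$1"
}

# Succeeds if an entry named "nfs" exists in the roles directory
# (defaults to /opt/mapr/roles, the path named in the requirements).
has_nfs_role() {
    ls "${1:-/opt/mapr/roles}" 2>/dev/null | grep -qx nfs
}

if service_running rpcbind || service_running portmap; then
    echo "rpcbind/portmap: running"
else
    echo "rpcbind/portmap: not found"
fi

if has_nfs_role; then
    echo "nfs role: present"
else
    echo "nfs role: missing"
fi
```

The script reports status only and changes nothing on the node. It does not cover the license, lock manager, or Warden checks.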
Handling Heavy Write Loads on RHEL5
If you are operating on RHEL5 and have a heavy NFS write load, you might experience resource contention between the NFS client and the NFS server. This resource contention can cause the NFS server to be unresponsive. To avoid this potential problem, try one of these approaches:
- Edit /etc/sysctl.conf and apply these settings on each NFS server:
Reboot the server so the changes will take effect. To make the settings take effect immediately, issue the echo command as shown:
- Separate the NFS client from the NFS server so they do not compete for memory on the same system.
NFS on a Community Edition Cluster
At installation time, choose one node on which to run the NFS gateway. NFS is lightweight and can be run on a node running services such as CLDB or ZooKeeper. To add the NFS service to a running cluster, use the instructions in Adding Roles to a Node to install the
mapr-nfs package on the node where you would like to run NFS.
NFS on an Enterprise Edition Cluster
At cluster installation time, plan which nodes should provide NFS access according to your anticipated traffic. For instance, if you need 5 Gbps of write throughput and 5 Gbps of read throughput, here are a few ways to set up NFS:
- 12 NFS nodes, each of which has a single 1 GbE connection
- 6 NFS nodes, each of which has a dual 1 GbE connection
- 4 NFS nodes, each of which has a quad 1 GbE connection
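To compare the layouts above, it can help to total the nominal link capacity of each one. This rough sketch makes two assumptions: each link delivers its nominal 1 Gbps, and read and write traffic add up to 10 Gbps combined. The function name aggregate_gbps is hypothetical. Real throughput will be lower than the nominal figures.

```shell
#!/bin/sh
# aggregate_gbps NODES LINKS_PER_NODE
# Nominal aggregate capacity in Gbps, assuming 1 Gbps per link.
aggregate_gbps() {
    echo $(( $1 * $2 ))
}

# Assumption: 5 Gbps of writes plus 5 Gbps of reads, counted together.
required=$(( 5 + 5 ))

for layout in "12 1" "6 2" "4 4"; do
    set -- $layout   # $1 = node count, $2 = 1 GbE links per node
    echo "$1 nodes x $2 links: $(aggregate_gbps "$1" "$2") Gbps nominal, $required Gbps required"
done
```

All three layouts have more nominal capacity than the combined requirement, which leaves headroom for protocol overhead and uneven client distribution.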
You can also set up NFS on all file server nodes to enable a self-mounted NFS point for each node. Self-mounted NFS for each node in a cluster enables you to run native applications as tasks. You can mount NFS on one or more dedicated gateways outside the cluster (using round-robin DNS or behind a hardware load balancer) to allow controlled access.
NFS and Virtual IP addresses
You can set up virtual IP addresses (VIPs) for NFS nodes in an Enterprise Edition-licensed MapR cluster, for load balancing or failover. VIPs provide multiple addresses that can be used for round-robin DNS, allowing client connections to be distributed among a pool of NFS nodes. VIPs also enable high availability (HA) NFS. In an HA NFS system, when an NFS node fails, data requests are satisfied by other NFS nodes in the pool.
Use a minimum of one VIP per NFS node per NIC that clients will use to connect to the NFS server. If you have four nodes with four NICs each, with each NIC connected to an individual IP subnet, use a minimum of 16 VIPs and direct clients to the VIPs in round-robin fashion. The VIPs should be in the same IP subnet as the interfaces to which they will be assigned. See Setting Up VIPs for NFS for details on enabling VIPs for your cluster.
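The VIP sizing rule is one VIP per NFS node per client-facing NIC, which is a simple product. As a small sketch (min_vips is a hypothetical helper name, not a MapR command):

```shell
#!/bin/sh
# min_vips NFS_NODES CLIENT_FACING_NICS_PER_NODE
# Minimum VIP count under the "one VIP per node per NIC" rule.
min_vips() {
    echo $(( $1 * $2 ))
}

# The example from the text: four nodes with four NICs each.
echo "Minimum VIPs: $(min_vips 4 4)"
```

Count only the NICs that clients actually use to reach the NFS server. Interfaces reserved for other traffic do not need VIPs.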
Here are a few tips:
- Set up NFS on at least three nodes if possible.
- All NFS nodes must be accessible over the network from the machines where you want to mount them.
- To serve a large number of clients, set up dedicated NFS nodes and load-balance between them. If the cluster is behind a firewall, you can provide access through the firewall via a load balancer instead of direct access to each NFS node. You can run NFS on all nodes in the cluster, if needed.
- To provide maximum bandwidth to a specific client, install the NFS service directly on the client machine. The NFS gateway on the client manages how data is sent in or read back from the cluster, using all its network interfaces (that are on the same subnet as the cluster nodes) to transfer data via MapR APIs, balancing operations among nodes as needed.
- Use VIPs to provide High Availability (HA) and failover.
To add the NFS service to a running cluster, use the instructions in Adding Roles to a Node to install the
mapr-nfs package on the nodes where you would like to run NFS.
NFS Memory Settings
The memory allocated to each MapR service is specified in the
/opt/mapr/conf/warden.conf file, which MapR automatically configures based on the physical memory available on the node. You can adjust the minimum and maximum memory used for NFS, as well as the percentage of the heap that it tries to use, by setting the
min parameters in the
warden.conf file on each NFS node. Example:
The percentages need not add up to 100; in fact, you can use less than the full heap by setting the
heapsize.percent parameters for all services to add up to less than 100% of the heap size. In general, you should not need to adjust the memory settings for individual services, unless you see specific memory-related problems occurring.
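Before changing anything, it can be useful to see the heap-related settings currently in effect for NFS. This sketch assumes that the relevant lines in warden.conf contain both "nfs" and "heapsize". That pattern is inferred from the parameter naming above, so confirm it against your own file. The function name show_nfs_heap_settings is hypothetical.

```shell
#!/bin/sh
# Prints warden.conf lines that mention both "nfs" and "heapsize".
# Assumption: NFS heap parameters follow that naming pattern.
show_nfs_heap_settings() {
    conf="${1:-/opt/mapr/conf/warden.conf}"
    grep 'nfs' "$conf" | grep 'heapsize'
}

# Only read the default file if it exists on this machine.
if [ -r /opt/mapr/conf/warden.conf ]; then
    show_nfs_heap_settings
fi
```

The function only reads the file. Edits to warden.conf should still be made by hand on each NFS node.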
Running NFS on a Non-standard Port
To run NFS on an arbitrary port, modify the following line in
-p <portnumber> to the end of the line, as in the following example:
warden.conf, restart the MapR NFS server by issuing the following command:
You can verify the port change with the
rpcinfo -p localhost command.
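The rpcinfo -p listing covers every registered RPC program, so filtering for the NFS program number (100003) makes the port easier to spot. This is a minimal sketch. The helper name nfs_ports is hypothetical, and it assumes the usual five-column rpcinfo -p layout (program, version, protocol, port, service).

```shell
#!/bin/sh
# Reads "rpcinfo -p" output on stdin and prints the protocol and port
# of each entry registered under RPC program 100003 (NFS).
nfs_ports() {
    awk '$1 == "100003" { print $3, $4 }'
}

# Run against the local portmapper only if rpcinfo is installed.
if command -v rpcinfo >/dev/null 2>&1; then
    rpcinfo -p localhost | nfs_ports
fi
```

After a port change, the printed port should match the value passed with -p.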
Enabling Debug Logging for NFS
Debug-level logging is available to help you isolate and identify NFS-related issues. To enable logging at the debug level, enter this command at the command line:
-port 9998 indicates NFS.
In default mode, information is logged to a buffer and dumped periodically. To display information immediately instead, enable
continuous mode by entering:
Sample log output from an
ls command is shown here:
The log shows every operation sent to and received from an NFS client.
To return to the default log level and log mode, enter: