This is documentation for MapR Version 5.0. You can also refer to MapR documentation for the latest release.

MapR provides volumes as a way to organize data and manage cluster performance. A volume is a logical unit that allows you to apply policies to a set of files, directories, and sub-volumes. A well-structured volume hierarchy is an essential aspect of your cluster's performance. As your cluster grows, keeping your volume hierarchy efficient maximizes your data's availability. Without a volume structure in place, your cluster's performance will be negatively affected.

When Should I Use Volumes?

You can use volumes to enforce disk usage limits, set replication levels, establish ownership and accountability, and measure the cost generated by different projects or departments. Create a volume for each user, department, or project. You can mount volumes under other volumes to build a structure that reflects the needs of your organization. To create a sub-volume, mount a volume in a sub-directory of an already mounted volume. This establishes a parent-child relationship in which the parent volume is mounted in the top-level directory and the child volume is mounted in the sub-directory. The volume structure defines how data is distributed across the nodes in your cluster. Create multiple small volumes with shallow paths at the top of your cluster's volume hierarchy to spread the load of access requests across the nodes.

On a cluster with an Enterprise Edition or Enterprise Database Edition license, you can create a special type of volume called a mirror, a local or remote read-only copy of an entire volume. Mirrors are useful for load balancing or disaster recovery. You can also create a snapshot, an image of a volume at a specific point in time. Snapshots are useful for rollback to a known data set. You can create snapshots and synchronize mirrors manually or using a schedule.

Creating a Volume

When creating a volume, the only required parameter is the volume name. You can set the ownership, permissions, quotas, and other parameters at the time of volume creation, or use the Volume Properties dialog to set them later. If you plan to schedule snapshots or mirrors, it is useful to create a schedule ahead of time; the schedule will appear in a drop-down menu in the Volume Properties dialog.

By default, the root user and the volume creator have full control permissions on the volume. You can grant specific permissions to other users and groups:

Code       Allowed Action
dump       Dump the volume
restore    Mirror or restore the volume
m          Modify volume properties, create and delete snapshots
d          Delete a volume
fc         Full control (admin access and permission to change volume ACL)
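
The permission codes listed above can be checked before they are passed to a command or script. The following is a minimal sketch; the function name describe_permissions is hypothetical, and only the codes and descriptions come from the table.

```python
# Permission codes and descriptions copied from the table above.
VOLUME_PERMISSION_CODES = {
    "dump": "Dump the volume",
    "restore": "Mirror or restore the volume",
    "m": "Modify volume properties, create and delete snapshots",
    "d": "Delete a volume",
    "fc": "Full control (admin access and permission to change volume ACL)",
}


def describe_permissions(codes):
    """Return the allowed action for each code; reject codes not in the table.

    Hypothetical helper for illustration only, not part of any MapR tool.
    """
    unknown = [code for code in codes if code not in VOLUME_PERMISSION_CODES]
    if unknown:
        raise ValueError(f"unknown volume permission code(s): {unknown}")
    return [VOLUME_PERMISSION_CODES[code] for code in codes]
```

Rejecting unknown codes early avoids sending a mistyped permission to the cluster.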

You can create a volume using the volume create command, or use the following procedure to create a volume using the MapR Control System.

To create a volume using the MapR Control System:

  1. In the Navigation pane, expand the MapR-FS group and click the Volumes view.
  2. Click the New Volume button to display the New Volume dialog.
  3. Type a name for the volume or source volume in the Volume Name field.
  4. You can set a mount path for the volume by typing a path in the Mount Path field.
  5. You can specify whether the volume is read only in MapR-FS. Note that this setting can be changed after the volume is created.
  6. Specify the user or group accountable for the volume usage in the Accountable Entity field by selecting User or Group from the dropdown menu and typing the user or group name in the text field.
  7. You can set permissions using the fields in the Permissions section:
    1. To set the root directory permissions, specify the UNIX octal permissions.
    2. To set the volume permissions per user, specify the permission for the root user by selecting the permissions in the right field.
    3. To add additional permissions per user, click [ + Add Permission ] to display fields for a new permission.
      1. In the left field, type either u: and a user name, or g: and a group name.
      2. In the right field, select permissions to grant to the user or group.
  8. You can associate a standard volume with an accountable entity and set quotas in the Usage Tracking section:
    1. To set an advisory quota, select the checkbox beside Volume Advisory Quota and type a quota (in megabytes) in the text field.
    2. To set a hard quota, select the checkbox beside Volume Hard Quota and type a quota (in megabytes) in the text field.
  9. You can set replication factors and replication optimization in the Replication section:
    1. Specify the desired replication factors in the Replication and Min Replication fields.
    2. Specify the desired namespace container replication factors in the NS Replication and NS Min Replication fields.
    3. Specify how you want to optimize your replication in the Optimize Replication For field. Options are for either high throughput or low latency.
  10. You can enable auditing and set coalescence in the Auditing section. Auditing is enabled or disabled for all actions on the volume. The Coalescence interval is specified in minutes (default: 60 minutes).
  11. Click OK to create the volume.
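
As a command-line counterpart to the procedure above, the following sketch builds an argument list for maprcli volume create without running it. The helper volume_create_args and the example volume name and path are hypothetical. The flags shown (-name, -path, -replication, -minreplication, -advisoryquota, -quota) and the M unit suffix for quotas are assumptions based on the maprcli reference; verify them against your release before use.

```python
import shlex


def volume_create_args(name, path=None, replication=None, minreplication=None,
                       advisoryquota_mb=None, quota_mb=None):
    """Build (but do not run) an argument list for `maprcli volume create`.

    Hypothetical helper. Flag names are assumed from the maprcli reference.
    """
    if not name:
        raise ValueError("the volume name is the only required parameter")
    args = ["maprcli", "volume", "create", "-name", name]
    optional = [
        ("-path", path),
        ("-replication", replication),
        ("-minreplication", minreplication),
        # Quotas are given in megabytes here; the "M" suffix is an assumption.
        ("-advisoryquota", None if advisoryquota_mb is None else f"{advisoryquota_mb}M"),
        ("-quota", None if quota_mb is None else f"{quota_mb}M"),
    ]
    for flag, value in optional:
        if value is not None:
            args += [flag, str(value)]
    return args


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(volume_create_args(
        "project-data", path="/projects/data",
        replication=3, minreplication=2, quota_mb=2048)))
```

Printing the command rather than executing it lets you review the parameters first; to run it, you could pass the list to subprocess.run on a node where maprcli is installed.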

Specifying Volume Inheritance

When creating and mounting a volume, the location of the mount path is specified by the path parameter. Volumes can be mounted via the web console, the maprcli commands, or the REST commands. With maprcli, use the volume create -path command, or the maprcli volume mount -path command if the volume was previously created.

Sub-volumes (children) can inherit properties from their parent volume. The maprcli volume create and volume modify commands provide parameters (allowgrant, allowinherit, and inherit) for setting the inheritance feature.

Note: Volume inheritance is unavailable through the web console.

Determining Replication Factors

Volumes are stored as pieces called containers that contain files, directories, and other data. By default, the maximum container size is 32 GB. The maximum container size is set by the cldb.container.sizemb parameter (see the config commands). Containers are replicated to protect data. Normally, each container has three copies stored on separate nodes to provide uninterrupted access to all data, even if a node fails.
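
To get a feel for the numbers, the following sketch estimates a lower bound on the number of containers needed for a given amount of data, and the raw storage consumed once each container is replicated. The function names are hypothetical, and the real container count depends on how the cluster allocates containers, so treat the result as a rough minimum only.

```python
import math

# Default maximum container size from the text above: 32 GB, expressed in MB
# to match the units of the cldb.container.sizemb parameter.
DEFAULT_CONTAINER_SIZE_MB = 32 * 1024


def min_containers(data_mb, container_size_mb=DEFAULT_CONTAINER_SIZE_MB):
    """Lower bound on the number of containers needed to hold data_mb of data."""
    if data_mb < 0 or container_size_mb <= 0:
        raise ValueError("sizes must be non-negative and the container size positive")
    return math.ceil(data_mb / container_size_mb)


def raw_storage_mb(data_mb, replication=3):
    """Raw storage used when every container has `replication` copies."""
    return data_mb * replication
```

For example, 100 GB of data (102400 MB) needs at least 4 containers at the default size, and occupies 307200 MB of raw storage with three copies.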

For each volume, you can specify a desired and minimum replication factor, a desired and minimum namespace replication factor, and a replication optimization setting.

Note: When enabled, the CLDB manages the namespace container replication separately from the data container replication. This capability is used when you have low volume replication but want to have higher namespace replication. The namespace container parameters, nsreplication and nsminreplication, must be the same as or larger than the equivalent data replication parameters, replication and minreplication.

  • The replication factor is the number of replicated copies you want for normal operation and data protection. When the number of copies falls below the desired replication factor, but remains equal to or above the minimum replication factor, the CLDB actively creates additional copies of the container while trying to minimize the impact of making an additional copy of the container. Re-replication occurs after the timeout specified in the cldb.fs.mark.rereplicate.sec parameter (configurable using the config API). Valid values range from 1 to 6 (default: 3).
  • The minimum replication factor is the smallest number of copies you want in order to adequately protect against data loss. When the number of copies falls below this minimum, re-replication occurs as aggressively as possible to restore the replication level. Valid values range from 1 to 6 (default: 2). In all cases, the minimum replication factor cannot be greater than the replication factor.
  • The namespace replication factor is the number of namespace container replicated copies you want for normal operation and data protection. When the number of copies falls below the desired namespace replication factor, but remains equal to or above the minimum namespace replication factor, the CLDB actively creates additional copies of the container while trying to minimize the impact of making an additional copy of the container. Re-replication occurs after the timeout specified in the cldb.fs.mark.rereplicate.sec parameter. Valid values range from 1 to 6 (default: 3).
  • The minimum namespace replication factor is the smallest number of namespace container replicated copies you want in order to adequately protect against data loss. When the number of copies falls below this minimum, re-replication occurs as aggressively as possible to restore the replication level. The system will not wait for lost replicas to become available again. Valid values range from 1 to 6 (default: 2). In all cases, the minimum namespace replication factor cannot be greater than the namespace replication factor.

If any containers in the CLDB volume fall below the minimum replication factor, writes are disabled until aggressive re-replication restores the minimum level of replication. If a disk failure is detected, any data stored on the failed disk is re-replicated without regard to the timeout specified in the cldb.fs.mark.rereplicate.sec parameter.
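
The re-replication behavior described above can be summarized as a small decision function. This is only a sketch of the rules as stated in this section, not of the CLDB implementation; the function name and the returned labels are hypothetical.

```python
def rereplication_state(copies, replication=3, minreplication=2, disk_failure=False):
    """Classify how re-replication proceeds for a container with `copies` replicas.

    Returns one of these illustrative labels:
      "none"          - the desired replication factor is met
      "aggressive"    - below the minimum; restored as aggressively as possible
      "immediate"     - a disk failure was detected; the timeout is not applied
      "after-timeout" - below desired but at or above the minimum; copies are
                        created after the cldb.fs.mark.rereplicate.sec timeout
    """
    if copies >= replication:
        return "none"
    if copies < minreplication:
        return "aggressive"
    if disk_failure:
        return "immediate"
    return "after-timeout"
```

With the default factors (3 and 2), a container with two copies is re-replicated after the timeout, while a container with one copy is re-replicated aggressively.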

If namespace (NS) replication and minimum namespace replication are not set explicitly, they assume the same values as replication and minimum replication respectively. This means that all changes to the replication and minreplication parameters are also reflected in nsreplication and nsminreplication. Once nsreplication or nsminreplication is specified during creation or modified later, the namespace values no longer follow replication and minreplication.
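
The constraints in this section can be checked before you create or modify a volume. The following sketch applies the defaulting rule for the namespace values and then validates the ranges and ordering described above. The function name resolve_replication is hypothetical, and it treats nsreplication and nsminreplication as independently overridable, which is an assumption.

```python
def resolve_replication(replication=3, minreplication=2,
                        nsreplication=None, nsminreplication=None):
    """Apply namespace defaults, then validate the replication settings."""
    # Namespace values follow the data values unless they are set explicitly.
    if nsreplication is None:
        nsreplication = replication
    if nsminreplication is None:
        nsminreplication = minreplication

    values = {
        "replication": replication,
        "minreplication": minreplication,
        "nsreplication": nsreplication,
        "nsminreplication": nsminreplication,
    }
    for name, value in values.items():
        if not 1 <= value <= 6:
            raise ValueError(f"{name} must be between 1 and 6, got {value}")
    if minreplication > replication:
        raise ValueError("minreplication cannot be greater than replication")
    if nsminreplication > nsreplication:
        raise ValueError("nsminreplication cannot be greater than nsreplication")
    # Namespace values must be the same as or larger than the data values.
    if nsreplication < replication or nsminreplication < minreplication:
        raise ValueError("namespace replication values cannot be smaller than "
                         "the equivalent data replication values")
    return values
```

For example, resolve_replication(replication=2, minreplication=1, nsreplication=3) keeps nsminreplication at 1, while resolve_replication(replication=3, nsreplication=2) is rejected.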

Viewing a List of Volumes

You can view all volumes using the volume list command, or view them in the MapR Control System using the following procedure.

To view all volumes using the MapR Control System:

In the Navigation pane, expand the MapR-FS group and click the Volumes view.

If you have a fresh MapReduce Classic (MapReduce1) installation and configuration, the following volumes appear:

Local volumes

mapr.<nodename1>.local.audit
mapr.<nodename1>.local.logs
mapr.<nodename1>.local.mapred
mapr.<nodename1>.local.metrics
.
.
.
mapr.<nodenamen>.local.audit
mapr.<nodenamen>.local.logs
mapr.<nodenamen>.local.mapred
mapr.<nodenamen>.local.metrics

Here, <nodename1> through <nodenamen> stand for the names of the nodes in the cluster; each node has its own set of local volumes.

Mounted system volumes

mapr.cluster.root
mapr.configuration
mapr.hbase
mapr.jobtracker.volume
mapr.metrics
mapr.opt
mapr.tmp
mapr.var
users

Unmounted system volumes

mapr.cldb.internal

If you have a fresh YARN (MapReduce2) installation and configuration, the list includes all of these volumes, except the mapr.jobtracker.volume, and also includes the following system volume.

YARN system volume

mapr.resourcemanager.volume

If you have a fresh mixed-mode installation and configuration, the list includes all of these volumes.

The mapr.jobtracker.volume and mapr.resourcemanager.volume require special handling. Proper planning and sizing of these volumes is critical to cluster health.
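
When you process a volume listing in a script, it can help to separate the default volumes named above from the volumes you created yourself. The following sketch classifies a volume name using only the names and patterns listed in this section; the function name and the category labels are hypothetical.

```python
import re

# Local volumes follow the pattern mapr.<nodename>.local.<kind>.
LOCAL_VOLUME = re.compile(r"mapr\.(?P<node>.+)\.local\.(?:audit|logs|mapred|metrics)")

# System volumes named in this section (mounted, unmounted, and YARN).
SYSTEM_VOLUMES = {
    "mapr.cluster.root", "mapr.configuration", "mapr.hbase",
    "mapr.jobtracker.volume", "mapr.metrics", "mapr.opt", "mapr.tmp",
    "mapr.var", "users", "mapr.cldb.internal", "mapr.resourcemanager.volume",
}


def classify_volume(name):
    """Return (category, node); node is set only for local volumes."""
    match = LOCAL_VOLUME.fullmatch(name)
    if match:
        return ("local", match.group("node"))
    if name in SYSTEM_VOLUMES:
        return ("system", None)
    return ("other", None)
```

For example, classify_volume("mapr.node1.example.com.local.logs") yields ("local", "node1.example.com"), and a user-created volume such as "project-data" falls into the "other" category.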

Viewing Volume Properties

You can view volume properties using the volume info command, or use the following procedure to view them using the MapR Control System.

To view the properties of a volume using the MapR Control System:

  1. In the Navigation pane, expand the MapR-FS group and click the Volumes view.
  2. Display the Volume Properties dialog by clicking the volume name, or by selecting the checkbox beside the volume name, then clicking the Properties button.
  3. After examining the volume properties, click Close to exit without saving changes to the volume.

Modifying a Volume

 

You can modify any attributes of an existing volume, except that normal (read-write) volumes cannot be converted to mirror (read-only) volumes.

You can modify a volume using the volume modify command, or use the following procedure to modify a volume using the MapR Control System.

To modify a volume using the MapR Control System:

  1. In the Navigation pane, expand the MapR-FS group and click the Volumes view.
  2. Display the Volume Properties dialog by clicking the volume name, or by selecting the checkbox beside the volume name, then clicking the Properties button.
  3. Make changes to the fields. See Creating a Volume for more information about the fields.
  4. After making your changes, click Modify Volume to save changes to the volume.

Mounting a Volume

You can mount a volume using the volume mount command, or use the following procedure to mount a volume using the MapR Control System.

To mount a volume using the MapR Control System:

  1. In the Navigation pane, expand the MapR-FS group and click the Volumes view.
  2. Select the checkbox beside the name of each volume you wish to mount.
  3. Click the Mount button.

You can also mount or unmount a volume using the Mounted checkbox in the Volume Properties dialog. See Modifying a Volume for more information.

Unmounting a Volume

You can unmount a volume using the volume unmount command, or use the following procedure to unmount a volume using the MapR Control System.

To unmount a volume using the MapR Control System:

  1. In the Navigation pane, expand the MapR-FS group and click the Volumes view.
  2. Select the checkbox beside the name of each volume you wish to unmount.
  3. Click the Unmount button.

You can also mount or unmount a volume using the Mounted checkbox in the Volume Properties dialog. See Modifying a Volume for more information.
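
For scripted mounting and unmounting, the following sketch builds argument lists for the volume mount and volume unmount commands, one command per volume, without running them. The helper names, volume names, and paths are hypothetical; the -name and -path flags are assumed from the maprcli reference and should be verified against your release.

```python
def mount_args(name, path=None):
    """Build (but do not run) an argument list for `maprcli volume mount`."""
    args = ["maprcli", "volume", "mount", "-name", name]
    if path is not None:
        args += ["-path", path]
    return args


def unmount_args(name):
    """Build (but do not run) an argument list for `maprcli volume unmount`."""
    return ["maprcli", "volume", "unmount", "-name", name]


def mount_all(mounts):
    """Return one mount command per (volume name, mount path) pair."""
    return [mount_args(name, path) for name, path in mounts.items()]
```

Generating one command per volume mirrors selecting several checkboxes in the Volumes view while keeping each failure easy to attribute to a single volume.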

Removing a Volume or Mirror

You can remove a volume using the volume remove command, or use the following procedure to remove a volume using the MapR Control System.

To remove a volume or mirror using the MapR Control System:

  1. In the Navigation pane, expand the MapR-FS group and click the Volumes view.
  2. Click the checkbox next to the volume you wish to remove.
  3. Click the Remove button to display the Remove Volume dialog.
  4. In the Remove Volume dialog, click the Remove Volume button.