MapR formats and uses disks for the Lockless Storage Services layer (MapR-FS), and records these disks in the file disktab. In a production environment, or when testing performance, MapR should be configured to use physical hard drives and partitions. In some cases, it is necessary to reinstall the operating system on a node so that the physical hard drives are available for direct use by MapR. Reinstalling the operating system provides an unrestricted opportunity to configure the hard drives. If the installation procedure assigns hard drives to be managed by the Linux Logical Volume Manager (LVM) by default, you should explicitly remove the drives you plan to use with MapR from the LVM configuration. It is common to let LVM manage one physical drive containing the operating system partition(s) and to leave the rest unmanaged by LVM for use with MapR.
The following procedures are intended for use on physical clusters or Amazon EC2 instances. On EC2 instances, EBS volumes can be used as MapR storage, although performance will be slow.
To determine if a disk or partition is ready for use by MapR:
- Run the command sudo lsof <partition> to determine whether any processes are already using the disk or partition.
- There should be no output when running sudo fuser <partition>, indicating there is no process accessing the specific disk or partition.
- The disk or partition should not be mounted, as checked via the output of the mount command. If the disk or partition is mounted, unmount it using the umount command.
- The disk or partition should not have an entry in the /etc/fstab file; comment out or delete any such entries.
- The disk or partition should be accessible to standard Linux tools such as mkfs. You should be able to successfully format the partition using a command like sudo mkfs.ext3 <partition>, as this is similar to the operations MapR performs during installation. If mkfs fails to access and format the partition, then it is highly likely MapR will encounter the same problem.
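The non-destructive checks in the list above can be sketched as a small set of shell functions. This is only an illustration: the function names and the optional file arguments are hypothetical and not part of MapR, and the mkfs test is left out on purpose because it erases the partition.

```shell
#!/usr/bin/env bash
# Read-only readiness checks for a candidate disk or partition.
# All function names here are illustrative, not MapR tooling.

# Succeeds if some line in the given file has the device as its first field.
first_field_matches() {
    awk -v d="$1" '$1 == d { found = 1 } END { exit !found }' "$2"
}

# Succeeds if the device is listed as a mount source (default: /proc/mounts).
is_mounted() {
    first_field_matches "$1" "${2:-/proc/mounts}"
}

# Succeeds if an uncommented fstab line names the device.
# Entries written as UUID=... or LABEL=... are not detected by this check.
in_fstab() {
    first_field_matches "$1" "${2:-/etc/fstab}"
}

# Succeeds if lsof or fuser reports a process using the device.
in_use() {
    sudo lsof "$1" 2>/dev/null | grep -q . && return 0
    sudo fuser "$1" 2>/dev/null | grep -q .
}

# Prints one line per failed check; returns non-zero if any check failed.
check_ready() {
    local dev="$1" rc=0
    if in_use "$dev";     then echo "$dev: in use by a process";  rc=1; fi
    if is_mounted "$dev"; then echo "$dev: currently mounted";    rc=1; fi
    if in_fstab "$dev";   then echo "$dev: listed in /etc/fstab"; rc=1; fi
    return "$rc"
}
```

For example, check_ready /dev/sdb1 (a placeholder device name) prints nothing and returns 0 when none of the three conditions applies.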
Any disk or partition that passes the above testing procedure can be added to the list of disks and partitions passed to the disksetup script.
To specify disks or partitions for use by MapR:
The disksetup script is used to format disks for use by the MapR cluster. Create a text file /tmp/disks.txt listing the disks and partitions for use by MapR on the node. Each line lists either a single disk or all applicable partitions on a single disk. When listing multiple partitions on a line, separate them with spaces. For example:
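As an illustration of the disk list format, the following sketch writes a three-line /tmp/disks.txt. The device names are placeholders chosen for the example, not values from any real node: two whole disks, and one disk contributing three partitions on a single line.

```shell
# Device names below are placeholders; substitute the disks on your node.
cat > /tmp/disks.txt <<'EOF'
/dev/sdb
/dev/sdc1 /dev/sdc2 /dev/sdc4
/dev/sdd
EOF
```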
Later, when you run disksetup to format the disks, specify the disks.txt file. For example:
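A hedged sketch of that invocation follows. The install path /opt/mapr/server/disksetup and the -F option for passing the disk list are assumptions about a typical MapR installation and should be confirmed on your node. The wrapper function run_disksetup is hypothetical and exists only to refuse to run when the script or the list is missing.

```shell
# Runs disksetup against a disk list file.
# ASSUMPTIONS: the default script path and the -F flag reflect a typical
# MapR install; verify both before use. Formatting destroys existing data.
run_disksetup() {
    local script="${1:-/opt/mapr/server/disksetup}" list="${2:-/tmp/disks.txt}"
    if [ ! -x "$script" ]; then
        echo "disksetup not found at $script" >&2
        return 1
    fi
    if [ ! -s "$list" ]; then
        echo "disk list $list is missing or empty" >&2
        return 1
    fi
    sudo "$script" -F "$list"
}
```

Calling run_disksetup with no arguments uses the assumed defaults. Both arguments can be overridden if your installation differs.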
If you are re-using a node that was used previously in another cluster, be sure to format the disks to remove any traces of data from the old cluster.
To evaluate MapR using a flat storage file instead of formatting disks:
When setting up a small cluster for evaluation purposes, if a particular node does not have physical disks or partitions available to dedicate to the cluster, you can use a flat file on an existing disk partition as the node's storage. Create a file of at least 16 GB, and include the path to the file in the disk list file for the disksetup script.
The following example creates a 20 GB flat file (bs=1G specifies 1 gigabyte blocks, multiplied by count=20):
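One way to create such a file is with dd reading from /dev/zero. The sketch below wraps the command in a hypothetical helper, make_flat_file, whose defaults give 20 blocks of 1G. The path /root/storagefile in the usage comment is a placeholder, not a location MapR requires.

```shell
# Creates a zero-filled flat file of COUNT blocks, each BS bytes in size.
# Defaults: 20 blocks of 1G, that is, a 20 GB file.
make_flat_file() {
    local path="$1" count="${2:-20}" bs="${3:-1G}"
    dd if=/dev/zero of="$path" bs="$bs" count="$count"
}

# Usage (placeholder path; needs 20 GB free on that partition):
#   make_flat_file /root/storagefile
```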
Then, you would add the following to the disk list file /tmp/disks.txt to be used by disksetup:
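A minimal sketch of that step, assuming the flat file was created at the placeholder path /root/storagefile: the path goes on its own line in the disk list, and the grep guard avoids adding it twice.

```shell
# /root/storagefile is a hypothetical flat-file path; use the file you created.
FLAT_FILE=/root/storagefile
DISK_LIST=/tmp/disks.txt

# Append the path only if the list does not already contain it.
grep -qxF "$FLAT_FILE" "$DISK_LIST" 2>/dev/null || echo "$FLAT_FILE" >> "$DISK_LIST"
```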