Your node topology describes the locations of nodes and racks in a cluster. The MapR software uses node topology to determine the location of replicated copies of data. Optimally defined cluster topology results in data being replicated to separate racks, providing continued data availability in the event of rack or node failure.
Define your cluster's topology by specifying a topology for each node in the cluster. You can use topology to group nodes by rack or switch, depending on how the physical cluster is arranged and how you want MapR to place replicated data.
Topology paths can be as simple or complex as needed to correspond to your cluster layout. In a simple cluster, each topology path might consist of the rack only (for example, /rack-1). In a deployment consisting of multiple large datacenters, each topology path can be much longer (for example, /europe/uk/london/datacenter2/room4/row22/rack5/). MapR uses topology paths to spread out replicated copies of data, placing each copy on a separate path. By setting each path to correspond to a physical rack, you can ensure that replicated data is distributed across racks to improve fault tolerance.
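The idea of placing each copy on a separate topology path can be illustrated with a small sketch. This is a simplified model for illustration only, not MapR's actual placement logic; the function name `place_replicas`, the node names, and the replication factor of 3 are assumptions made for the example.

```python
from collections import defaultdict


def place_replicas(node_topology, replication=3):
    """Pick up to `replication` nodes, preferring one node per topology path.

    `node_topology` maps a node name to its topology path. This is a toy
    model of rack-aware placement, not the real MapR algorithm.
    """
    by_path = defaultdict(list)
    for node, path in sorted(node_topology.items()):
        # Treat "/rack-1" and "/rack-1/" as the same path.
        by_path[path.rstrip("/") or "/"].append(node)

    chosen = []
    # First pass: take one node from each distinct path (rack).
    for path in sorted(by_path):
        if len(chosen) == replication:
            break
        chosen.append(by_path[path][0])

    # Second pass: with fewer paths than replicas, reuse paths as a fallback.
    if len(chosen) < replication:
        leftovers = [n for path in sorted(by_path) for n in by_path[path][1:]]
        chosen.extend(leftovers[: replication - len(chosen)])
    return chosen


nodes = {
    "node-a": "/rack-1",
    "node-b": "/rack-1",
    "node-c": "/rack-2",
    "node-d": "/rack-3",
}
replicas = place_replicas(nodes)
print(replicas)
```

With three distinct racks available, the sketch selects one node per rack, so losing any single rack leaves two copies reachable. If all nodes shared one path, the fallback pass would place every copy on the same rack, which is the situation a well-defined topology is meant to avoid.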
After you have defined node topology for the nodes in your cluster, you can use volume topology to place volumes on specific racks, nodes, or groups of nodes. See Setting Volume Topology for more information.
Recommended Node Topology
The node topology described in this section enables you to gracefully migrate data off a node in order to decommission the node for replacement or maintenance while avoiding data under-replication.
Establish a /data topology path to serve as the default topology path for the volumes in the cluster. Establish a /decommissioned topology path that is not assigned to any volumes.
When you need to migrate a data volume off a particular node, move that node from the /data path to the /decommissioned path. Because no data volumes are assigned to that topology path, standard data replication will migrate the data off that node to other nodes that are still in the /data topology path.
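The effect of moving a node out of the /data path can be modeled in a few lines. The sketch below is a hedged illustration, not MapR code: it assumes that nodes sit on sub-paths such as /data/rack-1 (a hypothetical layout) and that a volume assigned to /data may only keep data on nodes at or below that path. The helper names `is_under` and `eligible_nodes` are invented for the example.

```python
def is_under(node_path, volume_path):
    """Return True if node_path equals volume_path or lies beneath it."""
    node_parts = [p for p in node_path.split("/") if p]
    vol_parts = [p for p in volume_path.split("/") if p]
    # Compare whole path components so "/database" does not match "/data".
    return node_parts[: len(vol_parts)] == vol_parts


def eligible_nodes(node_topology, volume_path="/data"):
    """List the nodes that may hold data for a volume on volume_path."""
    return sorted(n for n, p in node_topology.items() if is_under(p, volume_path))


# Hypothetical three-node cluster, all under the default /data path.
cluster = {
    "node-a": "/data/rack-1",
    "node-b": "/data/rack-2",
    "node-c": "/data/rack-3",
}
before = eligible_nodes(cluster)

# Moving node-b to /decommissioned removes it from the volume's topology,
# so its data has to be re-replicated onto the remaining /data nodes.
cluster["node-b"] = "/decommissioned"
after = eligible_nodes(cluster)

print(before, after)
```

In this model the node leaves the eligible set as soon as its path changes, which mirrors why replication then drains its data to the nodes that remain under /data.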
You can run the following command to check if a given volume is present on a specified node:
Run this command for each non-local volume in your cluster. Once all the data has migrated off the node, you can decommission the node or place it in maintenance mode.
If you need to segregate CLDB data, create a /cldb topology node and move the CLDB nodes under /cldb. Point the topology for the CLDB volume to /cldb. See Isolating CLDB Nodes for details.
Setting Node Topology Manually
You can specify a topology path for one or more nodes using the node move command, or in the MapR Control System using the following procedure.
To set node topology using the MapR Control System:
- In the Navigation pane, expand the Cluster group and click the Nodes view.
- Select the checkbox beside each node whose topology you wish to set.
- Click the Change Topology button to display the Change Node Topology dialog.
- Set the path in the New Path field:
  - To define a new path, type a topology path. Topology paths must begin with a forward slash ('/').
  - To use a path you have already defined, select it from the dropdown.
- Click Move Node to set the new topology.
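For the command-line alternative, the following sketch assembles a node move invocation and enforces the rule that topology paths begin with a forward slash. It only builds the argument list and does not run anything. It assumes that the node move command is invoked as `maprcli node move` with `-serverids` and `-topology` options; confirm the exact syntax against the command reference for your release. The function name and the server IDs shown are hypothetical.

```python
def build_node_move_command(server_ids, topology_path):
    """Build an argument list for moving nodes to a new topology path.

    Assumption: the CLI form is `maprcli node move -serverids <ids>
    -topology <path>`. Verify this against your MapR documentation.
    """
    if not server_ids:
        raise ValueError("at least one server ID is required")
    if not topology_path.startswith("/"):
        raise ValueError("topology paths must begin with a forward slash")
    ids = ",".join(str(s) for s in server_ids)
    return ["maprcli", "node", "move", "-serverids", ids, "-topology", topology_path]


# Hypothetical server IDs, used only to show the shape of the command.
command = build_node_move_command([1111111111, 2222222222], "/data/rack-2")
print(" ".join(command))
```

Returning a list rather than a single string makes it straightforward to pass the result to `subprocess.run` without shell quoting issues, once the syntax has been verified.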
Setting Node Topology with a Script
For large clusters, you can specify complex topologies in a text file or by using a script. Each line in the text file or script output specifies a single node and the full topology path for that node in the following format:
<ip or hostname> <topology>
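As an illustration of this format, the sketch below renders a mapping of nodes to topology paths as one line per node and parses such text back, rejecting malformed lines. The helper names, hostnames, and paths are hypothetical, and the whitespace-separated, two-field layout is inferred from the format shown above.

```python
def render_topology_lines(mapping):
    """Render {host: topology_path} as '<ip or hostname> <topology>' lines."""
    lines = []
    for host, path in sorted(mapping.items()):
        if not host or any(c.isspace() for c in host):
            raise ValueError(f"invalid host: {host!r}")
        if not path.startswith("/") or any(c.isspace() for c in path):
            raise ValueError(f"invalid topology path for {host}: {path!r}")
        lines.append(f"{host} {path}")
    return "\n".join(lines) + "\n"


def parse_topology_lines(text):
    """Parse topology text into {host: topology_path}, skipping blank lines."""
    result = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue
        parts = line.split()
        if len(parts) != 2 or not parts[1].startswith("/"):
            raise ValueError(f"line {lineno}: expected '<ip or hostname> <topology>'")
        result[parts[0]] = parts[1]
    return result


# Hypothetical inventory for a two-rack cluster.
inventory = {
    "node1.example.com": "/data/rack-1",
    "10.0.0.12": "/data/rack-2",
}
topology_text = render_topology_lines(inventory)
print(topology_text, end="")
```

Validating the text before installing it on the CLDB nodes helps catch a missing leading slash or a stray extra field early.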
The text file or script must be specified and available on the local filesystem on all CLDB nodes:
- To set topology with a text file, set the topology file parameter in /opt/mapr/conf/cldb.conf to the text file name.
- To set topology with a script, set the topology script parameter in /opt/mapr/conf/cldb.conf to the script file name.
If you specify a script and a text file, the MapR system uses the topology specified by the script.