This is documentation for MapR version 4.0.x. You can also refer to MapR documentation for the latest or previous releases.


In general, mirror volumes are created for the purpose of preventing or minimizing data loss. Data loss scenarios range from accidental overwrites to rack failure, to a disaster that destroys an entire datacenter. Mirror volumes are also used to improve performance or to make copies of data for use in other clusters without impacting production.

As of the 4.0.2 release, all new mirror volumes can be made into read-write volumes. In addition, read-write volumes that were mirrored to other volumes can be made into mirrors (to establish a mirroring relationship in the other direction). This functionality is useful in scenarios such as:

  • Disaster recovery
    If a read-write volume with critical data goes down in a primary datacenter, a mirror volume in a remote datacenter can be made into a read-write volume in order to maintain business continuity. Later, if the primary datacenter comes back online, the original mirror relationship can be restored by making the new read-write volume back into a mirror volume. 
  • Running applications on a copy of production data
  • Resynchronization (reestablishing a mirror relationship after it is broken)

Existing volumes cannot use promotable mirrors; only new volumes can have promotable mirrors. Customers who upgrade to 4.0.2 need to follow some specific manual steps in order to use promotable mirrors for volumes that were in place before the upgrade. See Working with Volumes after Upgrading from 3.x or 4.0.1 to 4.0.2 for instructions.

The remainder of this page focuses on how to incorporate mirror volumes into a disaster recovery plan. Of course, the techniques explained here can be used in other scenarios, too.


Incorporating Mirror Volumes into a Disaster Recovery Plan

Mirroring critical data to a remote datacenter (with the ability to make mirror volumes into read-write volumes) addresses these two objectives:

  • Recovery Point Objective (RPO) - the age of the files you need to recover (and how much data you can afford to lose)

  • Recovery Time Objective (RTO) - how soon you need to have a working datacenter in order to maintain business continuity

In a typical scenario that employs remote mirrors, the contents of a source volume are mirrored to a mirror volume in a remote cluster at a frequency specified by the mirror schedule. At the start of each mirror operation, a snapshot is taken of the source volume's contents. The mirror operation takes some time to complete, and while the data is being copied from the snapshot to the mirror volume, more data is written to the source volume. This data will be captured during the next mirror operation. When each mirror operation completes, the contents of the mirror volume are identical to the contents of the source volume at the time of the snapshot. For subsequent mirror operations, only the incremental changes (additions and deletions) are copied to the mirror volume, which synchronizes its contents with the contents of the source volume at the time of the snapshot.

 

 

If the source cluster goes down, any data written to the source volume since the last successful mirror operation cannot be copied to the mirror. The amount of data lost depends on the number of write operations in the interval from the last successful mirror to the time the cluster goes down.

Factors that Affect RTO

During a disaster, an administrator must first determine that the link is down between the primary datacenter and the secondary datacenter. Next, the administrator begins the process of switching applications that were running on the primary datacenter over to the secondary datacenter. For write applications, the administrator begins converting mirror volumes to read-write volumes, starting with volumes that contain the most critical data. Note that read applications can run on read-only mirrors, but write applications can only run on read-write volumes.

To gauge how long it will take to switch applications from the primary datacenter to the secondary datacenter (and to set the RTO accordingly), consider these factors:

  • Detection time (how long it takes to determine that the link is down between the two datacenters)
  • Switching time (how long it takes to switch applications from one datacenter to the other)
  • Promotion time (how long it takes to change read-only mirror volumes to read-write volumes that can run write applications). Promotion time is a function of:
    • the size of each volume (larger volumes take longer to change than smaller ones)
    • the number of files in each volume (the smaller the number of files to change, the faster the conversion)
    • the size of each file (for a given amount of data, larger files mean fewer files, and fewer files take less time to change)
    • the number of containers in each volume
    • the number of volumes
  • Whether mirror throttling is enabled (the default) or disabled (which speeds up the mirroring process)

Factors that Affect RPO

Various factors affect how much data can be recovered through the use of mirror volumes. To specify a realistic recovery point objective in your disaster recovery plan, take these factors into account:

  • Mirror schedule (how often the mirror is synchronized with its source volume)
    Note that the first mirror operation is a full synchronization between source and mirror volumes. Subsequent mirror operations are incremental - only the changes that occurred since the last mirror event need to be copied in order to synchronize the contents between the two volumes. 
  • Network link between the source volume and the mirror volume (consider the stability and quality of the link, as well as latency, throughput, and other activities across the link)
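
As a rough worked example of how the mirror schedule bounds the RPO: if the source cluster fails just before a mirror operation completes, the mirror still holds only the data from the previous operation's snapshot. Under that assumption (and assuming an interrupted mirror operation contributes nothing), the worst-case age of the recovered data is about one mirror interval plus the duration of one mirror operation. The function name and the numbers below are illustrative, not measured values.

```shell
# worst_case_loss_minutes INTERVAL DURATION
#   INTERVAL: minutes between scheduled mirror operations (assumed value)
#   DURATION: minutes one mirror operation takes to finish (assumed value)
# Prints the approximate worst-case window of lost writes, in minutes.
worst_case_loss_minutes() {
  echo $(( $1 + $2 ))
}

# Hourly mirror schedule, 10-minute mirror operation.
worst_case_loss_minutes 60 10
```

If the result is larger than the RPO in your plan, a shorter mirror interval or a faster link would be the usual levers to look at.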

Using Promotable Mirrors for Disaster Recovery

The following sections describe the tasks that a MapR administrator performs from a remote datacenter before, during, and after a disaster: setting up mirroring to a remote cluster, failing over to a mirror volume, and restoring the mirror relationship.

For a brief overview of the terminology used to describe volume types, along with some basic commands, see the Glossary.

Setting up Mirroring to a Remote Cluster

Once data volumes are created in a primary datacenter, the MapR administrator creates mirror volumes in a remote secondary datacenter. The following diagram illustrates the mirror relationship between these two volumes:

 

 


When you use promotable mirrors in 4.0.2, the volumes on the destination cluster must be set up in the same way as on the primary site. This means that volume names are the same and mount points are the same. If a hierarchical mounting structure (such as /A/B) is used on the primary site, the same structure must be recreated once mirror volumes are promoted at the secondary site.


Preparation Steps

Before you can create a mirror volume on a remote cluster, you must first complete these steps:

  1. Check UID consistency and volume permissions
  2. Edit mapr-clusters.conf so every node in each cluster can resolve all nodes in the other cluster.
  3. (For mirroring between secure clusters only) Generate a cross-cluster ticket and append it to the maprserverticket on the destination cluster.

Instructions for each step are explained below.

Checking UIDs and Setting Volume Permissions

Make sure the MAPR_USER (the cluster owner) has the same UID on the primary cluster (where the source volume resides) and on the remote clusters (where the mirror volumes reside; also known as the destination clusters). You also need to have these volume permissions:

  • dump permission on the source volumes
  • restore permission on the mirror volumes at the destination clusters
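
A minimal sketch of these checks, assuming the cluster owner is a user named mapr and the volume is volA (both are example names). The functions only print the commands so they can be reviewed first; the `maprcli acl edit` form shown is an assumption based on the general volume ACL syntax and should be checked against your release.

```shell
# Prints the command that reports the cluster owner's UID.
# Run the printed command on a node of each cluster and compare the numbers.
uid_check_cmd() {
  echo "id -u ${MAPR_USER:-mapr}"
}

# grant_cmd VOLUME USER PERMISSION   (PERMISSION: dump or restore)
grant_cmd() {
  echo "maprcli acl edit -type volume -name $1 -user $2:$3"
}

uid_check_cmd
grant_cmd volA mapr dump      # for the source cluster
grant_cmd volA mapr restore   # for the destination cluster
```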

Editing mapr-clusters.conf

Another requirement for mirroring to work between two clusters is that every node in each cluster (the local cluster and the remote cluster) must be able to resolve all nodes in the other cluster. So, before you create a mirror volume, edit the mapr-clusters.conf file on each node as shown:

  1. For each node on the source volume's cluster, add a line that contains the mirror cluster's name, followed by a space-separated list of hostnames for its CLDB nodes. For example, if Cluster1 has cldb1-1 and cldb1-2 for its CLDB nodes, and Cluster 2 has cldb2-1 and cldb2-2 for its CLDB nodes, the mapr-clusters.conf file for Cluster1 would have these lines:

  2. For each node on the mirror volume's cluster, add a line that contains the source cluster's name, followed by a space-separated list of hostnames for its CLDB nodes. For this example, the mapr-clusters.conf file for Cluster2 contains these lines:

  3. Set secure=true if both clusters are secure. Set secure=false if neither cluster is secure.


    Mirroring only works between two secure clusters or between two unsecure clusters. Mirroring does not work when one cluster is secure and the other is unsecure.

  4. On each cluster, restart the mapr-webserver service on all nodes where it is running.
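
For the Cluster1/Cluster2 example in steps 1 and 2, the file on a Cluster1 node might look like the sketch below. The port number 7222, the secure=false setting, and the placement of the local cluster on the first line are assumptions for illustration. The script writes to a scratch file rather than to the live mapr-clusters.conf.

```shell
# Scratch location for the example; not the live configuration file.
CONF_EXAMPLE="${TMPDIR:-/tmp}/mapr-clusters.conf.example"

# One line per cluster: name, secure flag, then the CLDB nodes.
cat > "$CONF_EXAMPLE" <<'EOF'
Cluster1 secure=false cldb1-1:7222 cldb1-2:7222
Cluster2 secure=false cldb2-1:7222 cldb2-2:7222
EOF

cat "$CONF_EXAMPLE"
```

On Cluster2 nodes, the same two entries would be present with the Cluster2 line as the local cluster.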

Generating a Cross-cluster Ticket

If the source volume at the primary datacenter is in a secure cluster, the destination cluster needs authorization to pull data from the source cluster in order to create a mirror volume. Authorization is granted by means of a cross-cluster ticket generated by the source cluster administrator. Each step in the process is explained below:

  1. (Source cluster administrator) Define a new service user in the source cluster’s UNIX user registry. One way to do this is with the adduser or useradd command (as appropriate for your operating system). For example:

  2. (Source cluster administrator) Generate a cross-cluster ticket for the user you created in step 1. This command can be run from any node in the source cluster.
    The input ticket file specified by -inmaprserverticketfile is the source cluster’s server ticket, which is located at /opt/mapr/conf/maprserverticket. The output ticket file, specified by -ticketfile, contains the cross-cluster ticket.

  3. (Source cluster administrator) Provide the cross-cluster ticket to the destination cluster administrator.

  4. (Destination cluster administrator) Append the cross-cluster ticket file (located at /opt/mapr/conf/maprclusterticket) to the maprserverticket (located at /opt/mapr/conf/maprserverticket) on the destination cluster.

  5. (Destination cluster administrator) Copy the ticket file to all nodes in the destination cluster.
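
The commands for steps 1, 2, and 4 could look like the sketch below. The service user name and the /tmp ticket path are hypothetical. The -inmaprserverticketfile and -ticketfile options come from the description in step 2, while `-type crosscluster` and `-user` are assumptions about the maprlogin syntax that should be confirmed against the tool's help output. The function prints the commands instead of running them.

```shell
SERVICE_USER="mirroruser"          # hypothetical service user
TICKET="/tmp/crossclusterticket"   # hypothetical output path

ticket_steps() {
  # Step 1 (source cluster): create the service user.
  echo "useradd $SERVICE_USER"
  # Step 2 (source cluster): generate the cross-cluster ticket.
  echo "maprlogin generateticket -type crosscluster -user $SERVICE_USER -inmaprserverticketfile /opt/mapr/conf/maprserverticket -ticketfile $TICKET"
  # Step 3 is a hand-off of the ticket file between administrators.
  # Step 4 (destination cluster): append the ticket to the server ticket.
  echo "cat /opt/mapr/conf/maprclusterticket >> /opt/mapr/conf/maprserverticket"
  # Step 5 is a copy of the updated file to every destination node.
}

ticket_steps
```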

Now you are ready to create a mirror volume from the command line or from the MCS.

Creating a Mirror from the Command Line

To set up a mirror volume in a remote cluster at the secondary datacenter, run the volume create command, as shown in this example:
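
A sketch of such a command, assembled from the options and example values listed in the table below. The helper function is only a convenience for this page; it prints the command so it can be checked before being run on the destination cluster.

```shell
# create_mirror_cmd NAME MOUNT_PATH SOURCE SNAPSHOT_SCHEDULE MIRROR_SCHEDULE
create_mirror_cmd() {
  echo "maprcli volume create -name $1 -path $2 -type mirror -source $3 -schedule $4 -mirrorschedule $5"
}

create_mirror_cmd volA /A volA@Cluster1 1 1
```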

When you run this command, a mirror volume named volA is created with these specifications:

  • -name
    Description: Name of the mirror volume
    Value used in example: volA
  • -path
    Description: Path to the mirror volume (the mount point)
    Value used in example: /A
  • -type
    Description: Type of volume (either rw or mirror)
    Value used in example: mirror
  • -source
    Description: Name of the source volume from which the mirror pulls data. For remote mirroring, the name must include the cluster name, since the source is located on a different cluster from the mirror.
    Value used in example: volA@Cluster1, where Cluster1 resides in the primary datacenter
  • -schedule
    Description: The ID corresponding to the snapshot schedule for the mirror volume. These schedule IDs are pre-assigned:
    • 1 is for critical data
    • 2 is for important data
    • 3 is for normal data
    See Default and User-defined Schedules for information on creating a custom schedule.
    Value used in example: 1 (for critical data)
    If a mirror volume is changed to a read-write volume, the snapshot schedule specified by the -schedule option will apply to that volume, and the mirror schedule will be disabled.
  • -mirrorschedule
    Description: The ID corresponding to the mirror schedule. These schedule IDs are pre-assigned:
    • 1 is for critical data
    • 2 is for important data
    • 3 is for normal data
    Value used in example: 1 (for critical data)

If the quota set for the mirror volume is less than the quota set for its source volume, the CLDB raises an alarm. The mirroring operation will not fail, but the administrator must decide whether to add space and increase the mirror volume quota, or remove unwanted space from the source volume and decrease its quota.

Creating a Mirror from the MCS

To create a mirror from the MCS, follow these steps:

  1. Under the MapR-FS group in the navigation pane, click on Volumes, then click on the New Volume tab.
     
    The New Volume dialog box displays.
     
  2. Fill out the New Volume dialog box.
    1. Choose Remote Mirror Volume.
    2. Give the volume a name (the mirror is named volA in this example).
    3. Provide the name of the source volume and the cluster where it resides (the source volume is volA and the cluster is Cluster1 in this example).
    4. Supply the mount path (/A in this example).
    5. Select a snapshot schedule (specifies when to take snapshots of the volume). The default choices are None, Critical data, Important data, and Normal data.


      If the mirror volume is later promoted to a read-write volume, the snapshot schedule will apply to that volume and the mirror schedule will be disabled.

    6. Select a mirror schedule (specifies the mirroring interval of the volume). The default choices are None, Critical data, Important data, and Normal data. If you choose None, start the mirror manually by selecting Start Mirroring from the Volume Actions tab.

Choosing a Snapshot Schedule and a Mirror Schedule

A snapshot schedule on a mirror volume specifies how often to take a snapshot of the mirror volume itself. This snapshot schedule is distinct from the snapshot schedule for the source volume. A snapshot schedule for a promotable mirror volume has two purposes:

  • It specifies how often to take a snapshot of the mirror volume for the purpose of preserving the state of the mirror before a subsequent mirror operation. This way, if corrupt data is copied from the source volume's snapshot into the mirror volume, the mirror contents can be rolled back to the snapshot.
  • If the promotable mirror volume is promoted to a read-write volume, the snapshot schedule specified for the mirror is used for the promoted read-write volume. Once a mirror volume is promoted to a read-write volume, the mirror schedule is disabled.

A mirror schedule specifies how frequently the mirror volume is synchronized with the source volume. In case of a disaster (or any type of data loss on a read-write source volume), the data can be recovered from the mirror volume, but any data written to the source volume since the last successful mirror operation will not be on the mirror volume. Therefore, you should set the mirror schedule such that it meets your RPO (Recovery Point Objective).

Default and User-defined Mirror Schedules

When you create a mirror from the MCS, you can select from three classes of data that correspond to different default mirror schedules: Critical data, Important data, and Normal data.

In addition, you can create your own user-defined schedule, which will also appear as a selection in the dropdown menu. 

To define a new schedule:

  1. Select MapR-FS > Schedules from the MCS, then click the button for adding a new schedule. The following dialog appears.
     
  2. Enter a schedule name and schedule rules. Click  to add more rules for mirror schedules. For example, a schedule named Customer Data could be defined like this:
     
  3. Click the  button.
  4. Note that Customer Data now appears in the list of Schedule Names on the Schedules tab of the MCS.
     
  5. Note that when you create a new volume, the snapshot scheduling dropdown menu includes the new schedule.

Guidelines for Setting Mirror Schedules

Although MapR allows mirroring frequencies up to once per minute, setting a schedule at this frequency is neither practical nor advisable. When you choose the mirror schedule, consider the amount of data on the volume and the load on the cluster. Remember that the mirroring frequency must allow enough time to complete one mirror operation before the next scheduled mirror operation starts. In addition, if you have a cascaded mirror setup (where A mirrors to B, which mirrors to C), you cannot set a mirror schedule for C that starts before B finishes mirroring from A.


In general, you should not schedule mirroring more often than once every 30 minutes.

If you set a mirror schedule to start mirroring before the previous mirror operation finishes, you will see an error message similar to this:

Failing Over to a Mirror Volume

When a disaster occurs at a primary datacenter, data can no longer be written to the volumes in that location, and the mirror operation cannot be performed. In order to maintain business continuity, the administrator at the secondary datacenter promotes the read-only mirror volume to a read-write volume, which breaks the mirror relationship. At this point, the promoted mirror volume contains all the data that was on the source volume at the time of the most recent successful mirror operation.


Promoting a Volume from the Command Line

To promote a read-only mirror to a read-write volume from the command line, run the volume modify command from the cluster where the mirror resides and specify the name of the mirror volume that is being promoted. In this example, the mirror volume is named volA:

 


Once a mirror volume is promoted to a read-write (rw) volume, the mirroring schedule associated with that volume is disabled.

To promote several mirror volumes at once, provide a comma-separated list of volume names. For example:
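
A sketch of both forms, for a single volume and for a comma-separated list. It assumes that `volume modify` accepts `-type rw` in the same way `volume create` does. The volume names volB and volC are examples, and the helper prints the command rather than running it.

```shell
# promote_cmd VOLUME [VOLUME...]
# Joins the volume names with commas and prints the promote command.
promote_cmd() {
  local IFS=,
  echo "maprcli volume modify -name $* -type rw"
}

promote_cmd volA             # a single mirror volume
promote_cmd volA volB volC   # several mirror volumes at once
```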

Promoting a Volume from the MCS

To promote a read-only mirror to a read-write volume from the MCS, follow these steps:

  1. Click on Mirror Volumes in the navigation pane, then check the box to the left of the volume you want to promote. You can promote more than one mirror at a time by checking multiple boxes.
  2. Click on the Volume Actions tab, then select Make Standard Volume from the dropdown menu.
     

Handling Mount Points in Promoted Mirror Volumes

After you promote read-only mirror volumes to read-write standard volumes, you must re-establish the mount points that were set up in the source cluster. To understand the steps in this process, consider the following scenario:

A source cluster has volumes A, B, and C, which are mounted at /A, /A/B, and /A/B/C respectively. Each source volume is mirrored to a volume in another cluster (the destination cluster). The names of the corresponding mirror volumes are also A, B, and C.

 


When you use promotable mirrors in 4.0.2, the volumes on the destination cluster must be set up in the same way as on the primary site. This means that volume names are the same and mount points are the same. If a hierarchical mounting structure (such as /A/B) is used on the primary site, the same structure must be recreated once mirror volumes are promoted at the secondary site.

Mirror volume A is mounted at /A, but since the mirror is read-only, no mount point can be created beneath it for mirror B or mirror C.


 


Mirror volumes that are promoted to standard (read-write) volumes are not available for write operations until they are mounted explicitly.

Now suppose that all three mirror volumes are promoted to read-write volumes. Before any data can be written to these volumes, the volume links must be removed and the volumes must be remounted. The commands for each step are shown below.

  1. Promote A, B, and C to read-write volumes.

     

  2. Remove the vol links located at /A/B and /A/B/C. Since mirror A was already mounted, its vol links do not need to be removed.

  3. Mount the promoted read-write volumes B and C at the same mount points used in the primary (source) cluster, in order to maintain an exact replica in the destination cluster.
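
The three steps can be sketched as follows for volumes A, B, and C. The promote command assumes `volume modify` accepts `-type rw`, and the command for removing the vol links is not reproduced on this page, so step 2 is left as a placeholder rather than guessed. The function prints the commands for review.

```shell
remount_steps() {
  # Step 1: promote all three mirrors to read-write volumes.
  echo "maprcli volume modify -name A,B,C -type rw"
  # Step 2: placeholder only; the removal command is not shown here.
  echo "# remove the vol links at /A/B and /A/B/C"
  # Step 3: mount B before C, because C's mount point lies beneath B.
  echo "maprcli volume mount -name B -path /A/B"
  echo "maprcli volume mount -name C -path /A/B/C"
}

remount_steps
```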

Now the promoted volumes are accessible for write operations.

Changing the Limit for Concurrent Mirror Operations

The system allows a maximum of 50 concurrent mirroring operations by default. Mirroring operations include both mirroring and promoting from read-only mirrors to read-write standard volumes. The system parameter that controls this limit is mapr.mirror.concurrent.ops.

For large-scale mirror operations involving many volumes, you can use a script to automate the process. For example, if a script queues 100 volumes for mirroring operations, and the mapr.mirror.concurrent.ops limit is set to 50, the mirroring operations start on the first 50 volumes in the queue. As soon as one volume completes, another volume is processed from the queue until all 100 are completed. Since volumes are processed from the queue in first-in first-out (FIFO) order, the script should specify the most critical volumes first.

If you want to process more volumes at a time, you can raise the limit of the mapr.mirror.concurrent.ops parameter. To tune this parameter for maximum efficiency, consider the number of containers per volume. A higher number of containers per volume requires a lower limit than a lower number of containers per volume. To raise the limit to 500, for example, run the following command:
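
A sketch of that command, assuming the parameter is set through `maprcli config save` like other cluster-wide settings. The helper prints the command so the JSON quoting can be checked before it is run.

```shell
# set_concurrent_ops_cmd LIMIT
set_concurrent_ops_cmd() {
  echo "maprcli config save -values '{\"mapr.mirror.concurrent.ops\":\"$1\"}'"
}

set_concurrent_ops_cmd 500
```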

 

 

Restoring the Mirror Relationship

If the primary datacenter comes back online, the administrator can re-establish the mirror relationship between the original read-write volume in the primary datacenter and the promoted read-write volume in the secondary datacenter.

Note that the two read-write volumes will have different data, since data was written to the promoted mirror while the original source volume was down. The original source volume might also have different data that was written after the last mirror operation, but before the cluster went down. The administrator must decide which data to keep and use as the source.




Some data loss is inevitable in a disaster recovery scenario. To minimize potential data loss, use mirrors to provide a synchronized copy of each volume with critical data, and in the event of discrepancies, decide which data to preserve based on your company's policies.

Preserving volA/Cluster1's Data

Suppose that volA in the primary datacenter contains crucial data that must be preserved, and you want to mirror its data to volA in the secondary datacenter (the same mirror relationship that was established originally). To recreate the original mirror relationship, convert the promoted volume, volA/Cluster2, from a read-write volume to a mirror of volA/Cluster1 by running the following command:
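
A sketch of the conversion command, assuming `volume modify` takes the same -type and -source options as `volume create`. The helper prints the command, which would then be run on Cluster2.

```shell
# make_mirror_cmd VOLUME SOURCE
make_mirror_cmd() {
  echo "maprcli volume modify -name $1 -type mirror -source $2"
}

make_mirror_cmd volA volA@Cluster1   # run the printed command on Cluster2
```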

To use the MCS to convert volA/Cluster2 from a read-write volume to a mirror of volA/Cluster1, perform these steps from Cluster2:

  1. Select MapR-FS > Volumes from the navigation pane and click on the checkbox next to the read-write volume you want to convert (volA in the example).
  2. Click on the Volume Actions tab and select Make Mirror Volume.

  3. Select MapR-FS > Mirror Volumes and verify that volA on Cluster2 is displayed as a Standard Mirror.

Preserving volA/Cluster2's Data on volA/Cluster1

Now suppose you want to preserve the data on volA/Cluster2 (in the remote datacenter) but you still want volA/Cluster1 to be the primary volume with volA/Cluster2 as its mirror. To save volA/Cluster2's data on volA/Cluster1 and reestablish the original mirror relationship from volA/Cluster1 to volA/Cluster2, follow the steps below.

From the command line

  1. Stop writing new data to volA/Cluster2. To be sure no data is written to this volume, make it read-only by running this command:

  2. Pull the data from volA/Cluster2 to volA/Cluster1 by making volA/Cluster1 a mirror of volA/Cluster2.

  3. Start the mirror operation.

  4. Once mirroring is done, promote volA to a read-write volume. Note that the mirror relationship breaks at this point.

  5. Make volA/Cluster2 a mirror of volA/Cluster1.
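
The five steps can be sketched as the sequence below. Each line is prefixed with the cluster on which the command would be run. The `-readonly 1` option in step 1 and the use of `-type` and `-source` with `volume modify` are assumptions to verify against your release. The function prints the sequence rather than executing it.

```shell
resync_steps() {
  # 1. Stop writes to the promoted volume on Cluster2.
  echo "[Cluster2] maprcli volume modify -name volA -readonly 1"
  # 2. Make volA/Cluster1 a mirror of volA/Cluster2.
  echo "[Cluster1] maprcli volume modify -name volA -type mirror -source volA@Cluster2"
  # 3. Pull the data across.
  echo "[Cluster1] maprcli volume mirror start -name volA"
  # 4. After mirroring completes, promote volA/Cluster1 again.
  echo "[Cluster1] maprcli volume modify -name volA -type rw"
  # 5. Restore the original direction of the mirror relationship.
  echo "[Cluster2] maprcli volume modify -name volA -type mirror -source volA@Cluster1"
}

resync_steps
```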

From the MCS

  1. Stop writing new data to volA/Cluster2 by making this volume read-only:
    1. Click on the checkbox next to volA in the Volumes display.
    2. Click on the name of the volume to display the Volume Properties dialog.
    3. In the Volume Properties dialog, check the Read-only box and click OK.

  2. Make volA/Cluster1 a mirror of volA/Cluster2.
    1. Navigate to the node that is running the mapr-webserver service on Cluster1 to access the MCS.
    2. Select MapR-FS > Volumes from the navigation pane and click on the checkbox next to volA.
    3. From the Volume Actions tab, select Make Mirror Volume.
       
    4. Fill in the Source Volume name field (the source volume is volA in this example) and click OK
       
  3. Start mirroring.
     
  4. Promote volA to a read-write volume.
    1. In the Mirror Volumes display, check the box next to volA.
    2. Click on the Volume Actions tab and select Make Standard Volume.
       
  5. Make volA/Cluster2 a mirror of volA/Cluster1.
    1. Navigate to the node that is running the mapr-webserver service on Cluster2 (where volA/Cluster2 resides) to access the MCS.
    2. In the Volumes display, check the box next to volA.
    3. Click on the Volume Actions tab and select Make Mirror Volume.

Working with Volumes after Upgrading from 3.x or 4.0.1 to 4.0.2

If you created volumes in a cluster running 3.x or 4.0.1, and then you upgrade your cluster to 4.0.2, all the old volumes are preserved. When you create new volumes, the type (standard or non-convertible) is determined as follows:

  • If you create a new read-write volume, it is a standard volume (non-convertible standard volumes cannot be created on a 4.0.2 cluster).
  • If you create a mirror from a non-convertible standard volume, the mirror is a non-convertible mirror.
  • If you create a mirror from a standard volume, the mirror is a standard mirror.

After you upgrade to 4.0.2, follow these steps so you can start using the promotable mirror volume functionality:

  1. Enable the promotable mirror volume functionality by setting the parameter mfs.feature.rwmirror.support to 1:

  2. If a non-convertible standard volume contains critical data, move the data to a standard volume:
    1. Create a new standard volume.

    2. Stop writing to the non-convertible standard volume.

    3. Copy the data from the non-convertible standard volume into the new standard (convertible rw) volume.

  3. Create a promotable mirror volume on a remote cluster that can be used in case of a disaster.
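
A sketch of these steps under several assumptions: the volume names oldvol and newvol and their mount paths are hypothetical, the feature flag is set through `maprcli config save`, and `hadoop fs -cp` stands in for whichever copy tool you prefer. The function prints the commands for review.

```shell
upgrade_steps() {
  # Step 1: enable promotable mirrors.
  echo "maprcli config save -values '{\"mfs.feature.rwmirror.support\":\"1\"}'"
  # Step 2.1: create the new standard volume.
  echo "maprcli volume create -name newvol -path /newvol -type rw"
  # Step 2.2 (stop writing to the old volume) is an application-side action.
  # Step 2.3: copy the data across.
  echo "hadoop fs -cp /oldvol/* /newvol/"
  # Step 3: on the remote cluster, create the promotable mirror.
  echo "maprcli volume create -name newvol -path /newvol -type mirror -source newvol@Cluster1 -schedule 1 -mirrorschedule 1"
}

upgrade_steps
```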

If you use automated scripts that specify the old volume types 0 and 1, these types are mapped to rw and mirror respectively for backward compatibility. For example:

  1. If a command uses -type 0 to create a standard rw volume, it is mapped to -type rw. The resulting volume is a convertible rw volume (mirrortype:3).

    is mapped to

  2. If a command uses -type 1 to create a mirror volume from a non-convertible standard (read-write) volume (created before 4.0.2), it is mapped to -type mirror. The resulting volume is a non-convertible mirror volume, since the source is a non-convertible standard volume.

    is mapped to

  3. If a command uses -type 1 to create a mirror volume from a standard rw volume, it is mapped to -type mirror. The resulting volume is a standard mirror volume, since the source is a standard rw volume.

    is mapped to
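
The mapping itself can be illustrated with a small helper. The function name is hypothetical, the volume name and path are the example values used earlier on this page, and passing named types through unchanged is a choice made for this sketch.

```shell
# map_legacy_type TYPE
# Prints the 4.0.2 type name for a legacy numeric volume type.
map_legacy_type() {
  case "$1" in
    0) echo rw ;;       # old standard (read-write) type
    1) echo mirror ;;   # old mirror type
    *) echo "$1" ;;     # already a named type; passed through unchanged
  esac
}

echo "maprcli volume create -name volA -path /A -type 0"
echo "  is mapped to"
echo "maprcli volume create -name volA -path /A -type $(map_legacy_type 0)"
```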

Glossary

This glossary explains the terms used in the MCS for the different types of volumes. A sample display from the Type column is shown here:


TermDefinition
NC Standard Volume

A non-convertible standard volume with read-write capabilities, created before MapR version 4.0.2. These volumes cannot be converted to standard mirror volumes. If this volume type is designated as a source volume when a mirror volume is created, the mirror volume will be an NC mirror volume.

When you move the mouse over the NC Standard Volume text in the MCS, the following tooltip is displayed:


An NC standard volume is designated as type 0 in the output of the volume info command. For example:

maprcli volume info -name oldrw

lists

"mirrortype":0

Standard Volume

A read-write volume created with MapR version 4.0.2. A standard volume can be converted from read-write to mirror (read-only). If a mirror is created from this type of volume, the mirror can be promoted to a read-write volume.

A standard volume is designated as type rw on the command line. For example:

maprcli volume create -name volA -path /testvol -type rw

NC Mirror Volume

A non-convertible read-only mirror volume created before MapR version 4.0.2. This volume type cannot be promoted to a read-write volume, and can only be created from an NC standard volume.

When you move the mouse over the NC Mirror Volume text in the MCS, the following tooltip is displayed:

An NC mirror volume is designated as type 1 in the output of the volume info command. For example:

maprcli volume info -name oldmirror

lists

"mirrortype":1

Standard Mirror

A mirror volume that starts as read-only, and can be promoted to a read-write volume.

A standard mirror volume is designated as type mirror on the command line and can only use a standard volume as its source. For example:

maprcli volume create -name volB -path /mirvol -type mirror -source volA
