After you have planned your upgrade process, you are ready to prepare the cluster for upgrade. Perform these preparation steps while your existing cluster is fully operational.
The goal of performing these steps early is to minimize the number of operations within the maintenance window, which reduces downtime and unnecessary risk. It is possible to move some of these steps into the Upgrading Without the MapR Installer flow, which reduces the number of times you have to touch each node but increases downtime during the upgrade. Design your upgrade flow according to your needs.
1. Verify System Requirements for All Nodes
Verify that all nodes meet the minimum requirements for the new version of MapR software. Check:
- Software dependencies. Package dependencies in the MapR distribution can change from version to version. If the new version of MapR has dependencies that were not present in the older version, you must address them on all nodes before upgrading the MapR software. Installing dependency packages can be done while the cluster is operational. See Packages and Dependencies for MapR Software. If you are using a package manager, you can specify a repository that contains the dependency package(s) and allow the package manager to install them automatically when you upgrade the MapR packages. If you are installing from package files, you must pre-install the dependencies on all nodes manually.
- Hardware requirements. The newer version of packages might have greater hardware requirements. Hardware requirements must be met before upgrading. See Preparing Each Node in the Advanced Installation Topics.
- OS requirements. MapR's OS requirements do not change frequently. If the OS on a node does not meet the requirements for the newer version of MapR, plan to decommission the node and re-deploy it with an updated OS after the upgrade.
- Certificate requirements. Recent versions of Safari and Chrome web browsers have removed support for older certificate cipher algorithms, including those used by some versions of MapR. If you have not already done so, complete the step to resolve the MapR Control System Certificate Issue.
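A per-node requirements probe can be scripted so the same report is collected from every node. The sketch below is illustrative only: the mount point and the `MIN_FREE_DISK_GB` threshold are placeholders, not MapR's documented minimums, so substitute the actual requirements for the version you are upgrading to.

```python
import platform
import shutil

# Placeholder threshold -- substitute the documented minimum for the
# MapR version you are upgrading to.
MIN_FREE_DISK_GB = 10

def node_report(path="/"):
    """Collect basic node facts to compare against the new version's requirements."""
    free_gb = shutil.disk_usage(path).free / 2**30
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "free_disk_gb": round(free_gb, 1),
        "disk_ok": free_gb >= MIN_FREE_DISK_GB,
    }

if __name__ == "__main__":
    for key, value in node_report().items():
        print(f"{key}: {value}")
```

Running the script on each node (for example, via your configuration-management tool) gives a uniform inventory to check against the release notes before the maintenance window.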
2. Design Health Checks
Plan what kind of test jobs and scripts you will use to verify cluster health as part of the upgrade process. You will verify cluster health several times before, during, and after upgrade to ensure success at every step, and to isolate issues whenever they occur. Create simple tests that verify that cluster services start and respond, as well as non-trivial tests that verify workload-specific aspects of your cluster.
2a. Design Simple Tests
Examples of simple tests:
- Check node health using `maprcli` commands to verify that no unexpected alarms exist and that services are running where they are expected to be. For example, `maprcli alarm list` might report an alarm indicating that MapR expects an NFS server to be running on node `centos58`, while the list of running services in the `maprcli node list` output confirms that the `nfs` service is not running on that node.
- Batch create a set of test files.
- Submit a MapReduce job.
- Run simple checks on installed Hadoop ecosystem components. For example:
- Run a Hive query.
- Do a put and a get from HBase.
- Run `hbase hbck` to verify the consistency of the HBase datastore. Address any issues that are found.
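The simple checks above lend themselves to a small wrapper that runs each command and records pass/fail, so the same suite can be rerun at every checkpoint. The check list below is only a sketch: the `maprcli` and `hadoop` invocations assume the script runs on a cluster node, and the smoke-test path is a hypothetical example.

```python
import subprocess

# Illustrative check list -- adjust the commands and paths for your cluster.
SIMPLE_CHECKS = [
    ("no MapR alarms",      ["maprcli", "alarm", "list"]),
    ("service placement",   ["maprcli", "node", "list", "-columns", "svc"]),
    ("filesystem writable", ["hadoop", "fs", "-touchz", "/tmp/upgrade-smoke"]),
]

def run_checks(checks, timeout=60):
    """Run each (name, argv) check; record True only on a zero exit status."""
    results = {}
    for name, argv in checks:
        try:
            proc = subprocess.run(argv, capture_output=True, timeout=timeout)
            results[name] = proc.returncode == 0
        except (FileNotFoundError, subprocess.TimeoutExpired):
            results[name] = False
    return results

if __name__ == "__main__":
    for name, ok in run_checks(SIMPLE_CHECKS).items():
        print(("OK  " if ok else "FAIL"), name)
```

Because the suite is just a list of `(name, argv)` pairs, you can extend it with the Hive, HBase, and MapReduce checks without changing the runner.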
2b. Design Non-trivial Tests
Appropriate non-trivial tests will be specific to your particular cluster’s workload. You may have to work with users to define an appropriate set of tests. Run tests on the existing cluster to calibrate expectations for “healthy” task and job durations. On future iterations of the tests, inspect results for deviations. Some examples:
- Run performance benchmarks relevant to the cluster’s typical workload.
- Run a suite of common jobs. Inspect for correct results and deviation from expected completion times.
- Test correct inter-operation of all components in the Hadoop stack and third-party tools.
- Confirm the integrity of critical data stored on the cluster.
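Once you have calibrated "healthy" durations on the existing cluster, comparing later runs against that baseline can be a simple tolerance check. The helper below is a sketch; the 25% default tolerance is an arbitrary starting point, not a recommended value.

```python
def deviation(baseline_s, observed_s):
    """Fractional deviation of an observed job duration from its baseline."""
    return abs(observed_s - baseline_s) / baseline_s

def flag_outliers(baselines, observed, tolerance=0.25):
    """Return job names whose duration deviates from baseline by more than tolerance.

    baselines and observed map job name -> duration in seconds.
    """
    return sorted(
        job for job, secs in observed.items()
        if job in baselines and deviation(baselines[job], secs) > tolerance
    )

if __name__ == "__main__":
    # Hypothetical job names and durations, for illustration only.
    baselines = {"etl-hourly": 600, "report-daily": 1800}
    observed  = {"etl-hourly": 640, "report-daily": 2600}
    print(flag_outliers(baselines, observed))  # report-daily deviates by ~44%
```

Keeping the baseline figures in version control alongside the upgrade plan makes it easy to repeat the comparison at each checkpoint.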
3. Verify Cluster Health
Verify cluster health before beginning the upgrade process. Proceed with the upgrade only if the cluster is in an expected, healthy state. Otherwise, if a health check fails after the upgrade, you will not be able to tell whether the failure is related to the upgrade.
3a. Run Simple Health Checks
Run the suite of simple tests to verify that basic features of the MapR core are functioning correctly, and that any alarms are known and accounted for.
3b. Run Non-trivial Health Checks
Run your suite of non-trivial tests to verify that the cluster is running as expected for typical workload, including integration with Hadoop ecosystem components and third-party tools.
4. Back Up Critical Data
Data in the MapR cluster persists across upgrades from version to version. However, as a precaution you might want to back up critical data before upgrading. If you deem it practical and necessary, you can do any of the following:
- Copy data out of the cluster to a separate, non-Hadoop datastore using `distcp`.
- Mirror critical volume(s) into a separate MapR cluster, creating a read-only copy of the data which can be accessed via the other cluster.
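The `distcp` option can be wrapped in a script so the backup is repeatable. The sketch below only launches the copy and reports the command's exit status; the source and destination paths are hypothetical, and you should still inspect the distcp job's own output before trusting the copy.

```python
import shutil
import subprocess

def backup_with_distcp(src, dst, runner=("hadoop", "distcp")):
    """Launch a copy of src to dst with the given command.

    Returns the command's exit status, or None if the command is not
    installed on this machine.
    """
    if shutil.which(runner[0]) is None:
        return None
    proc = subprocess.run([*runner, src, dst], capture_output=True)
    return proc.returncode

if __name__ == "__main__":
    # Hypothetical paths -- substitute your critical volume and backup target.
    status = backup_with_distcp("maprfs:///critical-volume",
                                "hdfs://backup-cluster/pre-upgrade")
    print("distcp exit status:", status)
```

The `runner` parameter exists only to keep the helper testable outside a cluster; on a cluster node the default `hadoop distcp` command is used.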
When services for the new version are activated, MapR-FS will update data on disk automatically. The migration is transparent to users and administrators. Once the cluster is active with the new version, you typically cannot roll back.
5. Perform MapR Installer Pre-Upgrade Steps
If you use the MapR Installer to upgrade the cluster, ecosystem components may be updated automatically to the latest package, or you may need to upgrade them to be compatible with the new cluster version. Therefore, perform the pre-upgrade steps for each ecosystem component in your cluster. See the Perform MapR Installer Pre-Upgrade Steps topic for details.
6. Run Your Upgrade Plan on a Test Cluster
Before executing your upgrade plan on the production cluster, perform a complete "dry run" on a test cluster. You can perform the dry run on a smaller cluster than the production cluster, but make the dry run as similar to the real-world circumstances as possible. For example, install all Hadoop ecosystem components that are in use in production, and replicate data and jobs from the production cluster on the test cluster.
The goals for the dry run are:
- Eliminate surprises. Get familiar with all upgrade operations you will perform as you upgrade the production cluster.
- Uncover any upgrade-related issues as early as possible so you can accommodate them in your upgrade plan. Look for issues in the upgrade process itself, as well as operational and integration issues that could arise after the upgrade.