This is documentation for MapR Version 5.0. You can also refer to MapR documentation for the latest release.

All Hadoop commands are invoked by the bin/hadoop script.

When you run these commands, you can specify the MapReduce mode in two different ways:

  1. Use the hadoop keyword and specify the mode explicitly, where classic mode refers to Hadoop 1.x and yarn mode refers to Hadoop 2.x.
  2. Use the hadoop1 or hadoop2 keyword and do not specify the mode.

For example, the following commands are equivalent:
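As a minimal sketch of that equivalence, the helper below prints both spellings of a command for a given mode without executing anything. The name `equivalent_forms` is hypothetical, the `-classic` and `-yarn` flag spellings are taken from the syntax summary on this page, and `version` is used only as a convenient subcommand.

```shell
# Hypothetical helper: print the explicit-mode form and the shorthand form
# of the same command. It only echoes the command lines; nothing is executed.
equivalent_forms() {
  mode=$1
  shift
  case "$mode" in
    classic) shorthand=hadoop1 ;;   # classic mode refers to Hadoop 1.x
    yarn)    shorthand=hadoop2 ;;   # yarn mode refers to Hadoop 2.x
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  echo "hadoop -$mode $*"
  echo "$shorthand $*"
}

equivalent_forms classic version
equivalent_forms yarn version
```

Each call prints a pair of command lines that should select the same MapReduce mode when run on a node with the hadoop script installed.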

Syntax Summary

The following syntax summary applies to all commands:

hadoop [-yarn|-classic] [--config confdir] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]
 
hadoop1 [--config confdir] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]
 
hadoop2 [--config confdir] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]


Hadoop has an option-parsing framework that handles generic options and runs classes.

COMMAND_OPTION

Description

-mode

Specifies the Hadoop version: yarn or classic

Alternatively, you can use a hadoop1 or hadoop2 command without setting the mode.

If you use a hadoop command (instead of hadoop1 or hadoop2) and do not set the mode, the command runs in the mode set by the MAPR_MAPREDUCE_MODE environment variable.

If this variable is not set, the command runs in the mode set in the hadoop version file on the node (default_mode = yarn or classic).

--config confdir

Overrides the default configuration directory. The default is ${HADOOP_HOME}/conf.

COMMAND

Various commands with their options are described in the following sections.

GENERIC_OPTIONS

The common set of options supported by multiple commands.

COMMAND_OPTIONS

Various command options are described in the following sections.
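The order in which the mode is resolved (explicit mode first, then the MAPR_MAPREDUCE_MODE environment variable, then the default_mode entry in the hadoop version file) can be sketched as a small shell function. This only illustrates the precedence described in the table; it is not the logic of the hadoop script itself. The function name `resolve_mode` is hypothetical, the version file path is passed in as a parameter, and the `default_mode=yarn` line format is an assumption.

```shell
# Hypothetical sketch of the mode precedence described in the table.
# $1: explicit mode ("yarn", "classic", or empty if none was given)
# $2: path to the hadoop version file (assumed to contain a line such as
#     "default_mode=yarn"; the exact format is an assumption)
resolve_mode() {
  explicit=$1
  version_file=$2
  if [ -n "$explicit" ]; then
    # 1. A mode given on the command line wins.
    echo "$explicit"
  elif [ -n "${MAPR_MAPREDUCE_MODE:-}" ]; then
    # 2. Otherwise use the environment variable.
    echo "$MAPR_MAPREDUCE_MODE"
  elif [ -r "$version_file" ]; then
    # 3. Otherwise fall back to default_mode in the version file.
    sed -n 's/^[[:space:]]*default_mode[[:space:]]*=[[:space:]]*//p' "$version_file" | head -n 1
  else
    return 1
  fi
}

resolve_mode yarn ""   # explicit mode is used regardless of the other sources
```

Passing the file path as an argument keeps the sketch independent of where the version file lives on a given node.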

Useful Information

Running the hadoop script without any arguments prints the help description for all commands.

Supported Commands for Hadoop 1.x

MapR supports the following hadoop commands for Hadoop 1.x:

Command

Description

archive -archiveName NAME <src>* <dest>

The hadoop archive command creates a Hadoop archive, a file that contains other files. A Hadoop archive always has a .har extension.

classpath

The hadoop classpath command prints the class path needed to access the Hadoop JAR and the required libraries.

conf

The hadoop conf command prints the configuration information for the current node.

daemonlog

The hadoop daemonlog command may be used to get or set the log level of Hadoop daemons.

distcp <source> <destination>

The hadoop distcp command is a tool for large inter- and intra-cluster copying. It uses MapReduce to effect its distribution, error handling and recovery, and reporting. It expands a list of files and directories into input to map tasks, each of which will copy a partition of the files specified in the source list.

fs

The hadoop fs command runs a generic filesystem user client that interacts with the MapR filesystem (MapR-FS).

jar <jar>

The hadoop jar command runs a JAR file. Users can bundle their MapReduce code in a JAR file and execute it using this command.

job

Manipulates MapReduce jobs.

mfs

The hadoop mfs command performs operations on directories in the cluster. The main purposes of hadoop mfs are to display directory information and contents, to create symbolic links, and to set compression and chunk size on a directory.

mradmin

Runs a MapReduce admin client.

pipes

Runs a pipes job.

queue

Gets information about job queues.

version

The hadoop version command prints the Hadoop software version.

Supported Commands for Hadoop 2.x

MapR supports the following hadoop commands for Hadoop 2.x:

Command

Description

archive -archiveName NAME <src>* <dest>

Creates a Hadoop archive, a file that contains other files. A Hadoop archive always has a .har extension.

CLASSNAME

The hadoop script can be used to invoke any class.

hadoop CLASSNAME runs the class named CLASSNAME.

classpath

Prints the class path needed to access the Hadoop JAR and the required libraries.

conf

The hadoop conf command prints the configuration information for the current node.

daemonlog

The hadoop daemonlog command may be used to get or set the log level of Hadoop daemons.

distcp <source> <destination>

The hadoop distcp command is a tool for large inter- and intra-cluster copying. It uses MapReduce to effect its distribution, error handling and recovery, and reporting. It expands a list of files and directories into input to map tasks, each of which will copy a partition of the files specified in the source list.

fs

The hadoop fs command runs a generic filesystem user client that interacts with the MapR filesystem (MapR-FS).

jar <jar>

The hadoop jar command runs a JAR file. Users can bundle their MapReduce code in a JAR file and execute it using this command.

mfs

The hadoop mfs command performs operations on directories in the cluster. The main purposes of hadoop mfs are to display directory information and contents, to create symbolic links, and to set compression and chunk size on a directory.

version

The hadoop version command prints the Hadoop software version.

For Hadoop 2.x, some hadoop commands are deprecated and replaced by the mapred command.

For example, if you run the hadoop job command, you see this message:

The syntax for the mapred command is:

Commands used with mapred include:

Command

Description

historyserver

Runs job history servers as a standalone daemon.

hsadmin

Runs the job history server admin interface.

job

Manipulates MapReduce jobs.

pipes

Runs a pipes job.

queue

Gets information about job queues.
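To make the hadoop-to-mapred substitution concrete, the sketch below prints the preferred Hadoop 2.x command line for a given subcommand. The helper name `preferred_command` is hypothetical. It assumes that job, pipes, and queue are the subcommands that moved to mapred, which matches the commands listed for Hadoop 1.x but absent from the Hadoop 2.x table above.

```shell
# Hypothetical helper: print the preferred Hadoop 2.x command line.
# Assumption: job, pipes, and queue are the subcommands that moved to mapred.
# Nothing is executed; the function only echoes the command line.
preferred_command() {
  case "$1" in
    job|pipes|queue) echo "mapred $*" ;;
    *)               echo "hadoop $*" ;;
  esac
}

preferred_command job -list
preferred_command fs -ls /
```

The first call prints a mapred command line and the second leaves the hadoop form unchanged, because fs is still listed as a supported hadoop command for Hadoop 2.x.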

Unsupported Commands

MapR does not support the following Hadoop commands:

  • balancer
  • datanode
  • dfsadmin
  • fsck
  • fetchdt
  • jobtracker
  • namenode
  • secondarynamenode
  • tasktracker

Generic Options

Applications that implement the Tool interface support the following generic command-line options, which are also available for many of the Hadoop commands.

The following generic options are supported by the distcp, fs, job, mradmin, pipes, and queue Hadoop commands:

Generic Option

Description

-conf <filename1 filename2 ...>

Add the specified configuration files to the list of resources available in the configuration.

-D <property=value>

Set a value for the specified Hadoop configuration property.

-fs <local|filesystem URI>

Set the URI of the default filesystem.

-jt <local|jobtracker:port>

Specify the JobTracker host and port. This option is a shortcut for -D mapred.job.tracker=host:port.

-files <file1,file2,...>

Specify a comma-separated list of files to be copied to the MapReduce cluster.

-libjars <jar1,jar2,...>

Specify JAR files to be included in the classpath of the mapper and reducer tasks.

-archives <archive1,archive2,...>

Specify archive files (JAR, tar, tar.gz, ZIP) to be copied and unarchived on the task node.
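The placement of generic options can be illustrated with a dry-run wrapper that prints each command line instead of executing it. This is a sketch under stated assumptions: the `run` helper and the `DRY_RUN` variable are hypothetical, and the property name, host, and port are placeholders rather than values from a real cluster.

```shell
# Hypothetical dry-run wrapper: print the command line by default.
# Set DRY_RUN=0 on a node where the hadoop script is installed to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Generic options follow the command name and precede the command options.
# -D sets a single property for this invocation (placeholder property).
run hadoop fs -D example.property=value -ls /

# -jt is documented as a shortcut for -D mapred.job.tracker=host:port,
# so these two lines should be equivalent (placeholder host and port).
run hadoop job -jt tracker.example.com:9001 -list
run hadoop job -D mapred.job.tracker=tracker.example.com:9001 -list
```

Keeping the wrapper in dry-run mode makes it possible to check how the options are ordered before running anything against a cluster.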
