MapR's Centralized Logging feature provides a job-centric or application-centric view of all log files generated by a MapReduce program. With centralized logging, the log files are written to local volumes in MapR-FS. You can run the maprcli job linklogs command for running or completed jobs to create a centralized log directory populated with symbolic links to all log files that pertain to the specified jobs or applications.

This section describes the following tasks related to managing centralized logs and viewing logs for completed jobs or applications:

Managing Centralized Logs for MapReduce v1 Jobs

Viewing Logs for Completed MapReduce v1 Jobs

With centralized logging, you can use the maprcli to generate a centralized view of the log files and then view them.

  1. Use the maprcli job linklogs command to create centralized logs for completed jobs.
    For example, you can run the maprcli job linklogs command to do the following:
    • To link logs for a single job (job_201204041514_0001), run maprcli job linklogs -jobid job_201204041514_0001 -todir /myvolume/joblogviewdir

    • To link logs for all jobs submitted by the current shell user, run maprcli job linklogs -jobid job_${USER} -todir /myvolume/joblogviewdir

    • To link logs for all jobs named wordcount1, run maprcli job linklogs -jobid job_*_wordcount1 -todir /myvolume/joblogviewdir

  2. Go to the directory specified by the -todir parameter to view the logs.
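    For example, if the cluster file system is also mounted through the MapR NFS gateway, you can browse the linked logs with standard shell commands. The mount point /mapr/my.cluster.com shown here is an assumption for this sketch; substitute your own cluster name and -todir path:

      # Illustrative only: recursively list the symbolic links created in step 1
      ls -R /mapr/my.cluster.com/myvolume/joblogviewdir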

Configuring Log Retention Time for MapReduce v1 Jobs

You can configure the log retention time, which defaults to one day after a job completes, in the mapred-site.xml file. The mapred-site.xml file for MapReduce v1 is located at /opt/mapr/hadoop/hadoop-0.20.2/conf/mapred-site.xml.

  • Set the value of mapred.userlog.retain.hours to the number of hours that you want the cluster to retain logs after a job completes, as shown in the example below.
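For example, a mapred-site.xml entry that retains logs for 48 hours could look like the following; the value shown is only an illustration:

  <!-- Illustrative: keep task logs for 48 hours after a job completes -->
  <property>
    <name>mapred.userlog.retain.hours</name>
    <value>48</value>
  </property>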

Disabling or Enabling Centralized Logging for MapReduce v1 Jobs

Centralized logging is disabled by default for MapReduce v1 jobs. 

For MapReduce v1, configure the HADOOP_TASKTRACKER_ROOT_LOGGER property in the hadoop-env.sh file to enable or disable Centralized Logging. The hadoop-env.sh file is located in the following directory: /opt/mapr/hadoop/hadoop-0.20.2/conf/.

  • To disable Centralized Logging, set the value of the HADOOP_TASKTRACKER_ROOT_LOGGER parameter to INFO,DRFA.
  • To enable Centralized Logging, set the value of the HADOOP_TASKTRACKER_ROOT_LOGGER parameter to INFO,maprfsDRFA (see the sketch after this list).
    If you enable centralized logging while jobs are running, restart all JobTrackers. In a production cluster, restart the JobTrackers one at a time to prevent interruption to running jobs. Jobs that are running during this process may not have centralized logging enabled.
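A minimal sketch of the corresponding hadoop-env.sh entry, assuming the export syntax normally used in that file:

  # Illustrative hadoop-env.sh line that enables centralized logging;
  # set the value to INFO,DRFA instead to disable it.
  export HADOOP_TASKTRACKER_ROOT_LOGGER="INFO,maprfsDRFA"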

Managing Centralized Logs for MapReduce v2

Viewing Logs for Completed MapReduce v2 Applications 

With centralized logging, you can use the maprcli or the HistoryServer user interface to view the logs of completed applications.

Using the Command Line to View Logs for Completed Applications

  1. Use the maprcli job linklogs command to create centralized logs for completed applications.

    For example, you can run the following maprcli job linklogs command to create centralized logs for application application_1434605941718_0001: 

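    The following sketch passes the application ID through the -jobid parameter, using the same syntax shown above for MapReduce v1 jobs; the -todir path is only an illustration:

      maprcli job linklogs -jobid application_1434605941718_0001 -todir /myvolume/logsdir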

    The centralized log directory contains symbolic links that are organized by hostname and containerID. 

  2. To determine where the logs are located, run the following command on the directory that contains the symlinks to the log files for a specific container:


    For example, if you specified logsdir as the directory, you might issue a command similar to the following, and the results will display the location of the log files:

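    A minimal sketch, assuming the cluster is mounted over NFS at /mapr/my.cluster.com and that the links were created under /myvolume/logsdir as in step 1; the <hostname> and <containerID> placeholders stand in for the subdirectories on your cluster:

      # Illustrative: list the symbolic links for one container
      ls -l /mapr/my.cluster.com/myvolume/logsdir/<hostname>/<containerID>/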

    The link location appears after the arrow.
     

  3. To determine the types of log files that are available for this container and the path to each available log file, run the following command:

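    One way to do this, assuming the NFS mount from the previous step; <link-target> is a placeholder for the directory that the symbolic link points to:

      # Illustrative: print the full path of each log file in the container's log directory
      ls -d <link-target>/*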


    In this example, the path to the syslog is the only one displayed in the output. However, the stdout or stderr logs may also be available, depending on what is generated by the application.
     

  4. Use one of the following options to view the contents of a log file:

    • To view the end of the log file: 

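      For example, with a placeholder path taken from the output of step 3:

        # Illustrative: show the last 100 lines of the container's syslog
        tail -n 100 <link-target>/syslog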
    • To view the entire log file: 

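      For example, again with a placeholder path:

        # Illustrative: print the whole log file
        cat <link-target>/syslog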

Using the HistoryServer UI to View Logs for Completed Applications

  1. Log on to the MapR Control System.

  2. In the Navigation Pane, click JobHistoryServer.

  3. Click the Job ID link for the job whose logs you want to view.

  4. In the Logs column of the Application Master section, click the logs link.

Configuring Log Retention Time for MapReduce v2 Applications

  • Set the value of yarn.log-aggregation.retain-seconds in yarn-site.xml to the number of seconds that you want to retain the logs after an application starts (see the example after this list). The value defaults to 30 days (2592000 seconds). The value that you set for this property also applies to the retention of aggregated YARN logs.
    The yarn-site.xml file is in the following directory: /opt/mapr/hadoop/hadoop-2.x.x/etc/hadoop
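For example, a yarn-site.xml entry that retains logs for 30 days could look like the following; the value shown is only an illustration:

  <!-- Illustrative: retain logs for 30 days (2592000 seconds) -->
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>2592000</value>
  </property>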

Disabling or Enabling Centralized Logging for MapReduce v2

As of 4.0.2, you can use centralized logging for MapReduce v2 applications but it is disabled by default. In 4.0.1, centralized logging is not supported for MapReduce v2 applications.

Configure the yarn.use-central-logging-for-mapreduce-only property in the yarn-site.xml file to enable or disable centralized logging. The yarn-site.xml file is located in the following directory:  /opt/mapr/hadoop/hadoop-2.x.x/etc/hadoop/.  

  • To disable centralized logging, remove the yarn.use-central-logging-for-mapreduce-only property from yarn-site.xml, or set its value to false.

  • To enable centralized logging, set the value of yarn.use-central-logging-for-mapreduce-only to true in yarn-site.xml (see the example after this list).
    If you enable centralized logging while applications are running, restart all ResourceManagers. In a production cluster, restart the ResourceManagers one at a time to prevent interruption to running applications. Applications that are running during this process may not have centralized logging enabled.
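A minimal sketch of the corresponding yarn-site.xml entry, using standard Hadoop property syntax:

  <property>
    <name>yarn.use-central-logging-for-mapreduce-only</name>
    <value>true</value>
  </property>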

     

 
