
As of Hue 3.8.1-1507 and Spark 1.3.1, you can configure Hue to use the Spark Notebook UI. This allows users to submit Spark jobs from Hue.   

Note: Spark Notebook is a beta feature that uses the Spark REST Job Server (Livy).

Complete the following steps as the root user or by using sudo:

  1. Install the mapr-hue-livy package on the node where you have installed the mapr-spark package and configured Spark.
    • On Ubuntu:
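      For example (a sketch, assuming the MapR package repository is already configured on the node):

        apt-get install mapr-hue-livy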

    • On RedHat/ CentOS:
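      Similarly (same assumption about the package repository):

        yum install mapr-hue-livy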

      Note: If you do not install the mapr-hue-livy package on a node where the mapr-spark package is installed, the Livy service will not start.
  2. For Spark 1.3.1: Copy javax.servlet-api-3.1.0.jar to the Spark lib directory (/opt/mapr/spark/spark-<version>/lib).
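     For example (a sketch; the source path shown is an assumption, so adjust it to wherever the jar resides on your node):

       cp /opt/mapr/lib/javax.servlet-api-3.1.0.jar /opt/mapr/spark/spark-<version>/lib/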

     

  3. In the spark-env.sh file, set the SPARK_SUBMIT_CLASSPATH environment variable so that the classpath to the servlet jar appears before MAPR_SPARK_CLASSPATH.

    Example (a sketch, assuming the jar was copied to the Spark lib directory in step 2):
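      # Place the servlet jar on the classpath ahead of MAPR_SPARK_CLASSPATH
      export SPARK_SUBMIT_CLASSPATH=/opt/mapr/spark/spark-<version>/lib/javax.servlet-api-3.1.0.jar:$MAPR_SPARK_CLASSPATH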
  4. In the [spark] section of the hue.ini, set the livy_server_host parameter to the host where the Livy server is running.

    Example (substitute the hostname of the node where the Livy server runs):
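      [spark]
      livy_server_host=<livy-node-hostname>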
    Note: If the Livy server runs on the same node as the Hue UI, you do not need to set this property; the value defaults to the local host.
  5. If Spark jobs run on YARN, perform the following steps:

    1. In the [spark] section of the hue.ini, set livy_server_session_kind to yarn on the node where the Livy server is running.

      Example (in the hue.ini on the Livy node):
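        [spark]
        livy_server_session_kind=yarn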
    2. For Hue 3.9.0: Set the HUE_HOME and the HADOOP_CONF_DIR environment variables in the hue.sh file (/opt/mapr/hue/hue-<version>/bin/hue.sh), as shown in the sketch after the note below.

      Note: If you do not set these environment variables, the following error appears on the Check Configuration page: The app won't work without running Livy Spark Server.
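
      For example, in hue.sh (a sketch; the HADOOP_CONF_DIR path assumes the default MapR Hadoop layout, so adjust both versions for your cluster):

        export HUE_HOME=/opt/mapr/hue/hue-<version>
        export HADOOP_CONF_DIR=/opt/mapr/hadoop/hadoop-<version>/etc/hadoop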


  6. Restart the Spark REST Job Server (Livy).

  7. Restart Hue.

  8. Restart Spark.

Additional Information

  • To access the Notebook UI, select Spark from the Query Editor in the Hue interface.
  • If needed, you can use the MCS or maprcli to start, stop, or restart the Livy Server. For more information, see Starting, Stopping, and Restarting Services.
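    For example, with maprcli (a sketch; the Livy service name is an assumption, so verify the name registered on your cluster before running this):

      maprcli node services -name livy -action restart -nodes <livy-node-hostname>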

Troubleshooting Tip

If you have more than one version of Python installed, you may see the following error when executing Python samples:
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe...

Workaround:
Set the following environment variables in /opt/mapr/spark/spark-<version>/conf/spark-env.sh:

# Point both the PySpark workers and the driver at the same Python interpreter
export PYSPARK_PYTHON=/usr/bin/python2.7
export PYSPARK_DRIVER_PYTHON=/usr/bin/python2.7
