Spark History Server

Spark History Server is the web UI for completed and running (aka incomplete) Spark applications. It is an extension of Spark’s web UI.

Figure 1. History Server’s web UI
Tip
Enable event collection in your Spark applications by setting the spark.eventLog.enabled Spark property to true.
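
As a minimal sketch of the tip above, the following shell snippet writes the event-log properties to a properties file. The CONF_DIR variable and its scratch-directory default are assumptions for illustration (a real installation would typically use $SPARK_HOME/conf). The spark.eventLog.dir property is not mentioned above; it is set here so that applications write to the directory History Server reads by default.

```shell
# Hypothetical scratch directory; in a real installation use "$SPARK_HOME/conf".
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Turn event logging on and point it at the directory History Server reads.
cat > "$CONF_DIR/spark-defaults.conf" <<'EOF'
spark.eventLog.enabled   true
spark.eventLog.dir       file:/tmp/spark-events
EOF

# Show what was written.
cat "$CONF_DIR/spark-defaults.conf"
```

The same properties can also be passed per application, e.g. with spark-submit --conf spark.eventLog.enabled=true.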

You can start History Server by executing the start-history-server.sh shell script and stop it using stop-history-server.sh.

start-history-server.sh accepts the --properties-file [propertiesFile] command-line option, which specifies a properties file with custom Spark properties.

If not specified explicitly, Spark History Server uses the default configuration file, i.e. spark-defaults.conf.
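
A hedged sketch of starting History Server with a custom properties file follows. The file contents, the port 18081, and the /opt/spark fallback for SPARK_HOME are illustrative assumptions. The script only prints the command when no Spark installation is found at that location.

```shell
# Assumed installation directory; adjust to your environment.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"

# Hypothetical properties file with custom History Server settings.
PROPS_FILE="$(mktemp)"
cat > "$PROPS_FILE" <<'EOF'
spark.history.ui.port          18081
spark.history.fs.logDirectory  file:/tmp/spark-events
EOF

START_SCRIPT="$SPARK_HOME/sbin/start-history-server.sh"
if [ -x "$START_SCRIPT" ]; then
  # Start the daemon with the custom properties instead of spark-defaults.conf.
  "$START_SCRIPT" --properties-file "$PROPS_FILE"
else
  # No Spark installation here: just show the command that would be run.
  echo "Would run: $START_SCRIPT --properties-file $PROPS_FILE"
fi
```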

Tip

Enable the INFO logging level for the org.apache.spark.deploy.history logger to see what happens inside.

Add the following line to conf/log4j.properties: log4j.logger.org.apache.spark.deploy.history=INFO

Refer to Logging.

Starting History Server — start-history-server.sh script

You can start a HistoryServer instance by executing the $SPARK_HOME/sbin/start-history-server.sh script (where SPARK_HOME is the directory of your Spark installation).

Internally, the start-history-server.sh script launches the org.apache.spark.deploy.history.HistoryServer standalone application (using the spark-daemon.sh shell script).

Tip
Starting Spark History Server explicitly with spark-class can make execution easier to trace, because the logs are printed to standard output and hence directly to the terminal.
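
The spark-class approach from the tip above could look roughly like this. The /opt/spark fallback for SPARK_HOME is an assumption, and the snippet only prints the command when spark-class is not found there.

```shell
# Assumed installation directory; adjust to your environment.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"

# Main class of History Server, as launched by start-history-server.sh.
HISTORY_SERVER_CLASS="org.apache.spark.deploy.history.HistoryServer"
SPARK_CLASS="$SPARK_HOME/bin/spark-class"

if [ -x "$SPARK_CLASS" ]; then
  # Runs in the foreground, so logs go straight to the terminal (stop with Ctrl-C).
  "$SPARK_CLASS" "$HISTORY_SERVER_CLASS"
else
  # No Spark installation here: just show the command that would be run.
  echo "Would run: $SPARK_CLASS $HISTORY_SERVER_CLASS"
fi
```

Unlike start-history-server.sh, this does not detach the process as a daemon, which is what makes the output visible in the terminal.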

When started, it prints an INFO message to the logs.

It registers signal handlers (using SignalUtils) for TERM, HUP, and INT to log when these signals are received.

It initializes security if enabled (using the spark.history.kerberos.enabled setting).

Caution
FIXME Describe initSecurity

It creates a SecurityManager.

It creates a HistoryServer and requests it to bind to the port given by spark.history.ui.port.

Tip

The host’s IP can be specified using the SPARK_LOCAL_IP environment variable (defaults to 0.0.0.0).

You should see an INFO message in the logs once the server is bound.

It registers a shutdown hook to call stop on the HistoryServer instance.

Tip
Use the stop-history-server.sh shell script to stop a running History Server.

Stopping History Server — stop-history-server.sh script

You can stop a running instance of HistoryServer using the $SPARK_HOME/sbin/stop-history-server.sh shell script.

Settings

Table 1. Spark Properties
| Setting | Default Value | Description |
|---|---|---|
| spark.history.ui.port | 18080 | The port of the History Server’s UI. |
| spark.history.fs.logDirectory | file:/tmp/spark-events | The directory with the event logs. The directory has to exist before starting History Server. |
| spark.history.retainedApplications | 50 | How many Spark applications to retain. |
| spark.history.ui.maxApplications | (unbounded) | How many Spark applications to show in the UI. |
| spark.history.kerberos.enabled | false | Enable security when working with HDFS with security enabled (Kerberos). |
| spark.history.kerberos.principal | (empty) | Kerberos principal. Required when spark.history.kerberos.enabled is enabled. |
| spark.history.kerberos.keytab | (empty) | Keytab to use for login to Kerberos. Required when spark.history.kerberos.enabled is enabled. |
| spark.history.provider | org.apache.spark.deploy.history.FsHistoryProvider | The fully-qualified class name for an ApplicationHistoryProvider. |
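
Because the event log directory has to exist before History Server starts, a small preparation step can be sketched as follows. LOG_DIR is a hypothetical variable; its fallback is the default spark.history.fs.logDirectory location with the file: scheme stripped, which assumes a local filesystem.

```shell
# Local path behind the default spark.history.fs.logDirectory (file:/tmp/spark-events).
LOG_DIR="${LOG_DIR:-/tmp/spark-events}"

# Create the directory if it is missing; -p makes this a no-op when it exists.
mkdir -p "$LOG_DIR"

# Fail early with a clear message rather than at History Server startup.
if [ -d "$LOG_DIR" ]; then
  echo "Event log directory ready: $LOG_DIR"
else
  echo "Cannot create event log directory: $LOG_DIR" >&2
  exit 1
fi
```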
