Exercise: Causing Stage to Fail
This exercise shows how Spark re-executes a stage when it fails, e.g. after losing an executor mid-job.
Recipe
Start a Spark cluster, e.g. a single-node Hadoop YARN cluster.
```bash
start-yarn.sh
```
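Optionally, confirm the cluster is up before moving on. A minimal check, assuming the Hadoop binaries are on your PATH:

```bash
# The daemons started by start-yarn.sh show up as JVM processes
jps -l | grep -iE 'resourcemanager|nodemanager'

# Ask YARN which NodeManagers have registered
yarn node -list
```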
Use at least 2 executors so that you can kill one and keep the application up and running on the remaining one.

```bash
YARN_CONF_DIR=hadoop-conf ./bin/spark-shell --master yarn \
  -c spark.shuffle.service.enabled=true \
  --num-executors 2
```

In the shell, run a 2-stage job:

```scala
// 2-stage job -- it _appears_ that a stage can be failed only when there is a shuffle
sc.parallelize(0 to 3e3.toInt, 2).map(n => (n % 2, n)).groupByKey.count
```
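The recipe stops short of the failure itself. One way to trigger it, sketched below assuming the executors run on a machine you have shell access to: while the job is running, find an executor JVM and kill it (on YARN, Spark executors run as CoarseGrainedExecutorBackend processes).

```bash
# Spark executors run as CoarseGrainedExecutorBackend JVMs
jps -l | grep CoarseGrainedExecutorBackend

# Kill one of them -- 12345 is a placeholder for a PID from the listing above
kill -9 12345
```

If the killed executor held shuffle output that a reduce task still needs, the fetch failure makes the DAGScheduler mark the stage as failed and resubmit it, which you should be able to observe in the driver logs and under the Stages tab of the web UI (http://localhost:4040 by default). Note that with the external shuffle service enabled, shuffle files are served by the NodeManager and may survive the executor, so a fetch failure, and hence a stage retry rather than a mere task retry, is not guaranteed; experiment with the timing of the kill.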