TaskResults — DirectTaskResult and IndirectTaskResult
TaskResult models a task result. It has exactly two concrete implementations:
- DirectTaskResult is the TaskResult to be serialized and sent over the wire to the driver together with the result bytes and accumulators.
- IndirectTaskResult is the TaskResult that is just a pointer to a task result in a BlockManager.
Which concrete TaskResult to use is decided when a TaskRunner finishes running a task and checks the size of the result.
Note: The types are private[spark].
DirectTaskResult Task Result
    DirectTaskResult[T](
      var valueBytes: ByteBuffer,
      var accumUpdates: Seq[AccumulatorV2[_, _]])
    extends TaskResult[T] with Externalizable
DirectTaskResult is the TaskResult of running a task (later returned serialized to the driver) when the size of the task’s result is smaller than both spark.driver.maxResultSize and the smaller of spark.task.maxDirectResultSize and spark.rpc.message.maxSize.
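The size check described above can be sketched as follows. This is a minimal illustration, not Spark's code: `ResultKind`, `TaskResultDecision`, and `choose` are hypothetical names, the byte values in `main` are illustrative only, and two details are assumptions: a `maxResultSize` of 0 is read as "no limit", and a result exactly at a limit is still sent directly.

```scala
// Hypothetical stand-ins for the two private[spark] result types.
sealed trait ResultKind
case object Direct extends ResultKind
case object Indirect extends ResultKind

object TaskResultDecision {
  // All sizes are in bytes.
  def choose(
      resultSize: Long,
      maxResultSize: Long,        // models spark.driver.maxResultSize
      maxDirectResultSize: Long,  // models spark.task.maxDirectResultSize
      maxRpcMessageSize: Long     // models spark.rpc.message.maxSize
  ): ResultKind = {
    // "Whichever is smaller" of the two direct-result limits.
    val directLimit = math.min(maxDirectResultSize, maxRpcMessageSize)
    // Assumption: 0 disables the driver-side limit.
    val exceedsDriverLimit = maxResultSize > 0 && resultSize > maxResultSize
    if (exceedsDriverLimit || resultSize > directLimit) Indirect else Direct
  }

  def main(args: Array[String]): Unit = {
    val mib = 1024L * 1024L
    // Illustrative limits: 1024 MiB driver limit, 1 MiB direct limit, 128 MiB RPC limit.
    println(choose(resultSize = 512L * 1024L, 1024 * mib, 1 * mib, 128 * mib))
    println(choose(resultSize = 4 * mib, 1024 * mib, 1 * mib, 128 * mib))
  }
}
```

With these illustrative limits, a 512 KiB result stays under every threshold and is chosen as direct, while a 4 MiB result exceeds the 1 MiB direct limit and is chosen as indirect.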
Note: DirectTaskResult implements Java’s java.io.Externalizable.
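To make the Externalizable contract concrete, here is a minimal sketch of a class that writes and reads its own bytes. `ResultEnvelope` and `roundTrip` are hypothetical names used only for illustration; this is not DirectTaskResult's implementation and it leaves out accumulators entirely. It is meant to be compiled as a regular source file.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, Externalizable}
import java.io.{ObjectInput, ObjectInputStream, ObjectOutput, ObjectOutputStream}
import java.nio.ByteBuffer

// Hypothetical stand-in that carries only the result bytes.
class ResultEnvelope(var valueBytes: ByteBuffer) extends Externalizable {
  // Externalizable requires a public no-arg constructor for deserialization.
  def this() = this(null)

  override def writeExternal(out: ObjectOutput): Unit = {
    val bytes = new Array[Byte](valueBytes.remaining())
    // duplicate() shares the content but keeps the original position untouched.
    valueBytes.duplicate().get(bytes)
    out.writeInt(bytes.length)
    out.write(bytes)
  }

  override def readExternal(in: ObjectInput): Unit = {
    val bytes = new Array[Byte](in.readInt())
    in.readFully(bytes)
    valueBytes = ByteBuffer.wrap(bytes)
  }
}

object ResultEnvelope {
  // Serializes and deserializes an envelope through Java object streams.
  def roundTrip(original: ResultEnvelope): ResultEnvelope = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(original) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject().asInstanceOf[ResultEnvelope] finally in.close()
  }
}
```

Unlike a plain Serializable class, an Externalizable class decides exactly which fields are written and in what order, so `readExternal` must read them back in the same order as `writeExternal` wrote them.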
IndirectTaskResult Task Result
    IndirectTaskResult[T](blockId: BlockId, size: Int)
    extends TaskResult[T] with Serializable
IndirectTaskResult is a TaskResult that…
Note: IndirectTaskResult implements Java’s java.io.Serializable.