关注 spark技术分享,
撸spark源码 玩spark最佳实践

BlockManagerSlaveEndpoint

BlockManagerSlaveEndpoint

BlockManagerSlaveEndpoint is a thread-safe RPC endpoint for remote communication between executors and the driver.

Caution
FIXME the intro needs more love.

While a BlockManager is being created so is the BlockManagerSlaveEndpoint RPC endpoint with the name BlockManagerEndpoint[randomId] to handle RPC messages.

Tip

Enable DEBUG logging level for org.apache.spark.storage.BlockManagerSlaveEndpoint logger to see what happens inside.

Add the following line to conf/log4j.properties:

Refer to Logging.

RemoveBlock Message

When a RemoveBlock message comes in, you should see the following DEBUG message in the logs:

Note
Handling RemoveBlock messages happens on a separate thread. See BlockManagerSlaveEndpoint Thread Pool.

When the computation is successful, you should see the following DEBUG in the logs:

And true response is sent back. You should see the following DEBUG in the logs:

In case of failure, you should see the following ERROR in the logs and the stack trace.

RemoveRdd Message

When a RemoveRdd message comes in, you should see the following DEBUG message in the logs:

Note
Handling RemoveRdd messages happens on a separate thread. See BlockManagerSlaveEndpoint Thread Pool.

When the computation is successful, you should see the following DEBUG in the logs:

And the number of blocks removed is sent back. You should see the following DEBUG in the logs:

In case of failure, you should see the following ERROR in the logs and the stack trace.

RemoveShuffle Message

When a RemoveShuffle message comes in, you should see the following DEBUG message in the logs:

If MapOutputTracker was given (when the RPC endpoint was created), it calls MapOutputTracker to unregister the shuffleId shuffle.

Note
Handling RemoveShuffle messages happens on a separate thread. See BlockManagerSlaveEndpoint Thread Pool.

When the computation is successful, you should see the following DEBUG in the logs:

And the result is sent back. You should see the following DEBUG in the logs:

In case of failure, you should see the following ERROR in the logs and the stack trace.

Note
RemoveShuffle is posted when BlockManagerMaster and BlockManagerMasterEndpoint remove all blocks for a shuffle.

RemoveBroadcast Message

When a RemoveBroadcast message comes in, you should see the following DEBUG message in the logs:

Note
Handling RemoveBroadcast messages happens on a separate thread. See BlockManagerSlaveEndpoint Thread Pool.

When the computation is successful, you should see the following DEBUG in the logs:

And the result is sent back. You should see the following DEBUG in the logs:

In case of failure, you should see the following ERROR in the logs and the stack trace.

GetBlockStatus Message

When a GetBlockStatus message comes in, it responds with the result of calling BlockManager about the status of blockId.

GetMatchingBlockIds Message

When received a GetMatchingBlockIds, BlockManagerSlaveEndpoint requests BlockManager to find IDs of existing blocks for a given filter and sends them back.

TriggerThreadDump Message

When a TriggerThreadDump message comes in, a thread dump is generated and sent back.

BlockManagerSlaveEndpoint Thread Pool

BlockManagerSlaveEndpoint uses block-manager-slave-async-thread-pool daemon thread pool (asyncThreadPool) for some messages to talk to other Spark services, i.e. BlockManager, MapOutputTracker, ShuffleManager in a non-blocking, asynchronous way.

The reason for the async thread pool is that the block-related operations might take quite some time and to release the main RPC thread other threads are spawned to talk to the external services and pass responses on to the clients.

Note
BlockManagerSlaveEndpoint uses Java’s java.util.concurrent.ThreadPoolExecutor.
赞(0) 打赏
未经允许不得转载:spark技术分享 » BlockManagerSlaveEndpoint
分享到: 更多 (0)

关注公众号:spark技术分享

联系我们联系我们

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

微信扫一扫打赏