关注 spark技术分享,
撸spark源码 玩spark最佳实践

StorageStatusListener Spark Listener

StorageStatusListener — Spark Listener for Tracking BlockManagers

StorageStatusListener is a SparkListener that uses SparkListener callbacks to track status of every BlockManager in a Spark application.

StorageStatusListener is created and registered when SparkUI is created. It is later used to create ExecutorsListener and StorageListener Spark listeners.

Table 1. StorageStatusListener’s SparkListener Callbacks (in alphabetical order)
Callback Description

onBlockManagerAdded

Adds an executor id with StorageStatus (with BlockManager and maximum memory on the executor) to executorIdToStorageStatus internal registry.

Removes any other BlockManager that may have been registered for the executor earlier in deadExecutorStorageStatus internal registry.

onBlockManagerRemoved

Removes an executor from executorIdToStorageStatus internal registry and adds the removed StorageStatus to deadExecutorStorageStatus internal registry.

Removes the oldest StorageStatus when the number of entries in deadExecutorStorageStatus is bigger than spark.ui.retainedDeadExecutors.

onBlockUpdated

Updates StorageStatus for an executor in executorIdToStorageStatus internal registry, i.e. removes a block for NONE storage level and updates otherwise.

onUnpersistRDD

Removes the RDD blocks for an unpersisted RDD (on every BlockManager registered as StorageStatus in executorIdToStorageStatus internal registry).

Table 2. StorageStatusListener’s Internal Registries and Counters
Name Description

deadExecutorStorageStatus

Collection of StorageStatus of removed/inactive BlockManagers.

Accessible using deadStorageStatusList method.

Adds an element when StorageStatusListener handles a BlockManager being removed (possibly removing one element from the head when the number of elements are above spark.ui.retainedDeadExecutors property).

Removes an element when StorageStatusListener handles a new BlockManager (per executor) so the executor is not longer dead.

executorIdToStorageStatus

Lookup table of StorageStatus per executor (including the driver).

Adds an entry when StorageStatusListener handles a new BlockManager.

Removes an entry when StorageStatusListener handles a BlockManager being removed.

Updates StorageStatus of an executor when StorageStatusListener handles StorageStatus updates.

Updating Storage Status For Executor — updateStorageStatus Method

Caution
FIXME

Active BlockManagers (on Executors) — storageStatusList Method

storageStatusList gives a collection of StorageStatus (from executorIdToStorageStatus internal registry).

Note

storageStatusList is used when:

deadStorageStatusList Method

deadStorageStatusList gives deadExecutorStorageStatus internal registry.

Note
deadStorageStatusList is used when ExecutorsListener is requested for inactive/dead BlockManagers.

Removing RDD Blocks for Unpersisted RDD — updateStorageStatus Internal Method

updateStorageStatus takes active BlockManagers.

updateStorageStatus then finds RDD blocks for unpersistedRDDId RDD (for every BlockManager) and removes the blocks.

Note
storageStatusList is used exclusively when StorageStatusListener is notified that an RDD was unpersisted.
赞(0) 打赏
未经允许不得转载:spark技术分享 » StorageStatusListener Spark Listener
分享到: 更多 (0)

关注公众号:spark技术分享

联系我们联系我们

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

微信扫一扫打赏