StorageStatus

StorageStatus is a developer API that Spark uses to pass “just enough” information about the BlockManagers registered in a Spark application between Spark services (mostly for monitoring purposes, such as the web UI or SparkListeners).

Note

There are two ways to access StorageStatus about all the known BlockManagers in a Spark application:

StorageStatus is created when:

Table 1. StorageStatus’s Internal Registries and Counters

Name            Description
_nonRddBlocks   Lookup table of BlockStatus per BlockId. Used when…FIXME
_rddBlocks      Lookup table of BlockIds with BlockStatus per RDD id. Used when…FIXME
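The shape of the two registries may be easier to see in code. The following is a minimal, self-contained sketch that does not use Spark's own classes: BlockId, RDDBlockId, BroadcastBlockId and BlockStatus are simplified stand-ins, and Registries and register are hypothetical names introduced only for illustration.

```scala
import scala.collection.mutable

object RegistriesSketch {
  // Simplified stand-ins for Spark's block types (assumptions, not the real classes).
  sealed trait BlockId
  final case class RDDBlockId(rddId: Int, splitIndex: Int) extends BlockId
  final case class BroadcastBlockId(broadcastId: Long) extends BlockId
  final case class BlockStatus(memSize: Long, diskSize: Long)

  // Hypothetical holder for the two lookup tables described in Table 1.
  final class Registries {
    // RDD id -> (block id -> block status)
    val rddBlocks = mutable.HashMap.empty[Int, mutable.Map[BlockId, BlockStatus]]
    // block id -> block status, for every block that is not an RDD block
    val nonRddBlocks = mutable.HashMap.empty[BlockId, BlockStatus]

    // Hypothetical helper: file a block under the registry that matches its type.
    def register(blockId: BlockId, status: BlockStatus): Unit = blockId match {
      case RDDBlockId(rddId, _) =>
        rddBlocks
          .getOrElseUpdate(rddId, mutable.HashMap.empty[BlockId, BlockStatus])
          .update(blockId, status)
      case _ =>
        nonRddBlocks.update(blockId, status)
    }
  }
}
```

Grouping RDD blocks by RDD id first keeps per-RDD lookups cheap, at the cost of a nested map.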

updateStorageInfo Method

Caution
FIXME

Creating StorageStatus Instance

StorageStatus takes the following when created:

StorageStatus initializes the internal registries and counters.

Getting RDD Blocks For RDD — rddBlocksById Method

rddBlocksById gives the blocks (as BlockId with their status as BlockStatus) that belong to rddId RDD.
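As a rough illustration, the lookup could be sketched as below. This is a simplified model rather than Spark's code: the block types are stand-ins, StatusSketch and register are hypothetical names, and returning an empty map for an unknown RDD id is an assumption of this sketch.

```scala
import scala.collection.mutable

object RddBlocksByIdSketch {
  // Simplified stand-ins for Spark's block types (assumptions, not the real classes).
  sealed trait BlockId
  final case class RDDBlockId(rddId: Int, splitIndex: Int) extends BlockId
  final case class BlockStatus(memSize: Long, diskSize: Long)

  // Hypothetical, cut-down model of StorageStatus that tracks only RDD blocks.
  final class StatusSketch {
    private val _rddBlocks =
      mutable.HashMap.empty[Int, mutable.Map[BlockId, BlockStatus]]

    // Hypothetical helper so the sketch has data to look up.
    def register(id: RDDBlockId, status: BlockStatus): Unit =
      _rddBlocks
        .getOrElseUpdate(id.rddId, mutable.HashMap.empty[BlockId, BlockStatus])
        .update(id, status)

    // Blocks (with their statuses) of the given RDD; empty if the RDD is unknown.
    def rddBlocksById(rddId: Int): Map[BlockId, BlockStatus] =
      _rddBlocks.get(rddId).map(_.toMap).getOrElse(Map.empty[BlockId, BlockStatus])
  }
}
```

The sketch returns an immutable copy so that callers cannot modify the internal registry through the result.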

Note

rddBlocksById is used when:

Removing Block (From Internal Registries) — removeBlock Internal Method

removeBlock removes blockId from the internal registry that tracks it (_rddBlocks or _nonRddBlocks) and returns the status of the removed block, if it was registered.

Internally, removeBlock first updates the block status of blockId (to be empty, i.e. removed).

removeBlock then branches on the type of blockId, i.e. RDDBlockId or not.

For an RDDBlockId, removeBlock finds the RDD in _rddBlocks and removes the blockId. If the RDD has no more blocks registered, removeBlock removes the RDD from _rddBlocks completely.

For a non-RDDBlockId, removeBlock removes blockId from the _nonRddBlocks registry.
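The branching described above can be sketched as follows. This is an illustrative model, not Spark's implementation: the block types are simplified stand-ins, StatusSketch, register and trackedRddIds are hypothetical names, the Option return type is an assumption, and the initial status update (updateStorageInfo) is left out because it is not described here.

```scala
import scala.collection.mutable

object RemoveBlockSketch {
  // Simplified stand-ins for Spark's block types (assumptions, not the real classes).
  sealed trait BlockId
  final case class RDDBlockId(rddId: Int, splitIndex: Int) extends BlockId
  final case class BroadcastBlockId(broadcastId: Long) extends BlockId
  final case class BlockStatus(memSize: Long, diskSize: Long)

  final class StatusSketch {
    private val _rddBlocks =
      mutable.HashMap.empty[Int, mutable.Map[BlockId, BlockStatus]]
    private val _nonRddBlocks = mutable.HashMap.empty[BlockId, BlockStatus]

    // Hypothetical helper so the sketch has something to remove.
    def register(blockId: BlockId, status: BlockStatus): Unit = blockId match {
      case RDDBlockId(rddId, _) =>
        _rddBlocks
          .getOrElseUpdate(rddId, mutable.HashMap.empty[BlockId, BlockStatus])
          .update(blockId, status)
      case _ =>
        _nonRddBlocks.update(blockId, status)
    }

    // Removes the block and returns its status, or None if it was not registered.
    def removeBlock(blockId: BlockId): Option[BlockStatus] = blockId match {
      case RDDBlockId(rddId, _) =>
        _rddBlocks.get(rddId).flatMap { blocks =>
          val removed = blocks.remove(blockId)
          // Drop the RDD entry once its last block is gone.
          if (blocks.isEmpty) _rddBlocks.remove(rddId)
          removed
        }
      case _ =>
        _nonRddBlocks.remove(blockId)
    }

    // Hypothetical accessor used only to observe the registry from outside.
    def trackedRddIds: Set[Int] = _rddBlocks.keySet.toSet
  }
}
```

Removing the last block of an RDD also clears that RDD's entry, so an empty inner map never lingers in the registry.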

Note
removeBlock is used when StorageStatusListener removes RDD blocks for an unpersisted RDD or updates storage status for an executor.