BlockDataManager — Block Storage Management API

BlockDataManager is the contract for managing storage for blocks of data (aka block storage management API).

Note
BlockDataManager is a private[spark] contract.
Table 1. BlockDataManager Contract

| Method | Description | Used when |
|---|---|---|
| getBlockData | Fetches the data of a local block by blockId. | |
| putBlockData | Stores (uploads) block data locally under blockId. Returns true if the operation succeeded and false if it failed. | … FIXME |
| releaseLock | Releases the lock acquired by getBlockData and putBlockData. | … FIXME |
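
The contract above can be pictured with a small self-contained sketch. This is not Spark's actual trait: the real methods work with BlockId and ManagedBuffer and take additional parameters that vary by Spark version. Here BlockId and BlockData are simplified type aliases, and InMemoryBlockDataManager and isLocked are hypothetical names introduced only for illustration. The choice to return false for an already-stored block, and the single "held" flag per block, are assumptions of this sketch.

```scala
import scala.collection.mutable

object BlockDataManagerSketch {
  // Simplified stand-ins (assumptions): a String name instead of Spark's BlockId,
  // and a byte array instead of Spark's ManagedBuffer.
  type BlockId = String
  type BlockData = Array[Byte]

  // The three operations from Table 1, with simplified signatures.
  trait BlockDataManager {
    def getBlockData(blockId: BlockId): BlockData
    def putBlockData(blockId: BlockId, data: BlockData): Boolean
    def releaseLock(blockId: BlockId): Unit
  }

  // Hypothetical in-memory implementation; not thread-safe, illustration only.
  final class InMemoryBlockDataManager extends BlockDataManager {
    private val blocks = mutable.Map.empty[BlockId, BlockData]
    private val locked = mutable.Set.empty[BlockId]

    // Returns the stored bytes and marks the block as held; throws if it is missing.
    override def getBlockData(blockId: BlockId): BlockData = {
      val data = blocks.getOrElse(
        blockId, throw new NoSuchElementException(s"Block $blockId not found"))
      locked += blockId
      data
    }

    // Stores the bytes and marks the block as held.
    // Returns false (and stores nothing) if the block already exists.
    override def putBlockData(blockId: BlockId, data: BlockData): Boolean =
      if (blocks.contains(blockId)) {
        false
      } else {
        blocks(blockId) = data
        locked += blockId
        true
      }

    // Clears the "held" mark set by getBlockData or putBlockData.
    override def releaseLock(blockId: BlockId): Unit = {
      locked -= blockId
      ()
    }

    // Helper for inspection in this sketch only.
    def isLocked(blockId: BlockId): Boolean = locked.contains(blockId)
  }

  def main(args: Array[String]): Unit = {
    val manager = new InMemoryBlockDataManager
    val stored = manager.putBlockData("rdd_0_0", Array[Byte](1, 2, 3))
    manager.releaseLock("rdd_0_0")
    val bytes = manager.getBlockData("rdd_0_0")
    manager.releaseLock("rdd_0_0")
    println(s"stored=$stored, size=${bytes.length}")
  }
}
```

The usage in main pairs every get or put with a releaseLock call, which is the pattern the contract's third method implies.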

Blocks are identified by a BlockId, which has a globally unique identifier (name), and are stored as ManagedBuffer.

Table 2. BlockIds

| Name | Description |
|---|---|
| RDDBlockId | Described by an RDD ID (rddId) and a partition index (splitIndex). Created when an RDD is requested to get or compute an RDD partition (identified by splitIndex). |
| ShuffleBlockId | Described by shuffleId, mapId and reduceId. |
| ShuffleDataBlockId | Described by shuffleId, mapId and reduceId. |
| ShuffleIndexBlockId | Described by shuffleId, mapId and reduceId. |
| BroadcastBlockId | Described by broadcastId and an optional field. |
| TaskResultBlockId | Described by taskId. |
| StreamBlockId | Described by streamId and uniqueId. |
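
To make the "globally unique identifier (name)" concrete, here is a self-contained sketch that models the block IDs listed above as case classes and derives a name from their fields. It does not use Spark's own classes. The field types and the name formats (for example rdd_1_2) are assumptions based on a reading of Spark's BlockId.scala and may differ between Spark versions, so check them against the version you run.

```scala
object BlockIdSketch {
  // Every block ID exposes a globally unique name built from its fields.
  sealed abstract class BlockId {
    def name: String
  }

  // Name formats below are assumptions modeled on Spark's BlockId.scala.
  case class RDDBlockId(rddId: Int, splitIndex: Int) extends BlockId {
    def name: String = "rdd_" + rddId + "_" + splitIndex
  }

  case class ShuffleBlockId(shuffleId: Int, mapId: Int, reduceId: Int) extends BlockId {
    def name: String = "shuffle_" + shuffleId + "_" + mapId + "_" + reduceId
  }

  case class ShuffleDataBlockId(shuffleId: Int, mapId: Int, reduceId: Int) extends BlockId {
    def name: String = "shuffle_" + shuffleId + "_" + mapId + "_" + reduceId + ".data"
  }

  case class ShuffleIndexBlockId(shuffleId: Int, mapId: Int, reduceId: Int) extends BlockId {
    def name: String = "shuffle_" + shuffleId + "_" + mapId + "_" + reduceId + ".index"
  }

  // The optional field defaults to the empty string and is appended only when set.
  case class BroadcastBlockId(broadcastId: Long, field: String = "") extends BlockId {
    def name: String = "broadcast_" + broadcastId + (if (field == "") "" else "_" + field)
  }

  case class TaskResultBlockId(taskId: Long) extends BlockId {
    def name: String = "taskresult_" + taskId
  }

  case class StreamBlockId(streamId: Int, uniqueId: Long) extends BlockId {
    def name: String = "input-" + streamId + "-" + uniqueId
  }

  def main(args: Array[String]): Unit = {
    val ids: Seq[BlockId] = Seq(
      RDDBlockId(1, 2),
      ShuffleBlockId(0, 3, 4),
      BroadcastBlockId(7, "piece0"),
      TaskResultBlockId(42L),
      StreamBlockId(1, 5L))
    ids.foreach(id => println(id.name))
  }
}
```

Because the name is derived purely from the fields, two IDs with the same fields always map to the same name, which is what lets a name serve as a lookup key for a block.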

Note
BlockManager is the only known implementation of the BlockDataManager contract in Apache Spark.