关注 spark技术分享,
撸spark源码 玩spark最佳实践

DiskBlockObjectWriter

DiskBlockObjectWriter

Whenever DiskBlockObjectWriter is requested to write a key-value pair, it makes sure that the underlying output streams are open.

DiskBlockObjectWriter can be in the following states (that match the state of the underlying output streams):

  1. Initialized

  2. Open

  3. Closed

Table 1. DiskBlockObjectWriter’s Internal Registries and Counters
Name Description

initialized

Internal flag…​FIXME

Used when…​FIXME

hasBeenClosed

Internal flag…​FIXME

Used when…​FIXME

streamOpen

Internal flag…​FIXME

Used when…​FIXME

objOut

FIXME

Used when…​FIXME

mcs

FIXME

Used when…​FIXME

bs

FIXME

Used when…​FIXME

objOut

FIXME

Used when…​FIXME

blockId

FIXME

Used when…​FIXME

Note
DiskBlockObjectWriter is a private[spark] class.

updateBytesWritten Method

Caution
FIXME

initialize Method

Caution
FIXME

Writing Bytes (From Byte Array Starting From Offset) — write Method

write…​FIXME

Caution
FIXME

recordWritten Method

Caution
FIXME

commitAndGet Method

Note
commitAndGet is used when…​FIXME

close Method

Caution
FIXME

Creating DiskBlockObjectWriter Instance

DiskBlockObjectWriter takes the following when created:

  1. file

  2. serializerManager — SerializerManager

  3. serializerInstance — SerializerInstance

  4. bufferSize

  5. syncWrites flag

  6. writeMetrics — ShuffleWriteMetrics

  7. blockId — BlockId

DiskBlockObjectWriter initializes the internal registries and counters.

Writing Key-Value Pair — write Method

Before writing, write opens the stream unless already open.

write then writes the key first followed by writing the value.

In the end, write recordWritten.

Note
write is used when BypassMergeSortShuffleWriter writes records and in ExternalAppendOnlyMap, ExternalSorter and WritablePartitionedPairCollection.

Opening DiskBlockObjectWriter — open Method

open opens DiskBlockObjectWriter, i.e. initializes and re-sets bs and objOut internal output streams.

Internally, open makes sure that DiskBlockObjectWriter is not closed (i.e. hasBeenClosed flag is disabled). If it was, open throws a IllegalStateException:

Unless DiskBlockObjectWriter has already been initialized (i.e. initialized flag is enabled), open initializes it (and turns initialized flag on).

Regardless of whether DiskBlockObjectWriter was already initialized or not, open requests SerializerManager to wrap mcs output stream for encryption and compression (for blockId) and sets it as bs.

Note
open uses SerializerManager that was specified when DiskBlockObjectWriter was created
Note
open uses SerializerInstance that was specified when DiskBlockObjectWriter was created

In the end, open turns streamOpen flag on.

Note
open is used exclusively when DiskBlockObjectWriter writes a key-value pair or bytes from a specified byte array but the stream is not open yet.
赞(0) 打赏
未经允许不得转载:spark技术分享 » DiskBlockObjectWriter
分享到: 更多 (0)

关注公众号:spark技术分享

联系我们联系我们

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

微信扫一扫打赏