ExternalAppendOnlyUnsafeRowArray — Append-Only Array for UnsafeRows (with Disk Spill Threshold)

ExternalAppendOnlyUnsafeRowArray is an append-only array for UnsafeRows that spills content to disk when a predefined spill threshold of rows is reached.

Note
Choosing a proper spill threshold of rows is a performance optimization.
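The spill-on-threshold idea can be illustrated with a small standalone sketch. This is a toy model, not Spark's code: the class name SpillingAppendOnlyArray is hypothetical, plain byte arrays stand in for UnsafeRows, and a temporary file stands in for the UnsafeExternalSorter that the real class uses. It only shows rows staying in memory until numRowsSpillThreshold is reached and going to disk afterwards.

```scala
import java.io.{BufferedInputStream, BufferedOutputStream, DataInputStream, DataOutputStream, File, FileInputStream, FileOutputStream}
import scala.collection.mutable.ArrayBuffer

// Hypothetical toy model of an append-only array with a row-count spill threshold.
// Byte arrays stand in for UnsafeRows; a temp file stands in for the disk-backed sorter.
final class SpillingAppendOnlyArray(numRowsSpillThreshold: Int) {
  private val inMemoryBuffer = new ArrayBuffer[Array[Byte]]()
  private var spillFile: Option[File] = None
  private var spillOut: Option[DataOutputStream] = None
  private var numRows = 0

  def length: Int = numRows
  def isSpilled: Boolean = spillFile.isDefined

  def add(row: Array[Byte]): Unit = {
    if (spillOut.isEmpty && numRows < numRowsSpillThreshold) {
      // Below the threshold: keep a copy of the row in memory.
      inMemoryBuffer += row.clone()
    } else {
      if (spillOut.isEmpty) {
        // Threshold reached: move everything buffered so far to disk.
        val file = File.createTempFile("spill", ".bin")
        file.deleteOnExit()
        val out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(file)))
        inMemoryBuffer.foreach(write(out, _))
        inMemoryBuffer.clear()
        spillFile = Some(file)
        spillOut = Some(out)
      }
      write(spillOut.get, row)
    }
    numRows += 1
  }

  // Length-prefixed record format so rows can be read back one by one.
  private def write(out: DataOutputStream, row: Array[Byte]): Unit = {
    out.writeInt(row.length)
    out.write(row)
  }

  // Returns rows in insertion order, from memory or from the spill file.
  def iterator: Iterator[Array[Byte]] = spillFile match {
    case None => inMemoryBuffer.iterator
    case Some(file) =>
      spillOut.foreach(_.flush())
      val in = new DataInputStream(new BufferedInputStream(new FileInputStream(file)))
      new Iterator[Array[Byte]] {
        private var remaining = numRows
        def hasNext: Boolean = remaining > 0
        def next(): Array[Byte] = {
          val bytes = new Array[Byte](in.readInt())
          in.readFully(bytes)
          remaining -= 1
          if (remaining == 0) in.close() // release the file handle after the last row
          bytes
        }
      }
  }

  // Drops all rows and removes the spill file, if any.
  def clear(): Unit = {
    spillOut.foreach(_.close())
    spillFile.foreach(_.delete())
    spillOut = None
    spillFile = None
    inMemoryBuffer.clear()
    numRows = 0
  }
}
```

With a threshold of 2, adding three rows keeps the first two in memory and moves all three to disk when the third arrives. A low threshold trades memory for disk I/O and a high one does the opposite, which is why the threshold matters for performance.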

ExternalAppendOnlyUnsafeRowArray is created when:

  • WindowExec physical operator is executed (and creates an internal buffer for window frames)

  • WindowFunctionFrame is prepared

  • SortMergeJoinExec physical operator is executed (and creates a RowIterator for INNER and CROSS joins) and for getBufferedMatches

  • SortMergeJoinScanner creates an internal bufferedMatches

  • UnsafeCartesianRDD is computed

Table 1. ExternalAppendOnlyUnsafeRowArray’s Internal Registries and Counters

Name | Description
initialSizeOfInMemoryBuffer | FIXME. Used when…FIXME
inMemoryBuffer | FIXME. Can grow up to numRowsSpillThreshold rows (i.e. new UnsafeRows are added). Used when…FIXME
spillableArray | UnsafeExternalSorter. Used when…FIXME
numRows | Used when…FIXME
modificationsCount | Used when…FIXME
numFieldsPerRow | Used when…FIXME

Tip

Enable INFO logging level for org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray logger to see what happens inside.

Add the following line to conf/log4j.properties:
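
Assuming conf/log4j.properties uses the Log4j 1.x properties syntax (logger name after the log4j.logger. prefix, level after the equals sign), the line for this logger would be:

```properties
log4j.logger.org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray=INFO
```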

Refer to Logging.

generateIterator Method

Caution
FIXME

add Method

Caution
FIXME
Note

add is used when:

clear Method

Caution
FIXME

Creating ExternalAppendOnlyUnsafeRowArray Instance

ExternalAppendOnlyUnsafeRowArray takes the following when created:

ExternalAppendOnlyUnsafeRowArray initializes the internal registries and counters.
