关注 spark技术分享,
撸spark源码 玩spark最佳实践

UnsafeFixedWidthAggregationMap

UnsafeFixedWidthAggregationMap

UnsafeFixedWidthAggregationMap is a tiny layer (extension) around Spark Core’s BytesToBytesMap to allow for UnsafeRow keys and values.

Whenever requested for performance metrics (i.e. average number of probes per key lookup and peak memory used), UnsafeFixedWidthAggregationMap simply requests the underlying BytesToBytesMap.

UnsafeFixedWidthAggregationMap is created when:

Table 1. UnsafeFixedWidthAggregationMap’s Internal Properties (e.g. Registries, Counters and Flags)
Name Description

currentAggregationBuffer

Re-used pointer (as an UnsafeRow with the number of fields to match the aggregationBufferSchema) to the current aggregation buffer

Used exclusively when UnsafeFixedWidthAggregationMap is requested to getAggregationBufferFromUnsafeRow.

emptyAggregationBuffer

Empty aggregation buffer (encoded in UnsafeRow format)

groupingKeyProjection

UnsafeProjection for the groupingKeySchema (to encode grouping keys as UnsafeRows)

map

Spark Core’s BytesToBytesMap with the taskMemoryManager, initialCapacity, pageSizeBytes and performance metrics enabled

supportsAggregationBufferSchema Static Method

supportsAggregationBufferSchema is a predicate that is enabled (true) unless there is a field (in the fields of the input schema) whose data type is not mutable.

Note
supportsAggregationBufferSchema is used exclusively when HashAggregateExec is requested to supportsAggregate.

Creating UnsafeFixedWidthAggregationMap Instance

UnsafeFixedWidthAggregationMap takes the following when created:

  • Empty aggregation buffer (as an InternalRow)

  • Aggregation buffer schema

  • Grouping key schema

  • Spark Core’s TaskMemoryManager

  • Initial capacity

  • Page size (in bytes)

UnsafeFixedWidthAggregationMap initializes the internal registries and counters.

getAggregationBufferFromUnsafeRow Method

  1. Uses the hash code of the key

getAggregationBufferFromUnsafeRow requests the BytesToBytesMap to lookup the input key (to get a BytesToBytesMap.Location).

getAggregationBufferFromUnsafeRow…​FIXME

Note

getAggregationBufferFromUnsafeRow is used when:

  • TungstenAggregationIterator is requested to processInputs (exclusively when TungstenAggregationIterator is created)

  • (for testing only) UnsafeFixedWidthAggregationMap is requested to getAggregationBuffer

getAggregationBuffer Method

getAggregationBuffer…​FIXME

Note
getAggregationBuffer seems to be used exclusively for testing.

Getting KVIterator — iterator Method

iterator…​FIXME

Note

iterator is used when:

  • HashAggregateExec physical operator is requested to finishAggregate

  • TungstenAggregationIterator is created (and pre-loads the first key-value pair from the map)

getPeakMemoryUsedBytes Method

getPeakMemoryUsedBytes…​FIXME

Note

getPeakMemoryUsedBytes is used when:

getAverageProbesPerLookup Method

getAverageProbesPerLookup…​FIXME

Note

getAverageProbesPerLookup is used when:

free Method

free…​FIXME

Note

free is used when:

destructAndCreateExternalSorter Method

destructAndCreateExternalSorter…​FIXME

Note

destructAndCreateExternalSorter is used when:

赞(0) 打赏
未经允许不得转载:spark技术分享 » UnsafeFixedWidthAggregationMap
分享到: 更多 (0)

关注公众号:spark技术分享

联系我们联系我们

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

微信扫一扫打赏