UnifiedMemoryManager — Spark’s Memory Manager

UnifiedMemoryManager is the default MemoryManager with onHeapStorageMemory being ??? and onHeapExecutionMemory being ???

Calculate Maximum Memory to Use — getMaxMemory Method

getMaxMemory calculates the maximum memory to use for execution and storage.

getMaxMemory reads the maximum amount of memory that the Java virtual machine will attempt to use (the system memory) and subtracts the reserved system memory, which is set aside for non-storage and non-execution purposes.

getMaxMemory makes sure that the following requirements are met:

  1. System memory is at least 1.5 times the reserved system memory.

  2. spark.executor.memory is at least 1.5 times the reserved system memory.

Ultimately, getMaxMemory returns spark.memory.fraction of the maximum amount of memory for the JVM (minus the reserved system memory).
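The calculation described above can be sketched in a few lines of Python. This is a minimal model, not Spark's code: the function name get_max_memory and its parameters are hypothetical, the defaults (300M reserved, 0.6 fraction) are taken from the Settings table, and both rounding the 1.5 threshold up and treating spark.executor.memory as optional are assumptions.

```python
import math

MIB = 1024 * 1024


def get_max_memory(system_memory, reserved_memory=300 * MIB,
                   memory_fraction=0.6, executor_memory=None):
    """Model of the getMaxMemory calculation (all sizes in bytes)."""
    # Both checks compare against 1.5 times the reserved system memory
    # (rounding up is an assumption of this sketch).
    min_system_memory = math.ceil(reserved_memory * 1.5)
    if system_memory < min_system_memory:
        raise ValueError(
            f"System memory {system_memory} must be at least {min_system_memory}")
    if executor_memory is not None and executor_memory < min_system_memory:
        raise ValueError(
            f"Executor memory {executor_memory} must be at least {min_system_memory}")
    # Memory left after the reservation, scaled by spark.memory.fraction.
    usable_memory = system_memory - reserved_memory
    return int(usable_memory * memory_fraction)


# A 1 GiB JVM with the default settings.
max_memory = get_max_memory(1024 * MIB)
print(max_memory)
```

With a 1 GiB heap, 724 MiB remain after the reservation, and 60% of that is shared by execution and storage.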

Caution
FIXME omnigraffle it.

Creating UnifiedMemoryManager Instance

UnifiedMemoryManager requires a SparkConf and the following values:

  • maxHeapMemory — the maximum on-heap memory to manage. It is assumed that the sum of the onHeapExecutionMemoryPool and onHeapStorageMemoryPool pool sizes is exactly maxHeapMemory.

  • onHeapStorageRegionSize

  • numCores

UnifiedMemoryManager makes sure that the sum of offHeapExecutionMemoryPool and offHeapStorageMemoryPool pool sizes is exactly maxOffHeapMemory.

Caution
FIXME Describe the pools

apply Factory Method

Internally, apply calculates the maximum memory to use (given conf). It then creates a UnifiedMemoryManager with the following values:

  1. maxHeapMemory being the maximum memory just calculated.

  2. onHeapStorageRegionSize being spark.memory.storageFraction of maximum memory.

  3. numCores as configured.
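The three values that apply passes on can be modelled as follows. This is a minimal sketch under stated assumptions: apply_sketch and ManagerArgs are hypothetical names, the maximum memory is passed in directly rather than computed from a configuration, and the 0.5 default for spark.memory.storageFraction comes from the Settings table.

```python
from collections import namedtuple

# Hypothetical container for the constructor values listed above.
ManagerArgs = namedtuple(
    "ManagerArgs",
    ["max_heap_memory", "on_heap_storage_region_size", "num_cores"])


def apply_sketch(max_memory, num_cores, storage_fraction=0.5):
    """Derive the constructor values from the maximum memory (in bytes)."""
    return ManagerArgs(
        max_heap_memory=max_memory,
        # The storage region is spark.memory.storageFraction of the maximum memory.
        on_heap_storage_region_size=int(max_memory * storage_fraction),
        num_cores=num_cores)


args = apply_sketch(max_memory=1000, num_cores=4)
print(args)
```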

Note
apply is used when SparkEnv is created.

acquireStorageMemory Method

Note
acquireStorageMemory is part of the MemoryManager Contract to…​FIXME

acquireStorageMemory has two modes of operation, selected by memoryMode (MemoryMode.ON_HEAP or MemoryMode.OFF_HEAP). The mode determines which execution pool, which storage pool, and which maximum amount of memory are used.

Caution
FIXME Where are they used?

In MemoryMode.ON_HEAP, onHeapExecutionMemoryPool, onHeapStorageMemoryPool, and maxOnHeapStorageMemory are used.

In MemoryMode.OFF_HEAP, offHeapExecutionMemoryPool, offHeapStorageMemoryPool, and maxOffHeapMemory are used.

Caution
FIXME What is the difference between them?

acquireStorageMemory first makes sure that the requested number of bytes numBytes (for a block to store) fits the available memory. If it does not, an INFO message is written to the logs and the method returns false.

If the requested number of bytes numBytes is greater than memoryFree in the storage pool, acquireStorageMemory will attempt to use the free memory from the execution pool.

Note
The storage pool can use the free memory from the execution pool.

It takes as much of memoryFree in the execution pool as is required to fit numBytes (up to all of the free memory in that pool).

Ultimately, acquireStorageMemory requests the storage pool for numBytes for blockId.
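The steps above can be sketched with a toy pool model. Everything here is hypothetical and simplified: Pool and acquire_storage_memory are illustrative names, the amount borrowed is assumed to be only the shortfall in the storage pool (capped by the execution pool's free memory), and the final request to the storage pool is reduced to a plain capacity check with no logging and no block handling.

```python
class Pool:
    """Toy memory pool that tracks its size and how much of it is in use."""

    def __init__(self, pool_size, memory_used=0):
        self.pool_size = pool_size
        self.memory_used = memory_used

    @property
    def memory_free(self):
        return self.pool_size - self.memory_used


def acquire_storage_memory(num_bytes, execution_pool, storage_pool,
                           max_storage_memory):
    """Model of the storage-memory acquisition steps (sizes in bytes)."""
    # Step 1: the request must fit the maximum memory available for storage.
    if num_bytes > max_storage_memory:
        return False
    # Step 2: if the storage pool is short, borrow free execution memory
    # by shrinking one pool and growing the other by the same amount.
    if num_bytes > storage_pool.memory_free:
        borrowed = min(execution_pool.memory_free,
                       num_bytes - storage_pool.memory_free)
        execution_pool.pool_size -= borrowed
        storage_pool.pool_size += borrowed
    # Step 3: ask the storage pool for the memory (simplified to a check).
    if num_bytes > storage_pool.memory_free:
        return False
    storage_pool.memory_used += num_bytes
    return True


execution = Pool(pool_size=600, memory_used=100)
storage = Pool(pool_size=400, memory_used=300)
granted = acquire_storage_memory(250, execution, storage,
                                 max_storage_memory=900)
print(granted, execution.pool_size, storage.pool_size)
```

In this run the storage pool has only 100 bytes free, so 150 bytes move over from the execution pool before the request is granted.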

Note

acquireStorageMemory is used when MemoryStore acquires storage memory to putBytes or putIteratorAsValues and putIteratorAsBytes.

It is also used internally when UnifiedMemoryManager acquires unroll memory.

acquireUnrollMemory Method

Note
acquireUnrollMemory is part of the MemoryManager Contract.

acquireUnrollMemory simply forwards all the calls to acquireStorageMemory.

acquireExecutionMemory Method

acquireExecutionMemory does…​FIXME

Internally, acquireExecutionMemory varies per MemoryMode, i.e. ON_HEAP and OFF_HEAP.

Table 1. acquireExecutionMemory and MemoryMode

|                   | ON_HEAP                   | OFF_HEAP                   |
|-------------------|---------------------------|----------------------------|
| executionPool     | onHeapExecutionMemoryPool | offHeapExecutionMemoryPool |
| storagePool       | onHeapStorageMemoryPool   | offHeapStorageMemoryPool   |
| storageRegionSize | onHeapStorageRegionSize   | offHeapStorageMemory       |
| maxMemory         | maxHeapMemory             | maxOffHeapMemory           |

Note
acquireExecutionMemory is part of the MemoryManager Contract.

Caution
FIXME

maxOnHeapStorageMemory Method

maxOnHeapStorageMemory is the difference between the maxHeapMemory of the UnifiedMemoryManager and the memory currently in use in the onHeapExecutionMemoryPool execution memory pool.

Note
maxOnHeapStorageMemory is part of the MemoryManager Contract.
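As a worked example of that difference, here is a minimal sketch. The function name and its parameters are hypothetical and the byte counts are made up for illustration.

```python
def max_on_heap_storage_memory(max_heap_memory, execution_memory_used):
    """Storage may use whatever on-heap memory execution is not using now."""
    return max_heap_memory - execution_memory_used


# With 1000 bytes managed and 250 bytes held by execution,
# storage can grow to at most 750 bytes.
limit = max_on_heap_storage_memory(max_heap_memory=1000,
                                   execution_memory_used=250)
print(limit)
```

The limit changes as execution acquires and releases memory, so it has to be recomputed on each call rather than cached.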

Settings

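Table 2. Spark Properties

| Spark Property               | Default Value                          | Description                                                |
|------------------------------|----------------------------------------|------------------------------------------------------------|
| spark.memory.fraction        | 0.6                                    | Fraction of JVM heap space used for execution and storage. |
| spark.memory.storageFraction | 0.5                                    |                                                            |
| spark.testing.memory         | Java’s Runtime.getRuntime.maxMemory    | System memory                                              |
| spark.testing.reservedMemory | 300M or 0 (with spark.testing enabled) |                                                            |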
