CacheManager — In-Memory Cache for Tables and Views

CacheManager is an in-memory cache for tables and views (as logical plans). It uses the internal cachedData collection of CachedData to track logical plans and their cached InMemoryRelation representation.

CacheManager is shared across SparkSessions through SharedState.

Note
A Spark developer can use CacheManager to cache Datasets using cache or persist operators.
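
As a rough illustration of the note above, the following sketch caches a Dataset and then asks the shared CacheManager whether it knows about it. It assumes Spark 2.x or 3.x on the classpath (for example inside spark-shell), where sharedState and cacheManager are reachable but marked unstable or internal. The object name CacheDemo and the method run are made up for this example.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical demo object; not part of Spark.
object CacheDemo {
  // Returns (registered after cache, still registered after unpersist,
  //          a new session sees the same CacheManager instance)
  def run(spark: SparkSession): (Boolean, Boolean, Boolean) = {
    // Internal/unstable API: the CacheManager lives in SharedState
    val cacheManager = spark.sharedState.cacheManager

    val ds = spark.range(5)

    // cache (like persist) registers the Dataset's plan with CacheManager
    ds.cache()
    val registered = cacheManager.lookupCachedData(ds).isDefined

    // unpersist removes the entry again
    ds.unpersist()
    val stillRegistered = cacheManager.lookupCachedData(ds).isDefined

    // SharedState (and so CacheManager) is shared by sessions created with newSession
    val sameManager = spark.newSession().sharedState.cacheManager eq cacheManager

    (registered, stillRegistered, sameManager)
  }
}
```

In spark-shell, CacheDemo.run(spark) can be called with the predefined spark session.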

Cached Queries — cachedData Internal Registry

cachedData is a collection of CachedData with logical plans and their cached InMemoryRelation representation.

cachedData is cleared when…​FIXME

invalidateCachedPath Method

Caution
FIXME

invalidateCache Method

Caution
FIXME

lookupCachedData Method

Caution
FIXME

uncacheQuery Method

Caution
FIXME

isEmpty Method

Caution
FIXME

Caching Dataset (Registering Analyzed Logical Plan as InMemoryRelation) — cacheQuery Method

cacheQuery adds the analyzed logical plan of the input query to the cachedData internal registry of cached queries.

Internally, cacheQuery first requests the input query for the analyzed logical plan and creates an InMemoryRelation for it.

cacheQuery then creates a CachedData (for the analyzed query plan and the InMemoryRelation) and adds it to the cachedData internal registry.

If the input query has already been cached, cacheQuery simply prints a WARN message to the logs and exits (i.e. it does nothing beyond printing the message).
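
The cacheQuery flow described above can be sketched with a toy model in plain Scala. This is not Spark's code: Plan, Relation, CachedEntry and ToyCacheManager are hypothetical stand-ins for the logical plan, InMemoryRelation, CachedData and CacheManager, plan matching is reduced to plain equality, and the warning text is invented for the example.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-ins for Spark's analyzed logical plan, InMemoryRelation and CachedData
final case class Plan(description: String)
final case class Relation(plan: Plan)
final case class CachedEntry(plan: Plan, cachedRepresentation: Relation)

class ToyCacheManager {
  // Registry of cached queries (plays the role of cachedData)
  private val cachedData = ArrayBuffer.empty[CachedEntry]

  // Collected here instead of going to a logger, so the behavior is easy to inspect
  val warnings: ArrayBuffer[String] = ArrayBuffer.empty[String]

  // Assumption: plans match by equality; the real lookup may be more lenient
  def lookupCachedData(plan: Plan): Option[CachedEntry] =
    cachedData.find(_.plan == plan)

  def cacheQuery(analyzed: Plan): Unit =
    if (lookupCachedData(analyzed).nonEmpty) {
      // Already cached: warn and do nothing else (message text is made up)
      warnings += s"WARN already cached: ${analyzed.description}"
    } else {
      // Pair the analyzed plan with its in-memory representation and register it
      cachedData += CachedEntry(analyzed, Relation(analyzed))
    }

  def isEmpty: Boolean = cachedData.isEmpty
  def size: Int = cachedData.size
}
```

Calling cacheQuery twice with the same plan leaves a single entry in the registry and records one warning.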

Note
cacheQuery is used, for example, when a Dataset is cached using the cache or persist operators.

Removing All Cached Tables From In-Memory Cache — clearCache Method

clearCache acquires a write lock and unpersists RDD[CachedBatch]s of the queries in cachedData before removing them altogether.
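
The locking pattern described above can be sketched in plain Scala as follows. It is a simplified model under stated assumptions, not Spark's implementation: ToyBatches stands in for a cached RDD[CachedBatch], ToyRegistry for CacheManager, and a ReentrantReadWriteLock is assumed as the lock.

```scala
import java.util.concurrent.locks.ReentrantReadWriteLock
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a cached RDD[CachedBatch]
final class ToyBatches(val name: String) {
  @volatile var persisted: Boolean = true
  def unpersist(): Unit = persisted = false
}

class ToyRegistry {
  private val lock = new ReentrantReadWriteLock
  private val cachedData = ArrayBuffer.empty[ToyBatches]

  // Runs body while holding the exclusive write lock
  private def writeLock[A](body: => A): A = {
    val l = lock.writeLock()
    l.lock()
    try body finally l.unlock()
  }

  // Runs body while holding the shared read lock
  private def readLock[A](body: => A): A = {
    val l = lock.readLock()
    l.lock()
    try body finally l.unlock()
  }

  def add(batches: ToyBatches): Unit = writeLock { cachedData += batches; () }

  def isEmpty: Boolean = readLock(cachedData.isEmpty)

  // Unpersist every cached entry first, then drop all entries from the registry
  def clearCache(): Unit = writeLock {
    cachedData.foreach(_.unpersist())
    cachedData.clear()
  }
}
```

Holding the write lock for both steps means no reader can observe a registry entry whose data has already been unpersisted.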

Note
clearCache is used when the CatalogImpl is requested to clearCache.

CachedData

Caution
FIXME

recacheByCondition Internal Method

recacheByCondition…​FIXME

Note
recacheByCondition is used when CacheManager is requested to recacheByPlan or recacheByPath.

recacheByPlan Method

recacheByPlan…​FIXME

Note
recacheByPlan is used exclusively when InsertIntoDataSourceCommand logical command is executed.

recacheByPath Method

recacheByPath…​FIXME

Note
recacheByPath is used exclusively when CatalogImpl is requested to refreshByPath.

Replacing Logical Query Segments With Cached Query Plans — useCachedData Method

useCachedData…​FIXME

Note
useCachedData is used exclusively when QueryExecution is requested for a cached logical query plan.