SessionCatalog — Session-Scoped Catalog of Relational Entities

SessionCatalog is the catalog (registry) of relational entities, i.e. databases, tables, views, partitions, and functions (in a SparkSession).

Figure 1. SessionCatalog and Spark SQL Services

SessionCatalog uses the ExternalCatalog for the metadata of permanent entities (i.e. tables).

Note
SessionCatalog is a layer over ExternalCatalog in a SparkSession, which allows different metastores (i.e. in-memory or Hive) to be used.

SessionCatalog is available through SessionState (of a SparkSession).

SessionCatalog is created when BaseSessionStateBuilder is requested for the SessionCatalog (when SessionState is requested for it).

Notable uses of SessionCatalog include creating an Analyzer and a SparkOptimizer.

Table 1. SessionCatalog’s Internal Properties (e.g. Registries, Counters and Flags)

Name | Description
currentDb | FIXME. Used when…​FIXME
tableRelationCache | A cache of fully-qualified table names to table relation plans (i.e. LogicalPlan). Used when SessionCatalog refreshes a table
tempViews | Registry of temporary views (i.e. non-global temporary tables)

requireTableExists Internal Method

requireTableExists…​FIXME

Note
requireTableExists is used when…​FIXME

databaseExists Method

databaseExists…​FIXME

Note
databaseExists is used when…​FIXME

listTables Method

The listTables variant that takes only a database name uses "*" as the pattern.

listTables…​FIXME

Note

listTables is used when:

  • ShowTablesCommand logical command is requested to run

  • SessionCatalog is requested to reset (for testing)

  • CatalogImpl is requested to listTables (for testing)

Checking Whether Table Is Temporary View — isTemporaryTable Method

isTemporaryTable…​FIXME

Note
isTemporaryTable is used when…​FIXME

alterPartitions Method

alterPartitions…​FIXME

Note
alterPartitions is used when…​FIXME

listPartitions Method

listPartitions…​FIXME

Note
listPartitions is used when…​FIXME

alterTable Method

alterTable…​FIXME

Note
alterTable is used when the following logical commands are executed:

  • AlterTableSetPropertiesCommand

  • AlterTableUnsetPropertiesCommand

  • AlterTableChangeColumnCommand

  • AlterTableSerDePropertiesCommand

  • AlterTableRecoverPartitionsCommand

  • AlterTableSetLocationCommand

  • AlterViewAsCommand (for permanent views)

Altering Table Statistics in Metastore (and Invalidating Internal Cache) — alterTableStats Method

alterTableStats requests ExternalCatalog to alter the statistics of the table (per identifier) followed by invalidating the table relation cache.

alterTableStats reports a NoSuchDatabaseException if the database does not exist.

alterTableStats reports a NoSuchTableException if the table does not exist.
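The order of operations described above (existence checks, metastore update, cache invalidation) can be sketched in plain Python. This is a toy model under stated assumptions, not Spark code: ToyExternalCatalog, ToySessionCatalog, the dict-based cache and the statistics dict are all hypothetical stand-ins.

```python
class NoSuchDatabaseException(Exception):
    pass


class NoSuchTableException(Exception):
    pass


class ToyExternalCatalog:
    """Hypothetical stand-in for ExternalCatalog: {database: {table: stats}}."""

    def __init__(self):
        self.databases = {"default": {"t1": None}}

    def alter_table_stats(self, db, table, stats):
        self.databases[db][table] = stats


class ToySessionCatalog:
    """Hypothetical stand-in for SessionCatalog (only what the sketch needs)."""

    def __init__(self, external_catalog):
        self.external_catalog = external_catalog
        # Models tableRelationCache: (database, table) -> cached relation plan
        self.table_relation_cache = {}

    def alter_table_stats(self, db, table, new_stats):
        # 1. Both the database and the table have to exist
        if db not in self.external_catalog.databases:
            raise NoSuchDatabaseException(db)
        if table not in self.external_catalog.databases[db]:
            raise NoSuchTableException(f"{db}.{table}")
        # 2. Persist the new statistics in the metastore
        self.external_catalog.alter_table_stats(db, table, new_stats)
        # 3. Invalidate the cached plan so later lookups see the new statistics
        self.table_relation_cache.pop((db, table), None)


catalog = ToySessionCatalog(ToyExternalCatalog())
catalog.table_relation_cache[("default", "t1")] = "cached plan"
catalog.alter_table_stats("default", "t1", {"sizeInBytes": 1024})
```

Invalidating the cache after the metastore update, rather than before, keeps a concurrent lookup from re-caching a plan built on the old statistics in this simplified model.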

Note

alterTableStats is used when the following logical commands are executed:

tableExists Method

tableExists…​FIXME

Note
tableExists is used when…​FIXME

functionExists Method

functionExists…​FIXME

Note

functionExists is used in:

listFunctions Method

listFunctions…​FIXME

Note
listFunctions is used when…​FIXME

Invalidating Table Relation Cache (aka Refreshing Table) — refreshTable Method

refreshTable…​FIXME

Note
refreshTable is used when…​FIXME

loadFunctionResources Method

loadFunctionResources…​FIXME

Note
loadFunctionResources is used when…​FIXME

Altering (Updating) Temporary View (Logical Plan) — alterTempViewDefinition Method

alterTempViewDefinition alters the temporary view by updating an in-memory temporary table (when a database is not specified and the table has already been registered) or a global temporary table (when a database is specified and it is for global temporary tables).

Note
“Temporary table” and “temporary view” are synonyms.

alterTempViewDefinition returns true when an update could be executed and finished successfully.
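The branching can be illustrated with a small pure-Python sketch. It is a toy model only: the two dicts stand in for the registry of temporary views and for GlobalTempViewManager, the "global_temp" database name is an assumption, and so is the rule that a global temporary view is updated only if it already exists.

```python
GLOBAL_TEMP_DB = "global_temp"  # assumed name of the global temporary database


def alter_temp_view_definition(temp_views, global_temp_views, name, plan, database=None):
    """Return True only if an existing temporary view was updated."""
    if database is None:
        # Local temporary view: update only if it has already been registered
        if name in temp_views:
            temp_views[name] = plan
            return True
        return False
    if database == GLOBAL_TEMP_DB:
        # Global temporary view (assumption: only existing views are updated)
        if name in global_temp_views:
            global_temp_views[name] = plan
            return True
        return False
    # Any other database: not a temporary view, so nothing is updated
    return False


temp_views = {"v": "old plan"}
global_temp_views = {"g": "old plan"}

updated_local = alter_temp_view_definition(temp_views, global_temp_views, "v", "new plan")
updated_global = alter_temp_view_definition(
    temp_views, global_temp_views, "g", "new plan", GLOBAL_TEMP_DB)
not_temporary = alter_temp_view_definition(
    temp_views, global_temp_views, "v", "other plan", "default")
```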

Note
alterTempViewDefinition is used exclusively when AlterViewAsCommand logical command is executed.

Creating (Registering) Or Replacing Local Temporary View — createTempView Method

createTempView…​FIXME

Note
createTempView is used when…​FIXME

Creating (Registering) Or Replacing Global Temporary View — createGlobalTempView Method

createGlobalTempView simply requests the GlobalTempViewManager to create a global temporary view.

Note

createGlobalTempView is used when:

  • CreateViewCommand logical command is executed (for a global temporary view, i.e. when the view type is GlobalTempView)

  • CreateTempViewUsing logical command is executed (for a global temporary view, i.e. when the global flag is on)

createTable Method

createTable…​FIXME

Note
createTable is used when…​FIXME

Creating SessionCatalog Instance

SessionCatalog takes the following when created:

SessionCatalog initializes the internal registries and counters.

Finding Function by Name (Using FunctionRegistry) — lookupFunction Method

lookupFunction finds a function by name.

For a function with no database defined that exists in FunctionRegistry, lookupFunction requests FunctionRegistry to find the function (by its unqualified name, i.e. with no database).

If the function name has a database defined or the unqualified name does not exist in FunctionRegistry, lookupFunction checks whether FunctionRegistry contains the function under its fully-qualified name (i.e. with a database).

For other cases, lookupFunction requests ExternalCatalog to find the function and loads its resources. It then creates a corresponding temporary function and looks up the function again.
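The three resolution steps can be sketched as follows. This is a simplified pure-Python model, not the Spark implementation: the registry and the external catalog are plain dicts keyed by (database, name), resource loading is reduced to a comment, and falling back to a current_db of "default" for unqualified names is an assumption of the sketch.

```python
def lookup_function(registry, external_functions, name, database=None, current_db="default"):
    # 1. No database given and the unqualified name is registered
    #    (e.g. a built-in or temporary function)
    if database is None and (None, name) in registry:
        return registry[(None, name)]

    # 2. The fully-qualified name is already registered
    qualified = (database if database is not None else current_db, name)
    if qualified in registry:
        return registry[qualified]

    # 3. Find the function in the external catalog (a KeyError models
    #    "function not found"); loading its resources would happen here.
    func = external_functions[qualified]
    # Register it under the qualified name, then look it up again
    registry[qualified] = func
    return registry[qualified]


registry = {(None, "upper"): str.upper}
external_functions = {("default", "my_len"): len}

builtin = lookup_function(registry, external_functions, "upper")
permanent = lookup_function(registry, external_functions, "my_len")
```

After the first lookup of my_len, the function sits in the registry under its qualified name, so a second lookup is served by step 2 without touching the external catalog.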

Note

lookupFunction is used when:

Finding Relation (Table or View) in Catalogs — lookupRelation Method

lookupRelation finds the table by name in the catalogs (i.e. GlobalTempViewManager, ExternalCatalog or registry of temporary views) and gives a SubqueryAlias per table type.

Internally, lookupRelation looks up the table using:

  1. GlobalTempViewManager when the database name of the table matches the name of GlobalTempViewManager

    1. Gives SubqueryAlias or reports a NoSuchTableException

  2. ExternalCatalog when the database name of the table is specified explicitly or the registry of temporary views does not contain the table

    1. Gives SubqueryAlias with View when the table is a view (i.e. its table type is VIEW)

    2. Gives SubqueryAlias with UnresolvedCatalogRelation otherwise

  3. The registry of temporary views

    1. Gives SubqueryAlias with the logical plan per the table as registered in the registry of temporary views

Note
lookupRelation uses the current database (default, unless changed) when the table name does not specify the database explicitly.
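The lookup order can be summarized in a short pure-Python sketch. It is a toy model under stated assumptions: the three catalogs are plain dicts, logical plans are modeled as tuples, and the "global_temp" database name and the current_db parameter are hypothetical stand-ins.

```python
GLOBAL_TEMP_DB = "global_temp"  # assumed name of the global temporary database


class NoSuchTableException(Exception):
    pass


def lookup_relation(table, database, global_temp_views, external_tables, temp_views,
                    current_db="default"):
    db = database if database is not None else current_db

    # 1. The database is the one managed by GlobalTempViewManager
    if db == GLOBAL_TEMP_DB:
        if table not in global_temp_views:
            raise NoSuchTableException(f"{db}.{table}")
        return ("SubqueryAlias", table, global_temp_views[table])

    # 2. Explicit database, or no local temporary view of that name
    if database is not None or table not in temp_views:
        metadata = external_tables[(db, table)]
        if metadata["tableType"] == "VIEW":
            return ("SubqueryAlias", table, ("View", metadata))
        return ("SubqueryAlias", table, ("UnresolvedCatalogRelation", metadata))

    # 3. Local temporary view
    return ("SubqueryAlias", table, temp_views[table])


global_temp_views = {"gv": "global view plan"}
external_tables = {
    ("default", "t"): {"tableType": "MANAGED"},
    ("default", "pv"): {"tableType": "VIEW"},
}
temp_views = {"t": "temp view plan"}

# An unqualified name prefers the local temporary view over the table "t"
unqualified = lookup_relation("t", None, global_temp_views, external_tables, temp_views)
# Qualifying the name with a database bypasses the temporary view
qualified = lookup_relation("t", "default", global_temp_views, external_tables, temp_views)
```

The two calls at the end show the practical consequence of the ordering: a local temporary view shadows a permanent table of the same name unless the database is given explicitly.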
Note

lookupRelation is used when:

Retrieving Table Metadata from External Catalog (Metastore) — getTableMetadata Method

getTableMetadata simply requests the external catalog (metastore) for the table metadata.

Before requesting the external metastore, getTableMetadata makes sure that the database and table (of the input TableIdentifier) both exist. If either does not exist, getTableMetadata reports a NoSuchDatabaseException or NoSuchTableException, respectively.

Retrieving Table Metadata — getTempViewOrPermanentTableMetadata Method

Internally, getTempViewOrPermanentTableMetadata branches off per database.

When a database name is not specified, getTempViewOrPermanentTableMetadata finds a local temporary view and creates a CatalogTable (with VIEW table type and an undefined storage) or retrieves the table metadata from an external catalog.

With the database name of the GlobalTempViewManager, getTempViewOrPermanentTableMetadata requests GlobalTempViewManager for the global view definition and creates a CatalogTable (with the name of GlobalTempViewManager in table identifier, VIEW table type and an undefined storage) or reports a NoSuchTableException.

With any other database name, getTempViewOrPermanentTableMetadata simply retrieves the table metadata from an external catalog.
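The three branches can be put side by side in a minimal pure-Python sketch. Everything in it is a hypothetical stand-in: the catalogs are dicts, the metadata dict is a heavily reduced model of a CatalogTable, and the "global_temp" database name and the current_db fallback are assumptions of the sketch.

```python
GLOBAL_TEMP_DB = "global_temp"  # assumed name of the global temporary database


class NoSuchTableException(Exception):
    pass


def view_metadata(database, table):
    # Reduced model of the metadata built for a temporary view:
    # VIEW table type and an undefined storage
    return {"identifier": (database, table), "tableType": "VIEW", "storage": None}


def get_temp_view_or_permanent_table_metadata(
        table, database, temp_views, global_temp_views, external_tables,
        current_db="default"):
    if database is None:
        # Local temporary view first, then the external catalog
        if table in temp_views:
            return view_metadata(None, table)
        return external_tables[(current_db, table)]
    if database == GLOBAL_TEMP_DB:
        # Global temporary view, or an error if it is not defined
        if table not in global_temp_views:
            raise NoSuchTableException(f"{database}.{table}")
        return view_metadata(GLOBAL_TEMP_DB, table)
    # Any other database: plain metadata lookup in the external catalog
    return external_tables[(database, table)]


temp_views = {"tv": "temp view plan"}
global_temp_views = {"gv": "global view plan"}
external_tables = {
    ("default", "t"): {"identifier": ("default", "t"), "tableType": "MANAGED",
                       "storage": "/warehouse/t"},
}

local = get_temp_view_or_permanent_table_metadata(
    "tv", None, temp_views, global_temp_views, external_tables)
permanent = get_temp_view_or_permanent_table_metadata(
    "t", None, temp_views, global_temp_views, external_tables)
global_view = get_temp_view_or_permanent_table_metadata(
    "gv", GLOBAL_TEMP_DB, temp_views, global_temp_views, external_tables)
```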

Note

getTempViewOrPermanentTableMetadata is used when:

Reporting NoSuchDatabaseException When Specified Database Does Not Exist — requireDbExists Internal Method

requireDbExists reports a NoSuchDatabaseException if the specified database does not exist. Otherwise, requireDbExists does nothing.

reset Method

reset…​FIXME

Note
reset is used exclusively in the Spark SQL internal tests.

Dropping Global Temporary View — dropGlobalTempView Method

dropGlobalTempView simply requests the GlobalTempViewManager to remove the global temporary view (by name).

Note
dropGlobalTempView is used when…​FIXME

Dropping Table — dropTable Method

dropTable…​FIXME

Note

dropTable is used when:

Getting Global Temporary View (Definition) — getGlobalTempView Method

getGlobalTempView…​FIXME

Note
getGlobalTempView is used when…​FIXME

registerFunction Method

registerFunction…​FIXME

Note

registerFunction is used when:

  • SessionCatalog is requested to lookupFunction

  • HiveSessionCatalog is requested to lookupFunction0

  • CreateFunctionCommand logical command is executed

lookupFunctionInfo Method

lookupFunctionInfo…​FIXME

Note
lookupFunctionInfo is used when…​FIXME

alterTableDataSchema Method

alterTableDataSchema…​FIXME

Note
alterTableDataSchema is used when…​FIXME