AbstractSqlParser — Base SQL Parsing Infrastructure
AbstractSqlParser is the base of ParserInterfaces that use an AstBuilder to parse SQL statements and convert them to Spark SQL entities, i.e. DataType, StructType, Expression, LogicalPlan and TableIdentifier.
AbstractSqlParser is the foundation of the SQL parsing infrastructure.

    package org.apache.spark.sql.catalyst.parser

    abstract class AbstractSqlParser extends ParserInterface {
      // only required properties (vals and methods) that have no implementation
      // the others follow
      def astBuilder: AstBuilder
    }

| Method | Description |
|---|---|
| astBuilder | AstBuilder for parsing SQL statements. Used in all the parse methods. |
| Name | Description |
|---|---|
| SparkSqlParser | The default SQL parser in SessionState. |
| CatalystSqlParser | Creates a DataType or a StructType (schema) from their canonical string representation. |
Setting Up SqlBaseLexer and SqlBaseParser for Parsing — parse Method

    parse[T](command: String)(toResult: SqlBaseParser => T): T

parse sets up a proper ANTLR parsing infrastructure with SqlBaseLexer and SqlBaseParser (which are the ANTLR-specific classes of Spark SQL that are auto-generated at build time from the SqlBase.g4 grammar).
Tip: Review the ANTLR grammar of Spark SQL in sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4.
Internally, parse first prints out the following INFO message to the logs:

    INFO SparkSqlParser: Parsing command: [command]

Tip: Enable INFO logging level for the concrete AbstractSqlParser in use, i.e. SparkSqlParser or CatalystSqlParser, to see the above INFO message.
parse then creates and sets up a SqlBaseLexer and a SqlBaseParser, and passes the parser on to the input toResult function, where the parsing actually happens.
Note: parse tries the SLL prediction mode first and falls back to LL mode if that fails.
In case of parsing errors, parse reports a ParseException.
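The error path can be observed with a short sketch. It assumes spark-catalyst is on the classpath (for example inside spark-shell) and uses CatalystSqlParser as the concrete AbstractSqlParser; the misspelled statement and the `outcome` labels are illustrative only.

```scala
import scala.util.{Failure, Success, Try}
import org.apache.spark.sql.catalyst.parser.{CatalystSqlParser, ParseException}

// "SELEC" is a deliberate typo so that parsing fails.
val attempt = Try(CatalystSqlParser.parsePlan("SELEC 1"))

// Classify the result: a syntax error should surface as a ParseException.
val outcome = attempt match {
  case Success(plan)              => s"parsed: ${plan.nodeName}"
  case Failure(_: ParseException) => "ParseException"
  case Failure(other)             => s"unexpected: ${other.getClass.getName}"
}
println(outcome)
```

Wrapping the call in Try keeps the example free of try/catch noise; production code may prefer to let the exception propagate.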
Note: parse is used in all the parse methods, i.e. parseDataType, parseExpression, parsePlan, parseTableIdentifier, and parseTableSchema.
parseDataType Method

    parseDataType(sqlText: String): DataType

Note: parseDataType is part of ParserInterface Contract to…FIXME.
parseDataType…FIXME
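As a hedged illustration of what parseDataType returns, the following sketch calls it through CatalystSqlParser, assuming spark-catalyst is on the classpath (e.g. in spark-shell). The type strings are arbitrary examples.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.types._

// Primitive, collection and nested types from their string representations.
val intType    = CatalystSqlParser.parseDataType("int")
val arrayType  = CatalystSqlParser.parseDataType("array<string>")
val structType = CatalystSqlParser.parseDataType("struct<id:int,name:string>")

println(intType)
println(arrayType)
println(structType.simpleString)
```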
parseExpression Method

    parseExpression(sqlText: String): Expression

Note: parseExpression is part of ParserInterface Contract to…FIXME.
parseExpression…FIXME
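A small sketch of parseExpression in use, again via CatalystSqlParser and assuming spark-catalyst is available. The expression text is made up; the point is that the result is an unresolved Catalyst Expression tree, since the parser knows nothing about the column `a`.

```scala
import org.apache.spark.sql.catalyst.expressions.Add
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// Parse a standalone SQL expression (not a full statement).
val expr = CatalystSqlParser.parseExpression("a + 1")

// The root node is an addition; its left child refers to a column
// that has not been resolved against any schema yet.
println(expr.getClass.getSimpleName)
println(expr.resolved)
```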
parseFunctionIdentifier Method

    parseFunctionIdentifier(sqlText: String): FunctionIdentifier

Note: parseFunctionIdentifier is part of ParserInterface Contract to…FIXME.
parseFunctionIdentifier…FIXME
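For orientation, here is a minimal sketch of parseFunctionIdentifier called through CatalystSqlParser (assuming spark-catalyst on the classpath). The names `my_db` and `my_func` are hypothetical.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// A database-qualified function name is split into its two parts.
val fn = CatalystSqlParser.parseFunctionIdentifier("my_db.my_func")

println(s"database=${fn.database}, function=${fn.funcName}")
```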
parseTableIdentifier Method

    parseTableIdentifier(sqlText: String): TableIdentifier

Note: parseTableIdentifier is part of ParserInterface Contract to…FIXME.
parseTableIdentifier…FIXME
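The sketch below shows parseTableIdentifier on a qualified and an unqualified name, using CatalystSqlParser and assuming spark-catalyst is available. The table and database names are placeholders.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// Qualified name: the database part is captured separately.
val qualified = CatalystSqlParser.parseTableIdentifier("my_db.my_table")

// Unqualified name: no database is recorded.
val unqualified = CatalystSqlParser.parseTableIdentifier("my_table")

println(s"${qualified.database} / ${qualified.table}")
println(s"${unqualified.database} / ${unqualified.table}")
```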
parseTableSchema Method

    parseTableSchema(sqlText: String): StructType

Note: parseTableSchema is part of ParserInterface Contract to…FIXME.
parseTableSchema…FIXME
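To make the return type concrete, this sketch parses a DDL-style column list with parseTableSchema through CatalystSqlParser (spark-catalyst assumed on the classpath). The column names are invented for the example.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.types._

// A comma-separated list of "name TYPE" pairs becomes a StructType.
val schema = CatalystSqlParser.parseTableSchema("id INT, name STRING")

// Inspect the resulting fields.
schema.fields.foreach(f => println(s"${f.name}: ${f.dataType.simpleString}"))
```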
parsePlan Method

    parsePlan(sqlText: String): LogicalPlan

Note: parsePlan is part of ParserInterface Contract to…FIXME.
parsePlan creates a LogicalPlan for a given SQL textual statement.
Internally, parsePlan builds a SqlBaseParser and requests AstBuilder to parse a single SQL statement.
If a SQL statement could not be parsed, parsePlan throws a ParseException:

    Unsupported SQL statement

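A minimal sketch of parsePlan on a well-formed statement, using CatalystSqlParser and assuming spark-catalyst is on the classpath. The table `people` is hypothetical and need not exist: parsing only builds an unresolved logical plan and does not consult any catalog.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

// Parse a single SQL statement into a logical plan.
val plan: LogicalPlan =
  CatalystSqlParser.parsePlan("SELECT id FROM people WHERE id > 1")

// Print the operator tree; relations and attributes are still unresolved.
println(plan.treeString)
println(plan.resolved)
```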