DataSource — Pluggable Data Source
DataSource is…FIXME
DataSource is created when…FIXME
Tip
Read DataSource — Pluggable Data Sources (for Spark SQL’s batch structured queries).
Name | Description |
---|---|
 | java.lang.Class that corresponds to the className (which can be a fully-qualified class name or an alias of the data source). Used when: |
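As a rough illustration of how a className that is either an alias or a fully-qualified class name could resolve to a java.lang.Class, consider the following minimal sketch. The `DemoSource` class, the `aliases` table and the `resolveClass` helper are all hypothetical names made up for this example; Spark's own lookup mechanism is not shown in this section and may work differently.

```scala
// Hypothetical data source class, used only to have something to resolve.
final class DemoSource

object ProvidingClassSketch {
  // Hypothetical alias table (an assumption for illustration only).
  private val aliases: Map[String, String] =
    Map("demo" -> classOf[DemoSource].getName)

  // Resolves an alias first and otherwise treats the input
  // as a fully-qualified class name.
  def resolveClass(className: String): Class[_] =
    Class.forName(aliases.getOrElse(className, className))
}
```

With this sketch, `ProvidingClassSketch.resolveClass("demo")` and `ProvidingClassSketch.resolveClass(classOf[DemoSource].getName)` both give the same `Class` object, while an unknown name fails with a `ClassNotFoundException`.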
Describing Name and Schema of Streaming Source — sourceSchema Internal Method
sourceSchema(): SourceInfo
sourceSchema …FIXME
Note
sourceSchema is used exclusively when DataSource is requested for the SourceInfo.
Creating DataSource Instance
DataSource takes the following when created:
DataSource initializes the internal registries and counters.
createSource Method
createSource(metadataPath: String): Source
createSource …FIXME
Note
createSource is used when…FIXME
Creating Streaming Sink — createSink Method
createSink(outputMode: OutputMode): Sink
createSink creates a streaming sink for StreamSinkProvider or FileFormat data sources.
Tip
Find out more on FileFormat data sources in the FileFormat — Data Sources to Read and Write Data In Files section of The Internals of Spark SQL book.
Internally, createSink creates a new instance of the providingClass and branches off per type:
- For a StreamSinkProvider, createSink simply delegates the call and requests it to create a streaming sink
- For a FileFormat, createSink creates a FileStreamSink when the path option is specified and the output mode is Append
createSink throws an IllegalArgumentException when the path option is not specified for a FileFormat data source:
'path' is not specified
createSink throws an AnalysisException when the given OutputMode is different from Append for a FileFormat data source:
Data source [className] does not support [outputMode] output mode
createSink throws an UnsupportedOperationException for unsupported data source formats:
Data source [className] does not support streamed writing
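The branching and the three error cases described above can be modelled with a small, self-contained sketch. It does not use Spark: every type below (`OutputMode`, `Sink`, `FileStreamSink`, `StreamSinkProvider`, `FileFormat`, `AnalysisException`) is a simplified, hypothetical stand-in that only borrows its name from the text, and the parameter list of `createSink` is an assumption made so the example can run on its own. The order of the checks follows the order in which the text lists them.

```scala
// Simplified stand-ins for the Spark types named in the text (assumptions).
sealed trait OutputMode
case object Append extends OutputMode
case object Complete extends OutputMode

trait Sink
final case class FileStreamSink(path: String) extends Sink

trait StreamSinkProvider {
  def createSink(options: Map[String, String], outputMode: OutputMode): Sink
}
trait FileFormat

// Local stand-in exception; not Spark's AnalysisException.
final class AnalysisException(message: String) extends Exception(message)

object CreateSinkSketch {
  // `provider` plays the role of the freshly created providingClass instance.
  def createSink(
      className: String,
      provider: Any,
      options: Map[String, String],
      outputMode: OutputMode): Sink = provider match {
    case s: StreamSinkProvider =>
      // Delegate: the provider creates the streaming sink itself.
      s.createSink(options, outputMode)
    case _: FileFormat =>
      // A file-based sink needs a path and only supports Append.
      val path = options.getOrElse("path",
        throw new IllegalArgumentException("'path' is not specified"))
      if (outputMode != Append) {
        throw new AnalysisException(
          s"Data source $className does not support $outputMode output mode")
      }
      FileStreamSink(path)
    case _ =>
      throw new UnsupportedOperationException(
        s"Data source $className does not support streamed writing")
  }
}
```

For example, `CreateSinkSketch.createSink("parquet", new FileFormat {}, Map("path" -> "/tmp/out"), Append)` yields `FileStreamSink("/tmp/out")`, whereas passing `Complete` or an empty options map raises the corresponding exception.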
Note
createSink is used exclusively when DataStreamWriter is requested to create and start a streaming query.
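From the user's side, the FileFormat path through createSink would typically be reached by starting a streaming query that writes to a file-based format. The sketch below assumes Spark SQL is on the classpath and uses the built-in `rate` source and `parquet` format; the object name, the temporary directories and the five-second wait are choices made for this example only.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

object FileSinkDemo {
  // Starts a short-lived streaming query and returns the output directory.
  def run(): String = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("file-sink-demo")
      .getOrCreate()

    // Temporary locations, chosen for this example only.
    val outDir = Files.createTempDirectory("sink-out").toString
    val checkpointDir = Files.createTempDirectory("sink-checkpoint").toString

    val query = spark.readStream
      .format("rate")                 // built-in test source
      .load()
      .writeStream
      .format("parquet")              // a file-based format
      .option("path", outDir)         // required for file-based sinks
      .option("checkpointLocation", checkpointDir)
      .outputMode("append")           // the only mode file sinks accept
      .start()

    query.awaitTermination(5000)      // let it run briefly
    query.stop()
    spark.stop()
    outDir
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Dropping the `path` option or switching to `outputMode("complete")` in this sketch should surface the error messages listed earlier.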