KafkaOffsetReader
KafkaOffsetReader is used to query a Kafka cluster for partition offsets.
KafkaOffsetReader is created when:
- KafkaRelation is requested to build a distributed data scan with column pruning (as a TableScan) (to get the initial partition offsets)
- (Spark Structured Streaming) KafkaSourceProvider is requested to createSource and createContinuousReader
When requested for the human-readable text representation (aka toString), KafkaOffsetReader simply requests the ConsumerStrategy for one.
Name | Default Value | Description |
---|---|---|
… | … | How long to wait before retries. |
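The retry interval described in the table above can be pictured with a small retry helper. This is only a sketch under stated assumptions: withRetries, maxAttempts and retryIntervalMs are hypothetical names chosen for illustration, not the option names or the code that KafkaOffsetReader actually uses.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

object RetrySketch {
  /** Runs `body` up to `maxAttempts` times, sleeping `retryIntervalMs`
    * milliseconds after every failed attempt except the last one.
    * All names here are hypothetical and used for illustration only.
    */
  def withRetries[T](maxAttempts: Int, retryIntervalMs: Long)(body: => T): T = {
    require(maxAttempts >= 1, "maxAttempts must be at least 1")

    @tailrec
    def loop(attempt: Int): T =
      Try(body) match {
        case Success(result) => result
        // Out of attempts: surface the last error to the caller.
        case Failure(error) if attempt >= maxAttempts => throw error
        case Failure(_) =>
          Thread.sleep(retryIntervalMs) // how long to wait before retrying
          loop(attempt + 1)
      }

    loop(1)
  }
}
```

A caller would wrap an offset lookup in RetrySketch.withRetries(maxAttempts = 3, retryIntervalMs = 1000L) { ... }, so that a transient failure is retried after the configured pause rather than failing immediately.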
Name | Description |
---|---|
consumer | Kafka’s Consumer (with keys and values of Array[Byte] type). Initialized when KafkaOffsetReader is created. Used when… |
Tip
Enable… Add the following line to… Refer to Logging.
Creating Kafka Consumer — createConsumer
Internal Method
createConsumer(): Consumer[Array[Byte], Array[Byte]]
createConsumer requests the ConsumerStrategy to create a Kafka Consumer with driverKafkaParams and a newly generated group.id Kafka property.
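As a rough illustration of combining the driver parameters with a fresh group.id, the following sketch copies a parameter map and overrides that single property. The freshGroupId helper and its prefix-plus-counter naming scheme are assumptions made for this example; how KafkaOffsetReader actually generates the identifier is not described here.

```scala
import java.{util => ju}
import java.util.concurrent.atomic.AtomicInteger

object GroupIdSketch {
  private val nextId = new AtomicInteger(0)

  // Hypothetical naming scheme (an assumption): a prefix followed by a counter.
  def freshGroupId(prefix: String): String =
    s"$prefix-${nextId.getAndIncrement()}"

  /** Returns a copy of `driverKafkaParams` with `group.id` set to `groupId`.
    * The input map is left untouched so it can be reused for later consumers.
    */
  def consumerParams(
      driverKafkaParams: ju.Map[String, Object],
      groupId: String): ju.Map[String, Object] = {
    val params = new ju.HashMap[String, Object](driverKafkaParams)
    params.put("group.id", groupId) // standard Kafka consumer property
    params
  }
}
```

Copying the map instead of mutating it means that every newly created consumer can receive its own group.id without affecting the shared driver parameters.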
Note
createConsumer is used when KafkaOffsetReader is created (and initializes consumer) and when requested to resetConsumer.
Creating KafkaOffsetReader Instance
KafkaOffsetReader takes the following when created:
KafkaOffsetReader initializes the internal registries and counters.
fetchEarliestOffsets
Method
fetchEarliestOffsets(): Map[TopicPartition, Long]
fetchEarliestOffsets…FIXME
Note
fetchEarliestOffsets is used when…FIXME
fetchEarliestOffsets
Method
fetchEarliestOffsets(newPartitions: Seq[TopicPartition]): Map[TopicPartition, Long]
fetchEarliestOffsets…FIXME
Note
fetchEarliestOffsets is used when…FIXME
fetchLatestOffsets
Method
fetchLatestOffsets(): Map[TopicPartition, Long]
fetchLatestOffsets…FIXME
Note
fetchLatestOffsets is used when…FIXME
Fetching (and Pausing) Assigned Kafka TopicPartitions — fetchTopicPartitions
Method
fetchTopicPartitions(): Set[TopicPartition]
fetchTopicPartitions uses an UninterruptibleThread thread to do the following:
- Requests the Kafka Consumer to poll (fetch data) for the topics and partitions (with 0 timeout)
- Requests the Kafka Consumer to get the set of partitions currently assigned
- Requests the Kafka Consumer to suspend fetching from the partitions assigned
In the end, fetchTopicPartitions returns the TopicPartitions assigned (and paused).
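The three steps above can be sketched as follows. To keep the example runnable without the Kafka client library, TopicPartition, PartitionConsumer and InMemoryConsumer are hypothetical stand-ins that only mirror the poll, assignment and pause calls of Kafka's Consumer. The UninterruptibleThread wrapping is left out, so this shows the call order only, not the actual implementation.

```scala
object FetchTopicPartitionsSketch {
  // Hypothetical stand-in for Kafka's TopicPartition.
  final case class TopicPartition(topic: String, partition: Int)

  // Hypothetical stand-in exposing only the three consumer calls used here.
  trait PartitionConsumer {
    def poll(timeoutMs: Long): Unit
    def assignment(): Set[TopicPartition]
    def pause(partitions: Set[TopicPartition]): Unit
  }

  // In-memory implementation: partitions become assigned on the first poll.
  final class InMemoryConsumer(subscribed: Set[TopicPartition])
      extends PartitionConsumer {
    private var assigned = Set.empty[TopicPartition]
    private var pausedPartitions = Set.empty[TopicPartition]

    def poll(timeoutMs: Long): Unit = { assigned = subscribed }
    def assignment(): Set[TopicPartition] = assigned
    def pause(partitions: Set[TopicPartition]): Unit = {
      pausedPartitions ++= partitions
    }
    def paused: Set[TopicPartition] = pausedPartitions
  }

  def fetchTopicPartitions(consumer: PartitionConsumer): Set[TopicPartition] = {
    consumer.poll(0L)                      // 1. poll with a 0 timeout
    val partitions = consumer.assignment() // 2. partitions currently assigned
    consumer.pause(partitions)             // 3. suspend fetching from them
    partitions                             // assigned (and paused) partitions
  }
}
```

In this model the poll is what triggers partition assignment, which is why it comes first even though no records are expected from it. Pausing afterwards keeps the consumer from fetching data while only partition information is needed.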
Note
fetchTopicPartitions is used exclusively when KafkaRelation is requested to build a distributed data scan with column pruning (as a TableScan) through getPartitionOffsets.
nextGroupId
Internal Method
nextGroupId(): String
nextGroupId…FIXME
Note
nextGroupId is used when…FIXME
resetConsumer
Internal Method
resetConsumer(): Unit
resetConsumer…FIXME
Note
resetConsumer is used when…FIXME