KafkaOffsetReader

KafkaOffsetReader is used to query a Kafka cluster for partition offsets.

KafkaOffsetReader is created when:

When requested for the human-readable text representation (aka toString), KafkaOffsetReader simply requests the ConsumerStrategy for one.

Table 1. KafkaOffsetReader’s Options
Name | Default Value | Description
fetchOffset.numRetries | 3 |
fetchOffset.retryIntervalMs | 1000 | How long to wait before retries.
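
As an illustration of how the two options could be read and applied, here is a minimal sketch. It is not Spark's implementation: the object name, `retrySettings`, and `withRetries` are hypothetical helpers, and treating fetchOffset.numRetries as the total number of attempts is an assumption.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical sketch, not the Spark source.
object OffsetReaderOptionsSketch {
  // Option names and default values as listed in Table 1.
  val NumRetriesKey = "fetchOffset.numRetries"
  val RetryIntervalKey = "fetchOffset.retryIntervalMs"

  /** Reads both options from the reader options, falling back to the defaults. */
  def retrySettings(options: Map[String, String]): (Int, Long) = {
    val attempts = options.getOrElse(NumRetriesKey, "3").toInt
    val intervalMs = options.getOrElse(RetryIntervalKey, "1000").toLong
    (attempts, intervalMs)
  }

  /**
   * Runs `body` up to `attemptsLeft` times (assumption: the option counts
   * attempts), sleeping `intervalMs` after every failed attempt except the last.
   */
  def withRetries[T](attemptsLeft: Int, intervalMs: Long)(body: => T): T =
    Try(body) match {
      case Success(value) => value
      case Failure(_) if attemptsLeft > 1 =>
        Thread.sleep(intervalMs)
        withRetries(attemptsLeft - 1, intervalMs)(body)
      case Failure(error) => throw error
    }
}
```

With an empty options map, `retrySettings` falls back to 3 and 1000, so a failing offset query would be attempted three times with one second between attempts.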

Table 2. KafkaOffsetReader’s Internal Registries and Counters
Name | Description
consumer | Kafka’s Consumer (with keys and values of Array[Byte] type). Initialized when KafkaOffsetReader is created. Used when KafkaOffsetReader:
execContext |
groupId |
kafkaReaderThread |
maxOffsetFetchAttempts |
nextId |
offsetFetchAttemptIntervalMs |

Tip

Enable INFO or DEBUG logging levels for org.apache.spark.sql.kafka010.KafkaOffsetReader to see what happens inside.

Add the following line to conf/log4j.properties:
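
Assuming the log4j 1.x properties syntax used by conf/log4j.properties, such a line would look like the following (use INFO instead of DEBUG for less output):

```properties
log4j.logger.org.apache.spark.sql.kafka010.KafkaOffsetReader=DEBUG
```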

Refer to Logging.

Creating Kafka Consumer — createConsumer Internal Method

Note
createConsumer is used when KafkaOffsetReader is created (to initialize the consumer internal registry) and when it is requested to resetConsumer.

Creating KafkaOffsetReader Instance

KafkaOffsetReader takes the following when created:

  • ConsumerStrategy

  • Kafka parameters (as Map[String, Object])

  • Reader options (as Map[String, String])

  • Prefix for the group id

KafkaOffsetReader initializes the internal registries and counters.

close Method

close…​FIXME

Note
close is used when…​FIXME

fetchEarliestOffsets Method

fetchEarliestOffsets…​FIXME

Note
fetchEarliestOffsets is used when…​FIXME

fetchLatestOffsets Method

fetchLatestOffsets…​FIXME

Note
fetchLatestOffsets is used when…​FIXME

Fetching (and Pausing) Assigned Kafka TopicPartitions — fetchTopicPartitions Method

fetchTopicPartitions uses an UninterruptibleThread to do the following:

  1. Requests the Kafka Consumer to poll (fetch data) for the topics and partitions (with a timeout of 0)

  2. Requests the Kafka Consumer to get the set of partitions currently assigned

  3. Requests the Kafka Consumer to suspend fetching from the partitions assigned

In the end, fetchTopicPartitions returns the TopicPartitions assigned (and paused).
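
The three steps above can be sketched as follows. To keep the sketch runnable without a Kafka cluster, `TopicPartition` and `PartitionConsumer` are hypothetical stand-ins that mirror only the poll, assignment, and pause calls of Kafka's Consumer. The UninterruptibleThread part is left out, so this is an outline of the call sequence rather than the actual Spark code.

```scala
// Hypothetical sketch of the call sequence, not the Spark source.
object FetchTopicPartitionsSketch {
  // Stand-in for Kafka's TopicPartition (topic name plus partition number).
  final case class TopicPartition(topic: String, partition: Int)

  // Stand-in mirroring the three Kafka Consumer calls used by the steps above.
  trait PartitionConsumer {
    def poll(timeoutMs: Long): Unit
    def assignment(): Set[TopicPartition]
    def pause(partitions: Set[TopicPartition]): Unit
  }

  def fetchTopicPartitions(consumer: PartitionConsumer): Set[TopicPartition] = {
    // 1. Poll with a timeout of 0.
    consumer.poll(0)
    // 2. Ask for the partitions currently assigned.
    val partitions = consumer.assignment()
    // 3. Suspend fetching from those partitions.
    consumer.pause(partitions)
    // Return the assigned (and now paused) partitions.
    partitions
  }
}
```

Pausing right after the assignment is read means later polls on the same consumer return no records for those partitions, which fits a reader that only cares about offsets.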

Note
fetchTopicPartitions is used exclusively when KafkaRelation is requested to build a distributed data scan with column pruning (as a TableScan) through getPartitionOffsets.

nextGroupId Internal Method

nextGroupId…​FIXME

Note
nextGroupId is used when…​FIXME

resetConsumer Internal Method

resetConsumer…​FIXME

Note
resetConsumer is used when…​FIXME

runUninterruptibly Internal Method

runUninterruptibly…​FIXME

Note
runUninterruptibly is used when…​FIXME

withRetriesWithoutInterrupt Internal Method

withRetriesWithoutInterrupt…​FIXME

Note
withRetriesWithoutInterrupt is used when…​FIXME