CommitLog

CommitLog — HDFSMetadataLog for Batch Completion Log

CommitLog is an HDFSMetadataLog whose metadata is regular text (i.e. String).

Note
HDFSMetadataLog is a MetadataLog that uses Hadoop HDFS for reliable storage.

CommitLog is created along with StreamExecution.

add Method

add…​FIXME

Note
add is used when…​FIXME

add Method

Note
add is part of MetadataLog Contract to…​FIXME.

add…​FIXME

serialize Method

Note
serialize is part of HDFSMetadataLog Contract to write metadata in a serialized format.

serialize writes the version prefixed with v on a single line (e.g. v1), followed by an empty JSON object (i.e. {}).

Note
The version in Spark 2.2 is 1, and the charset is UTF-8.

Note
serialize always writes an empty JSON object because the name of the file already carries the meaning.
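
The output format described above can be sketched in plain Scala using only JDK classes. This is a minimal illustration under stated assumptions, not Spark's actual code: the object name CommitLogFormatSketch, the constants Version and EmptyJson, and the helper serializeToString are hypothetical names chosen for this example.

```scala
import java.io.{ByteArrayOutputStream, OutputStream}
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical stand-alone sketch; it does not extend HDFSMetadataLog.
object CommitLogFormatSketch {
  // Values taken from the text: version 1 (Spark 2.2) and an empty JSON payload.
  val Version = 1
  val EmptyJson = "{}"

  // Writes the version line (e.g. v1), a newline, then the empty JSON.
  // The metadata argument is deliberately ignored, since the payload is always {}.
  def serialize(metadata: String, out: OutputStream): Unit = {
    out.write(s"v$Version".getBytes(UTF_8))
    out.write('\n')
    out.write(EmptyJson.getBytes(UTF_8))
  }

  // Convenience helper (an assumption for this example) that captures the bytes as a String.
  def serializeToString(metadata: String): String = {
    val out = new ByteArrayOutputStream()
    serialize(metadata, out)
    new String(out.toByteArray, UTF_8)
  }
}
```

Calling CommitLogFormatSketch.serializeToString with any input yields the two lines v1 and {}, which shows that the payload itself carries no per-batch information.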

deserialize Method

Note
deserialize is part of HDFSMetadataLog Contract to…​FIXME.

deserialize…​FIXME

Creating CommitLog Instance

CommitLog takes the following when created:

  • SparkSession

  • Path of the metadata log directory
