CommitLog — HDFSMetadataLog for Batch Completion Log
CommitLog is an HDFSMetadataLog with metadata as regular text (i.e. String).
Note: HDFSMetadataLog is a MetadataLog that uses Hadoop HDFS for reliable storage.
CommitLog is created along with StreamExecution.
add Method
add(batchId: Long, metadata: String): Boolean
Note: add is part of MetadataLog Contract to…FIXME.
add…FIXME
serialize Method
serialize(metadata: String, out: OutputStream): Unit
Note: serialize is part of HDFSMetadataLog Contract to write metadata in a serialized format.
serialize writes out the version prefixed with v on a single line (e.g. v1) followed by the empty JSON (i.e. {}).
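The format described above can be sketched in a few lines of Scala. This is a minimal stand-alone illustration, not Spark's CommitLog class: the object name CommitLogFormatSketch, the constants Version and EmptyJson, and the helper render are hypothetical, and the sketch assumes version 1 and UTF-8 as in Spark 2.2.

```scala
import java.io.{ByteArrayOutputStream, OutputStream}
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical stand-alone sketch; this is not Spark's CommitLog class.
object CommitLogFormatSketch {
  // Assumption: version 1, as in Spark 2.2.
  val Version = 1
  // Assumption: the empty JSON payload is the literal "{}".
  val EmptyJson = "{}"

  // Writes "v<version>" on its own line, followed by the empty JSON.
  // The metadata argument is deliberately ignored, since the text
  // says an empty JSON is always written.
  def serialize(metadata: String, out: OutputStream): Unit = {
    out.write(s"v$Version".getBytes(UTF_8))
    out.write('\n'.toInt)
    out.write(EmptyJson.getBytes(UTF_8))
  }

  // Hypothetical helper: serializes into memory and returns the text.
  def render(metadata: String = ""): String = {
    val out = new ByteArrayOutputStream()
    serialize(metadata, out)
    new String(out.toByteArray, UTF_8)
  }
}
```

Calling CommitLogFormatSketch.render() yields the two-line text v1 and {} regardless of the metadata passed in.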
Note: The version in Spark 2.2 is 1, with the charset being UTF-8.
Note: serialize always writes an empty JSON because the name of the file (the batch ID) already carries the meaning.
$ ls -tr [checkpoint-directory]/commits
0 1 2 3 4 5 6 7 8 9

$ cat [checkpoint-directory]/commits/8
v1
{}
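To check that a commit file has the layout shown in the listing above, a small reader can split off the version line and return the rest. The sketch below is only an illustration of that layout under the assumption of UTF-8 text. It is not Spark's deserialize, and the names CommitFileReaderSketch, read and readString are hypothetical.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import scala.io.Source

// Hypothetical reader for the two-line commit file layout; not Spark's deserialize.
object CommitFileReaderSketch {
  // Returns (version, payload). Throws IllegalStateException when the
  // first line is not "v" followed by digits.
  def read(in: InputStream): (Int, String) = {
    // Assumption: the file is UTF-8 encoded text.
    val lines = Source.fromInputStream(in, "UTF-8").getLines().toList
    lines match {
      case first :: rest
          if first.startsWith("v") && first.length > 1 && first.drop(1).forall(_.isDigit) =>
        (first.drop(1).toInt, rest.mkString("\n"))
      case _ =>
        throw new IllegalStateException("Expected a version line such as v1")
    }
  }

  // Convenience wrapper for in-memory content.
  def readString(content: String): (Int, String) =
    read(new ByteArrayInputStream(content.getBytes("UTF-8")))
}
```

Given the content of commits/8 from the listing, readString returns version 1 and the payload {}. Content without a leading version line is rejected.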
deserialize Method
deserialize(in: InputStream): String
Note: deserialize is part of HDFSMetadataLog Contract to…FIXME.
deserialize…FIXME