CommitLog — HDFSMetadataLog for Batch Completion Log
CommitLog is an HDFSMetadataLog with metadata as regular text (i.e. String).
Note: HDFSMetadataLog is a MetadataLog that uses Hadoop HDFS for reliable storage.
CommitLog is created along with StreamExecution.
add Method
    add(batchId: Long, metadata: String): Boolean
Note: add is part of MetadataLog Contract to…FIXME.
add …FIXME
serialize Method
    serialize(metadata: String, out: OutputStream): Unit
Note: serialize is part of HDFSMetadataLog Contract to write metadata in a serialized format.
serialize writes out the version prefixed with v on a single line (e.g. v1), followed by an empty JSON object (i.e. {}).
Note: In Spark 2.2 the version is 1 and the charset is UTF-8.
Note: serialize always writes an empty JSON object because the name of the file (the batch ID) already carries the meaning.
    $ ls -tr [checkpoint-directory]/commits
    0 1 2 3 4 5 6 7 8 9

    $ cat [checkpoint-directory]/commits/8
    v1
    {}
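The on-disk format described above can be illustrated with a minimal, standalone sketch. It is not Spark's source code: the object name CommitLogFormatSketch and the helper render are hypothetical, and the version number 1 and the UTF-8 charset are taken from the note on Spark 2.2 above.

```scala
import java.io.{ByteArrayOutputStream, OutputStream}
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical, standalone illustration of the commit file format.
object CommitLogFormatSketch {
  // Assumption: version 1, as stated for Spark 2.2.
  val Version = 1
  // The payload is always an empty JSON object.
  val EmptyJson = "{}"

  // Writes "v<version>" on its own line, followed by the empty JSON, as UTF-8.
  // The metadata argument is deliberately ignored in this sketch, since
  // the file name is what carries the meaning.
  def serialize(metadata: String, out: OutputStream): Unit = {
    out.write(s"v$Version\n$EmptyJson".getBytes(UTF_8))
  }

  // Hypothetical helper: serialize into memory and return the resulting text.
  def render(metadata: String): String = {
    val out = new ByteArrayOutputStream()
    serialize(metadata, out)
    new String(out.toByteArray, UTF_8)
  }
}
```

Under these assumptions, CommitLogFormatSketch.render("anything") produces the same two lines (v1 and {}) as the cat output in the listing above, regardless of the metadata passed in.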
deserialize Method
    deserialize(in: InputStream): String
Note: deserialize is part of HDFSMetadataLog Contract to…FIXME.
deserialize …FIXME