CompressionCodec

With spark.broadcast.compress enabled (which is the default), TorrentBroadcast uses compression for broadcast blocks.

Caution: FIXME What’s compressed?

Table 1. Built-in Compression Codecs

| Codec Alias | Fully-Qualified Class Name | Notes |
|---|---|---|
| lz4 | org.apache.spark.io.LZ4CompressionCodec | The default implementation. |
| lzf | org.apache.spark.io.LZFCompressionCodec | |
| snappy | org.apache.spark.io.SnappyCompressionCodec | The fallback when the default codec is not available. |

An implementation of the CompressionCodec trait has to offer a constructor that accepts a single SparkConf argument. See the section on the createCodec factory method in this document.

You can control the default compression codec in a Spark application using the spark.io.compression.codec Spark property.
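One common way to set a Spark property is to pass it to spark-submit as a `--conf key=value` argument. The snippet below is only a minimal sketch of assembling such a command line; the helper `conf_args` and the script name `my_app.py` are hypothetical and not part of Spark.

```python
from typing import Dict, List


def conf_args(settings: Dict[str, str]) -> List[str]:
    """Turn a settings map into repeated `--conf key=value` arguments."""
    args: List[str] = []
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    return args


# my_app.py is a hypothetical application script used for illustration only.
command = [
    "spark-submit",
    *conf_args({"spark.io.compression.codec": "snappy"}),
    "my_app.py",
]
print(" ".join(command))
```

Any of the codec aliases from Table 1 could be used as the value in place of snappy.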

Creating CompressionCodec — createCodec Factory Method

createCodec uses the internal shortCompressionCodecNames lookup table to find the input codecName (regardless of the case).

createCodec finds the constructor of the compression codec’s implementation that accepts a single SparkConf argument.

If a compression codec could not be found, createCodec throws an IllegalArgumentException.
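The lookup described above can be sketched roughly as follows. This is a simplified Python model, not Spark's Scala code: `Conf`, `Codec`, the three codec classes, and `create_codec` are hypothetical stand-ins, `ValueError` plays the role of IllegalArgumentException, and the error message wording is made up. The sketch covers only the alias lookup and the single-argument constructor convention.

```python
from typing import Dict, Type


class Conf(dict):
    """Hypothetical stand-in for SparkConf: a plain key/value map."""


class Codec:
    """Hypothetical base type: the conf is the only constructor argument."""

    def __init__(self, conf: Conf) -> None:
        self.conf = conf


class LZ4Codec(Codec):
    pass


class LZFCodec(Codec):
    pass


class SnappyCodec(Codec):
    pass


# Plays the role of the shortCompressionCodecNames lookup table.
SHORT_CODEC_NAMES: Dict[str, Type[Codec]] = {
    "lz4": LZ4Codec,
    "lzf": LZFCodec,
    "snappy": SnappyCodec,
}


def create_codec(conf: Conf, codec_name: str) -> Codec:
    # Look the alias up regardless of case.
    codec_class = SHORT_CODEC_NAMES.get(codec_name.lower())
    if codec_class is None:
        # Python analogue of IllegalArgumentException; the message is illustrative.
        raise ValueError(f"Codec [{codec_name}] is not available")
    # Every implementation is constructed from the conf alone.
    return codec_class(conf)
```

With this sketch, `create_codec(Conf(), "LZ4")` and `create_codec(Conf(), "lz4")` both return an `LZ4Codec` instance, while an unknown alias raises `ValueError`.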

getCodecName Method

getCodecName reads the spark.io.compression.codec Spark property from the input SparkConf (conf) and falls back to lz4 when the property is not set.
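The fallback behaviour amounts to a keyed lookup with a default. A minimal sketch, assuming the configuration is modelled as a plain Python mapping (the function `get_codec_name` and both constants are hypothetical names, not Spark's API):

```python
from typing import Mapping

CODEC_KEY = "spark.io.compression.codec"
DEFAULT_CODEC = "lz4"


def get_codec_name(conf: Mapping[str, str]) -> str:
    # Use the configured codec when present, otherwise assume the default.
    return conf.get(CODEC_KEY, DEFAULT_CODEC)
```

An empty mapping yields lz4, and a mapping that sets the property yields whatever value was configured.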

Settings

Table 2. Settings

| Name | Default value | Description |
|---|---|---|
| spark.io.compression.codec | lz4 | The compression codec to use. Used when getCodecName is called to find the current compression codec. |
