
CompressionCodecs


CompressionCodecs is a utility object…​FIXME

Table 1. Known Compression Codecs

Alias         Fully-Qualified Class Name
none          (no codec class)
uncompressed  (no codec class)
bzip2         org.apache.hadoop.io.compress.BZip2Codec
deflate       org.apache.hadoop.io.compress.DeflateCodec
gzip          org.apache.hadoop.io.compress.GzipCodec
lz4           org.apache.hadoop.io.compress.Lz4Codec
snappy        org.apache.hadoop.io.compress.SnappyCodec
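
The alias table can be read as a simple lookup. The following is a minimal sketch, not Spark's actual code: the object name `CodecAliases` and the helper `resolve` are hypothetical. The case-insensitive match and the fallback that treats an unknown name as an already fully-qualified class name are assumptions made for illustration.

```scala
object CodecAliases {
  // Aliases from Table 1. None marks the aliases that have no codec class.
  val knownCodecs: Map[String, Option[String]] = Map(
    "none"         -> None,
    "uncompressed" -> None,
    "bzip2"        -> Some("org.apache.hadoop.io.compress.BZip2Codec"),
    "deflate"      -> Some("org.apache.hadoop.io.compress.DeflateCodec"),
    "gzip"         -> Some("org.apache.hadoop.io.compress.GzipCodec"),
    "lz4"          -> Some("org.apache.hadoop.io.compress.Lz4Codec"),
    "snappy"       -> Some("org.apache.hadoop.io.compress.SnappyCodec")
  )

  // Hypothetical helper: look the alias up ignoring case (assumption);
  // anything not in the table is passed through as a class name (assumption).
  def resolve(name: String): Option[String] =
    knownCodecs.getOrElse(name.toLowerCase(java.util.Locale.ROOT), Some(name))
}
```

Under these assumptions, `CodecAliases.resolve("gzip")` yields the Gzip codec class name wrapped in `Some`, while `CodecAliases.resolve("none")` yields `None`, which a caller could treat as "codec not defined".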

setCodecConfiguration Method

setCodecConfiguration sets compression-related properties on the Hadoop Configuration according to the input codec.

Note
The input codec should be a fully-qualified class name, e.g. org.apache.hadoop.io.compress.SnappyCodec.

If the input codec is defined (i.e. not null), setCodecConfiguration sets the following configuration properties.

Table 2. Compression-Related Hadoop Configuration Properties (codec defined)

Name                                              Value
mapreduce.output.fileoutputformat.compress        true
mapreduce.output.fileoutputformat.compress.type   BLOCK
mapreduce.output.fileoutputformat.compress.codec  The input codec name
mapreduce.map.output.compress                     true
mapreduce.map.output.compress.codec               The input codec name

If the input codec is not defined (i.e. null), setCodecConfiguration sets the following configuration properties.

Table 3. Compression-Related Hadoop Configuration Properties (codec not defined)

Name                                        Value
mapreduce.output.fileoutputformat.compress  false
mapreduce.map.output.compress               false
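
The two cases in Tables 2 and 3 can be sketched as a single branch on whether the codec is null. This is an illustrative sketch rather than Spark's implementation: the object name `CodecConfigSketch` is hypothetical, and a plain mutable map stands in for the Hadoop Configuration so that the example runs without Hadoop on the classpath.

```scala
import scala.collection.mutable

object CodecConfigSketch {
  // Assumption: a mutable key/value map stands in for a Hadoop Configuration.
  type Conf = mutable.Map[String, String]

  def setCodecConfiguration(conf: Conf, codec: String): Unit = {
    if (codec != null) {
      // Codec defined: the properties listed in Table 2.
      conf("mapreduce.output.fileoutputformat.compress") = "true"
      conf("mapreduce.output.fileoutputformat.compress.type") = "BLOCK"
      conf("mapreduce.output.fileoutputformat.compress.codec") = codec
      conf("mapreduce.map.output.compress") = "true"
      conf("mapreduce.map.output.compress.codec") = codec
    } else {
      // Codec not defined: the properties listed in Table 3.
      conf("mapreduce.output.fileoutputformat.compress") = "false"
      conf("mapreduce.map.output.compress") = "false"
    }
  }
}
```

Calling `CodecConfigSketch.setCodecConfiguration(conf, "org.apache.hadoop.io.compress.SnappyCodec")` on an empty map fills in the five Table 2 properties. Passing `null` sets only the two Table 3 properties and leaves the codec keys untouched.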

Note
setCodecConfiguration is used when CSVFileFormat, JsonFileFormat and TextFileFormat are requested to prepareWrite.