In Hive, a Hadoop CompressionCodec can be used to compress exported data. Some commonly used codecs and how to apply them:
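Snappy: declare the table STORED AS TEXTFILE and, before writing to it, enable output compression and set the codec to org.apache.hadoop.io.compress.SnappyCodec. Compression is a session setting, not a clause of CREATE TABLE. For example:
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
CREATE TABLE example_table (
  id INT,
  name STRING
)
STORED AS TEXTFILE;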
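LZO: declare the table STORED AS TEXTFILE and, before writing to it, enable output compression and set the codec to com.hadoop.compression.lzo.LzopCodec. This class comes from the hadoop-lzo library, which must be installed on the cluster separately. For example:
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;
CREATE TABLE example_table (
  id INT,
  name STRING
)
STORED AS TEXTFILE;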
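Gzip: declare the table STORED AS TEXTFILE and, before writing to it, enable output compression and set the codec to org.apache.hadoop.io.compress.GzipCodec. For example:
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec;
CREATE TABLE example_table (
  id INT,
  name STRING
)
STORED AS TEXTFILE;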
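Bzip2: declare the table STORED AS TEXTFILE and, before writing to it, enable output compression and set the codec to org.apache.hadoop.io.compress.BZip2Codec. For example:
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec;
CREATE TABLE example_table (
  id INT,
  name STRING
)
STORED AS TEXTFILE;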
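As a rough sketch of how the codec choice takes effect when rows are written into a TEXTFILE table, the session below enables output compression, picks Gzip, loads example_table, and lists the resulting files. The source table name source_table and the warehouse path are assumptions for illustration only; the real path depends on hive.metastore.warehouse.dir and the database in use.

```sql
-- Turn on compression for the final output of this session's queries.
SET hive.exec.compress.output=true;
-- Choose the codec; Gzip is used here, any installed codec class works the same way.
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec;

-- source_table is a hypothetical table with matching columns.
INSERT OVERWRITE TABLE example_table
SELECT id, name
FROM source_table;

-- Inspect the files Hive wrote (assumed default warehouse location).
-- With Gzip the data files should carry a .gz extension.
dfs -ls /user/hive/warehouse/example_table;
```

Because the settings are per session, a later session that writes to the same table without them would produce uncompressed files alongside the compressed ones.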
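To export data, use an INSERT OVERWRITE [LOCAL] DIRECTORY statement. It writes the query result to the local file system (with LOCAL) or to HDFS (without it), compressed with the codec configured for the session. For example:
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
INSERT OVERWRITE LOCAL DIRECTORY '/path/to/output/file'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
SELECT id, name
FROM example_table;
This writes the data in example_table to the local directory /path/to/output/file as comma-delimited text files compressed with Snappy. Any existing contents of that directory are overwritten.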
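For an export that targets HDFS rather than the local file system, a minimal sketch is shown here, assuming the same example_table and a hypothetical HDFS target directory /path/to/output/dir. It relies on INSERT OVERWRITE DIRECTORY without the LOCAL keyword, and uses Gzip so the output can be read with common tools.

```sql
-- Compress the final query output with Gzip for this session.
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec;

-- Without LOCAL, the target is a directory on HDFS (hypothetical path).
-- Existing contents of the directory are replaced.
INSERT OVERWRITE DIRECTORY '/path/to/output/dir'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
SELECT id, name
FROM example_table;

-- List the exported files to confirm they were written.
dfs -ls /path/to/output/dir;
```

The target is always a directory holding one or more part files, not a single named file, so downstream jobs should read the whole directory.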