How to Use Hive Complex Data Types

Published: 2021-12-10 09:27:15 | Source: 亿速云 | Views: 221 | Author: 小新 | Category: Big Data

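This article explains how to use Hive's complex data types: array, map, and struct. Each type is shown with a table definition, a sample data file, a load statement, and a query.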

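1. Overview

Hive is commonly used to store offline data in big data systems and serves as the environment for building a data warehouse. In most cases the columns of a data warehouse use primitive types such as int, string, and double, but sometimes complex data structures are needed to store the data, namely array, map, and struct. The three complex types are introduced below: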

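Type	Definition	Description
array	Array<data_type>	All elements of an array have the same type. Indexing is zero-based: if array A holds ['a','b','c'], then A[1] is 'b'.
map	Map<key,value>	Stores data as key:value pairs. A value is accessed with column_name['key'], which returns the value stored under that key.
struct	STRUCT<col_name : data_type [COMMENT col_comment], …>	Fields inside a struct are accessed with dot (.) notation. For example, if column a has type STRUCT{b INT; c INT}, field b is accessed as a.b.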
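The three access forms in the table can be tried without creating any table. The following is a minimal sketch, assuming a Hive version that allows SELECT without a FROM clause (0.13 or later); the aliases arr, m, s, and t are hypothetical names chosen for illustration.

```sql
-- Build one value of each complex type with Hive's constructor functions,
-- then read them back with the access syntax described in the table.
SELECT t.arr[1]  AS second_element,  -- zero-based index, so this is 'b'
       t.m['k1'] AS value_for_k1,    -- map lookup by key
       t.s.b     AS field_b          -- struct field via dot notation
FROM (
  SELECT array('a', 'b', 'c')           AS arr,
         map('k1', 1, 'k2', 2)          AS m,
         named_struct('b', 10, 'c', 20) AS s
) t;
```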

2. Using Array

1) Create a student score table student1 with the fields id, name, and score. The score column has type array<double> and holds the student's scores. The CREATE TABLE statement:

hive> create table student1(id int, name string, score array<double>) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '|';

2) Prepare the data file student1.txt

[root@salver158 ~]# cat student1.txt
100,"student1",80|82|84
101,"student2",70|72|74
102,"student3",60|62|64

3) Load the data

hive> load data local inpath "/root/student1.txt" into table student1;

4) The load succeeds. Query the table to check the result:

hive> select * from student1;
OK
100  "student1"  [80.0,82.0,84.0]
101  "student2"  [70.0,72.0,74.0]
102  "student3"  [60.0,62.0,64.0]
Time taken: 0.612 seconds, Fetched: 3 row(s)
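Beyond select *, an array column is usually read element by element. The queries below are a sketch against the student1 table created above; the aliases first_score, num_scores, single_score, and avg_score are hypothetical names chosen for illustration.

```sql
-- Index into the array (zero-based) and count its elements with size().
SELECT name,
       score[0]    AS first_score,
       size(score) AS num_scores
FROM student1;

-- Flatten the array into one row per score with explode(),
-- then compute the average score per student.
SELECT s.name,
       avg(t.single_score) AS avg_score
FROM student1 s
LATERAL VIEW explode(s.score) t AS single_score
GROUP BY s.name;
```

LATERAL VIEW joins each source row with the rows that explode() generates from its array, which is why name can be selected next to the individual scores.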

3. Using Map

1) Create the table student2 with the fields id, name, and score, where score has type Map<subject, score>. The CREATE TABLE statement:

hive> create table student2(id int, name string, score map<string,double>) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '|' MAP KEYS TERMINATED BY ':';

2) Prepare the data file student2.txt

[root@salver158 ~]# cat student2.txt
100,"student1","yuwen":80|"shuxue":82|"yingyu":84
101,"student2","yuwen":70|"shuxue":72|"yingyu":74
102,"student3","yuwen":60|"shuxue":62|"yingyu":64

3) Load the data

hive> load data local inpath "/root/student2.txt" into table student2;

4) The load succeeds. Query the table to check the result:

hive> select * from student2;
OK
100  "student1"  {"\"yuwen\"":80.0,"\"shuxue\"":82.0,"\"yingyu\"":84.0}
101  "student2"  {"\"yuwen\"":70.0,"\"shuxue\"":72.0,"\"yingyu\"":74.0}
102  "student3"  {"\"yuwen\"":60.0,"\"shuxue\"":62.0,"\"yingyu\"":64.0}
Time taken: 0.124 seconds, Fetched: 3 row(s)
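The output above shows that the double quotes in student2.txt were loaded as part of each key (the key is "yuwen" including the quotes, not yuwen). Any lookup against this data therefore has to include the quotes. The queries below are a sketch against the student2 table; the aliases are hypothetical names chosen for illustration.

```sql
-- Look up one key and list keys and values.
-- The key literal is wrapped in single quotes and contains the
-- double quotes that were loaded from the file.
SELECT name,
       score['"yuwen"']  AS yuwen_score,
       map_keys(score)   AS subjects,
       map_values(score) AS scores,
       size(score)       AS num_subjects
FROM student2;

-- Flatten the map into one row per (subject, score) pair.
SELECT s.name,
       t.subject,
       t.subject_score
FROM student2 s
LATERAL VIEW explode(s.score) t AS subject, subject_score;
```

Looking up a key that is not present, such as score['yuwen'] without the inner quotes, returns NULL rather than raising an error. If the quotes are unwanted, the simplest option is to remove them from the data file before loading.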

4. Using Struct

1) Create the table student3 with the fields id, name, and score, where score has type struct<kecheng:string,score:double>. The CREATE TABLE statement:

hive> create table student3(id int, name string, score struct<kecheng:string,score:double>) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '|';

2) Prepare the data file student3.txt

[root@salver158 ~]# cat student3.txt
100,"student1","yuwen"|80
101,"student2","yuwen"|70
102,"student3","yuwen"|60

3) Load the data

hive> load data local inpath "/root/student3.txt" into table student3;

4) The load succeeds. Query the table to check the result:

hive> select * from student3;
OK
100  "student1"  {"kecheng":"\"yuwen\"","score":80.0}
101  "student2"  {"kecheng":"\"yuwen\"","score":70.0}
102  "student3"  {"kecheng":"\"yuwen\"","score":60.0}
Time taken: 0.091 seconds, Fetched: 3 row(s)
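Struct fields are read with dot notation, both in the select list and in filters. The query below is a sketch against the student3 table; the aliases subject and subject_score and the threshold 70 are chosen only for illustration.

```sql
-- score.kecheng and score.score address the two fields of the struct column.
-- The column and one of its fields share the name "score",
-- so the second expression reads as <column>.<field>.
SELECT name,
       score.kecheng AS subject,
       score.score   AS subject_score
FROM student3
WHERE score.score >= 70;
```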

