这篇文章将为大家详细讲解有关Hive中matadata怎么用,小编觉得挺实用的,因此分享给大家做个参考,希望大家阅读完这篇文章后可以有所收获。
Hive元数据存储在MySQL库里,数据存储HDFS上;查看元数据库存放地址,查看Hive配置文件路径查看:
$HIVE_HOME/conf/hive-site.xml
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive_data?createDatabaseIfNotExist=true&characterEncoding=latin1</value>
</property>
查看MySQL元数据:
mysql> use hive_data
Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive_data |
+---------------------------+
| bucketing_cols |
| cds |
| columns_v2 |
| database_params |
| db_privs |
| dbs |
| func_ru |
| funcs |
| global_privs |
| idxs |
| index_params |
| part_col_privs |
| part_col_stats |
| part_privs |
| partition_key_vals |
| partition_keys |
| partition_params |
| partitions |
| roles |
| sd_params |
| sds |
| sequence_table |
| serde_params |
| serdes |
| skewed_col_names |
| skewed_col_value_loc_map |
| skewed_string_list |
| skewed_string_list_values |
| skewed_values |
| sort_cols |
| tab_col_stats |
| table_params |
| tbl_col_privs |
| tbl_privs |
| tbls |
| version |
+---------------------------+
----------------------------------------------------------------
1. Hive版本version表有且只有一条数据,多一条数据Hive会无法启动
mysql> select * from version;
+--------+----------------+---------------------------------------+
| VER_ID | SCHEMA_VERSION | VERSION_COMMENT |
+--------+----------------+---------------------------------------+
| 1 | 1.1.0 | Set by MetaStore hadoop@192.168.0.129 |
+--------+----------------+---------------------------------------+
2.Hive数据库元数据表【dbs】【database_params】
mysql> select DB_ID,DB_LOCATION_URI,NAME from dbs
+-------+----------------------------------------------------------------+---------------+
| db_id | DB_LOCATION_URI | NAME |
+-------+----------------------------------------------------------------+---------------+
| 1 | hdfs://192.168.0.129:9000/user/hive/warehouse | default |
| 3 | hdfs://192.168.0.129:9000/user/hive/warehouse/hive_data2.db | hive_data2 |
| 6 | hdfs://192.168.0.129:9000/user/hive/warehouse/ruozedata_job.db | ruozedata_job |
+-------+----------------------------------------------------------------+---------------+
DB_ID -- 数据库ID 【tbls】
DB_LOCATION_URI -- HDFD存放路径
NAME -- 数据库名
3.Hive表内容、结构、属性
mysql> select TBL_ID,CREATE_TIME,DB_ID,SD_ID,TBL_NAME,TBL_TYPE from tbls;
+--------+-------------+-------+-------+---------------+----------------+
| TBL_ID | CREATE_TIME | DB_ID | SD_ID | TBL_NAME | TBL_TYPE |
+--------+-------------+-------+-------+---------------+----------------+
| 7 | 1528299941 | 3 | 7 | emp | EXTERNAL_TABLE |
| 10 | 1528311773 | 3 | 10 | emp_bak | MANAGED_TABLE |
| 11 | 1528312267 | 3 | 11 | emp1 | EXTERNAL_TABLE |
| 16 | 1528403085 | 3 | 16 | dual | MANAGED_TABLE |
| 17 | 1528484818 | 3 | 17 | json | MANAGED_TABLE |
| 22 | 1529454293 | 3 | 22 | emp_partition | MANAGED_TABLE |
| 26 | 1529459118 | 3 | 31 | emp_sqoop111 | MANAGED_TABLE |
| 34 | 1529530688 | 6 | 39 | user_click | EXTERNAL_TABLE |
| 38 | 1529537107 | 6 | 44 | product_info | EXTERNAL_TABLE |
| 39 | 1529593387 | 6 | 45 | city_info | MANAGED_TABLE |
| 41 | 1529606647 | 6 | 46 | product_hot | MANAGED_TABLE |
+--------+-------------+-------+-------+---------------+----------------+
TBL_ID -- 表ID 【table_params】【partitions】【partition_keys】
DB_ID -- 库ID
SD_ID -- 序列化ID 【sds】【partitions】
4.Hive文件存储相关元数据:【sds】【serdes】【serde_params】
mysql> select * from sds;
CD_ID --表列ID 【columns_v2】【cds】
SERDE_ID --序列化列ID
【serdes】【serde_params】
mysql> select * from columns_v2;
+-------+---------+--------------+-----------+-------------+
| CD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX |
+-------+---------+--------------+-----------+-------------+
| 7 | NULL | comm | double | 6 |
| 7 | NULL | deptno | int | 7 |
| 7 | NULL | empno | int | 0 |
| 7 | NULL | ename | string | 1 |
| 7 | NULL | hiredate | string | 4 |
| 7 | NULL | job | string | 2 |
| 7 | NULL | mgr | int | 3 |
| 7 | NULL | salary | double | 5 |
| 10 | NULL | comm | double | 6 |
| 10 | NULL | deptno | int | 7 |
| 10 | NULL | empno | int | 0 |
| 10 | NULL | ename | string | 1 |
| 10 | NULL | hiredate | string | 4 |
| 10 | NULL | job | string | 2 |
| 10 | NULL | mgr | int | 3 |
| 10 | NULL | salary | double | 5 |
5.Hive表分区相关的元数据表
mysql> select * from partitions;
+---------+-------------+------------------+-----------------+-------+--------+
| PART_ID | CREATE_TIME | LAST_ACCESS_TIME | PART_NAME | SD_ID | TBL_ID |
+---------+-------------+------------------+-----------------+-------+--------+
| 1 | 1529456274 | 0 | pt=2018-06-19 | 26 | 22 |
| 6 | 1529530704 | 0 | data=2018-06-20 | 40 | 34 |
PART_ID -- 分区ID 【partition_key_vasls】【partition_params】
mysql> select * from partition_key_vals;
+---------+--------------+-------------+
| PART_ID | PART_KEY_VAL | INTEGER_IDX |
+---------+--------------+-------------+
| 1 | 2018-06-19 | 0 |
| 6 | 2018-06-20 | 0 |
+---------+--------------+-------------+
PART_ID -- 分区ID
PART_KEY_VAL -- 分区字段值
mysql> select * from partition_params;
+---------+-----------------------+-------------+
| PART_ID | PARAM_KEY | PARAM_VALUE |
+---------+-----------------------+-------------+
| 1 | COLUMN_STATS_ACCURATE | true |
| 1 | numFiles | 2 |
| 1 | numRows | 0 |
| 1 | rawDataSize | 0 |
| 1 | totalSize | 734 |
| 1 | transient_lastDdlTime | 1529456274 |
| 6 | COLUMN_STATS_ACCURATE | true |
| 6 | numFiles | 1 |
| 6 | numRows | 0 |
| 6 | rawDataSize | 0 |
| 6 | totalSize | 725264 |
| 6 | transient_lastDdlTime | 1529530704 |
PARAM_KEY -- 分区属性
PARAM_VALUE -- 分区属性值
总结:
关于“Hive中matadata怎么用”这篇文章就分享到这里了,希望以上内容可以对大家有一定的帮助,使各位可以学到更多知识,如果觉得文章不错,请把它分享出去让更多的人看到。
亿速云「云服务器」,即开即用、新一代英特尔至强铂金CPU、三副本存储NVMe SSD云盘,价格低至29元/月。点击查看>>
免责声明:本站发布的内容(图片、视频和文字)以原创、转载和分享为主,文章观点不代表本网站立场,如果涉及侵权请联系站长邮箱:is@yisu.com进行举报,并提供相关证据,一经查实,将立刻删除涉嫌侵权内容。
原文链接:http://blog.itpub.net/31441024/viewspace-2168691/