How to integrate Hive with Phoenix

Published: 2021-12-10 09:43:14 | Source: 亿速云 | Views: 483 | Author: 小新 | Category: Cloud Computing

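This article walks through integrating Hive with Phoenix: the prerequisites, creating internal and external tables, inserting and querying data, tuning options, and the errors I ran into along the way.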

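First, Phoenix must already be integrated with HBase.

Hive must also be integrated with HBase; see my earlier notes for that step.

Copy the Phoenix jars (phoenix{core,queryserver,4.8.0-HBase-0.98,hive}) into $hive/lib/.

Edit the configuration files as the official documentation requires.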

> vim conf/hive-env.sh

[Screenshot: contents of conf/hive-env.sh]

> vim conf/hive-site.xml

[Screenshot: contents of conf/hive-site.xml]

Start Hive:

> hive -hiveconf phoenix.zookeeper.quorum=hadoop01:2181

Create an internal table

create table phoenix_table (
s1 string,
i1 int,
f1 float,
d1 double
)
STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
TBLPROPERTIES (
"phoenix.table.name" = "phoenix_table",
"phoenix.zookeeper.quorum" = "hadoop01",
"phoenix.zookeeper.znode.parent" = "/hbase",
"phoenix.zookeeper.client.port" = "2181",
"phoenix.rowkeys" = "s1, i1",
"phoenix.column.mapping" = "s1:s1, i1:i1, f1:f1, d1:d1",
"phoenix.table.options" = "SALT_BUCKETS=10, DATA_BLOCK_ENCODING='DIFF'"
);

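The table is created successfully, and a matching table now appears in both Phoenix and HBase.

[Screenshot: the table listed in Phoenix]

[Screenshot: the table listed in HBase]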

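Properties

  1. phoenix.table.name

    • The table name to use in Phoenix.

    • Default: the same name as the Hive table

  2. phoenix.zookeeper.quorum

    • The ZooKeeper quorum address.

    • Default: localhost

  3. phoenix.zookeeper.znode.parent

    • The parent znode of HBase in ZooKeeper.

    • Default: /hbase

  4. phoenix.zookeeper.client.port

    • The ZooKeeper client port.

    • Default: 2181

  5. phoenix.rowkeys

    • The columns that make up the Phoenix row key, which is also the HBase row key.

    • Required

  6. phoenix.column.mapping

    • The column mapping between Hive and Phoenix.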
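To make the defaults in the property list concrete, here is a minimal Python sketch that assembles a TBLPROPERTIES clause from a Hive table name, its row keys, and its columns. The function name `phoenix_tblproperties` and the `PHOENIX_DEFAULTS` dictionary are hypothetical helpers invented for illustration; they are not part of Hive or Phoenix. The sketch assumes every Hive column maps to a Phoenix column of the same name.

```python
# Hypothetical helper, not part of Hive or Phoenix.
# Defaults taken from the property list; phoenix.rowkeys has no default
# and must always be supplied by the caller.
PHOENIX_DEFAULTS = {
    "phoenix.zookeeper.quorum": "localhost",
    "phoenix.zookeeper.znode.parent": "/hbase",
    "phoenix.zookeeper.client.port": "2181",
}


def phoenix_tblproperties(hive_table, rowkeys, columns, overrides=None):
    """Return a TBLPROPERTIES clause for a Phoenix-backed Hive table.

    hive_table: Hive table name, reused as phoenix.table.name by default.
    rowkeys:    list of column names that form the Phoenix row key.
    columns:    list of all Hive column names, mapped one-to-one to Phoenix.
    overrides:  optional dict of properties that replace the defaults.
    """
    missing = [k for k in rowkeys if k not in columns]
    if missing:
        raise ValueError(f"row key columns not declared: {missing}")

    props = {"phoenix.table.name": hive_table}
    props.update(PHOENIX_DEFAULTS)
    props["phoenix.rowkeys"] = ", ".join(rowkeys)
    # Assumption: identical column names on both sides of the mapping.
    props["phoenix.column.mapping"] = ", ".join(f"{c}:{c}" for c in columns)
    props.update(overrides or {})

    body = ",\n".join(f'  "{k}" = "{v}"' for k, v in props.items())
    return f"TBLPROPERTIES (\n{body}\n)"


if __name__ == "__main__":
    print(phoenix_tblproperties(
        "phoenix_table",
        ["s1", "i1"],
        ["s1", "i1", "f1", "d1"],
        {"phoenix.zookeeper.quorum": "hadoop01"},
    ))
```

Only the ZooKeeper quorum is overridden in the usage example, because the other connection properties already match the defaults listed above.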

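Insert data

Load data from the Hive test table pokes:

> insert into table phoenix_table select bar,foo,12.3 as fl,22.2 as dl from pokes;

The insert succeeds. Query the table in Hive:

[Screenshot: query result in Hive]

Query it in Phoenix:

[Screenshot: query result in Phoenix]

Data can also be loaded from the Phoenix side; see the statement given in the official documentation.

Note: Phoenix 4.8 treats the table keyword in that statement as a syntax error. I have not tried other versions, and the official documentation does not mention this.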

Create an external table

For external tables, Hive works with an existing Phoenix table and manages only the Hive metadata. Deleting an external table from Hive deletes only the Hive metadata and keeps the Phoenix table.

First, create the table in Phoenix:

phoenix> create table PHOENIX_TABLE_EXT(aa varchar not null primary key,bb varchar);

Then create the external table in Hive:

create external table phoenix_table_ext_1 (
aa string,
bb string
)
STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
TBLPROPERTIES (
"phoenix.table.name" = "phoenix_table_ext",
"phoenix.zookeeper.quorum" = "hadoop01",
"phoenix.zookeeper.znode.parent" = "/hbase",
"phoenix.zookeeper.client.port" = "2181",
"phoenix.rowkeys" = "aa",
"phoenix.column.mapping" = "aa:aa, bb:bb"
);

Both the table creation and a test insert succeed.

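Performance tuning

These options can be set in the Hive CLI.

| Parameter | Default | Description |
| --- | --- | --- |
| phoenix.upsert.batch.size | 1000 | Batch size for upserts. |
| [phoenix-table-name].disable.wal | false | Temporarily sets the table attribute DISABLE_WAL = true. Can be used to improve performance. |
| [phoenix-table-name].auto.flush | false | When the WAL is disabled and this option is true, the MemStore is flushed to an HFile. |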

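Query data

HiveQL can be used to query data in Phoenix tables. With hive.fetch.task.conversion=more and hive.exec.parallel=true, a simple single-table query can be as fast as in the Phoenix CLI.

| Parameter | Default | Description |
| --- | --- | --- |
| hbase.scan.cache | 100 | Number of rows read per unit request. |
| hbase.scan.cacheblock | false | Whether to cache blocks. |
| split.by.stats | false | If true, mappers use table statistics, with one mapper per guide post. |
| [hive-table-name].reducer.count | 1 | Number of reducers. In Tez mode, this affects only single-table queries. See Limitations in the official documentation. |
| [phoenix-table-name].query.hint | (none) | Hint for the Phoenix query (for example, NO_INDEX). |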

Problems encountered:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.hbase.client.Scan.isReversed()Z

I started with hbase-0.96.2-hadoop2, which cannot be integrated: the storage handler needs hbase-client-0.98.21-hadoop2.jar. Swapping in that jar fixed the error above, but the following error still appeared:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:ERROR 103 (08004): Unable to establish connection.

Switching the whole HBase installation to version 0.98.21 resolved it.

---------

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.StringIndexOutOfBoundsException: String index out of range: -1

This happens because the Hive and Phoenix column names do not match, as in this statement:

create table phoenix_table_3 (a string,b int) STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler' TBLPROPERTIES ("phoenix.table.name" = "phoenix_table_3","phoenix.zookeeper.quorum" = "hadoop01","phoenix.zookeeper.znode.parent" = "/hbase","phoenix.zookeeper.client.port" = "2181","phoenix.rowkeys" = "a1","phoenix.column.mapping" = "a:a1, b:b1","phoenix.table.options" = "SALT_BUCKETS=10, DATA_BLOCK_ENCODING='DIFF'");

Using the same column names in Hive as in Phoenix fixes it.

----------

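In this case table creation and inserts both succeed, but a query from Hive fails because column a1 cannot be found; the Phoenix column is named aa.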

Failed with exception java.io.IOException:java.lang.RuntimeException: org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): Undefined column. columnName=A1

create external table phoenix_table_ext (a1 string,b1 string)STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler' TBLPROPERTIES ("phoenix.table.name" = "phoenix_table_ext","phoenix.zookeeper.quorum" = "hadoop01","phoenix.zookeeper.znode.parent" = "/hbase","phoenix.zookeeper.client.port" = "2181","phoenix.rowkeys" = "aa","phoenix.column.mapping" = "a1:aa, b1:bb");

Solution: same as above. Use the same column names in Hive as in Phoenix.
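Both of the last two failures come down to names that disagree between the Hive columns, phoenix.rowkeys, and phoenix.column.mapping. The following Python sketch is a pre-flight check that flags such disagreements before the DDL is run. It is a hypothetical helper (`check_phoenix_mapping` is not a Hive or Phoenix API), and it deliberately encodes the workaround described above, identical names on both sides, which is stricter than what the storage handler nominally allows.

```python
def check_phoenix_mapping(hive_columns, rowkeys, column_mapping):
    """Return a list of problems found in a Phoenix storage handler config.

    hive_columns:   column names declared in the Hive CREATE TABLE.
    rowkeys:        value of "phoenix.rowkeys", e.g. "s1, i1".
    column_mapping: value of "phoenix.column.mapping", e.g. "s1:s1, i1:i1".
    """
    problems = []
    mapping = {}
    for pair in column_mapping.split(","):
        hive_col, sep, phoenix_col = pair.strip().partition(":")
        if not sep:
            problems.append(f"malformed mapping entry: {pair.strip()!r}")
            continue
        mapping[hive_col.strip()] = phoenix_col.strip()

    for hive_col, phoenix_col in mapping.items():
        if hive_col not in hive_columns:
            problems.append(f"mapping uses undeclared Hive column {hive_col!r}")
        # Assumption from this article: names must be identical on both sides.
        if hive_col != phoenix_col:
            problems.append(
                f"{hive_col!r} is mapped to a different name {phoenix_col!r}"
            )

    for key in (k.strip() for k in rowkeys.split(",")):
        if key not in hive_columns:
            problems.append(f"row key {key!r} is not a declared Hive column")
    return problems


if __name__ == "__main__":
    # The failing phoenix_table_3 definition from this section.
    for problem in check_phoenix_mapping(["a", "b"], "a1", "a:a1, b:b1"):
        print(problem)
```

An empty result means the three settings agree with each other. It does not confirm that the Phoenix table itself exists or has those columns.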
