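This article walks through deploying Spark SQL on top of an existing Hive metastore and running a few simple queries, first from spark-shell and then from the spark-sql CLI. The environment used is: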
- JDK: 1.8.0_45 (64-bit)
- Hadoop: hadoop-2.6.0-cdh6.7.0
- Scala: 2.11.8
- Spark: spark-2.3.1-bin-2.6.0-cdh6.7.0 (you need to compile this build yourself)
- Hive: hive-1.1.0-cdh6.7.0
- MySQL: 5.6
# The Hive metadata is stored in MySQL, so start MySQL first
[root@hadoop001 ~]# su mysqladmin
[mysqladmin@hadoop001 root]$ cd ~
[mysqladmin@hadoop001 ~]$ service mysql start
Starting MySQL [ OK ]
# Start HDFS
[hadoop@hadoop001 sbin]$ ./start-dfs.sh
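Before moving on, it can help to confirm that the HDFS daemons actually came up. The sketch below is only an illustration: `missing_hdfs_daemons` is a hypothetical helper, not a Hadoop command, and it assumes a single-node setup where `start-dfs.sh` launches the NameNode, DataNode, and SecondaryNameNode on the same host. It reads the output of the JDK's `jps` tool from standard input and prints any of those daemons that are absent.

```shell
# Hypothetical helper (not part of Hadoop): read `jps` output on stdin and
# print the HDFS daemons that do not appear in it.
# Assumption: all three daemons are expected on this one host.
missing_hdfs_daemons() {
    local listing
    local daemon
    listing="$(cat)"
    for daemon in NameNode DataNode SecondaryNameNode; do
        # jps prints "<pid> <main class name>", one JVM per line.
        # The anchors keep "NameNode" from matching "SecondaryNameNode".
        if ! printf '%s\n' "$listing" | grep -qE "^[0-9]+ $daemon\$"; then
            echo "$daemon"
        fi
    done
}

# Usage: jps | missing_hdfs_daemons
```

If the function prints nothing, all three daemons were found in the listing.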
# Give Spark SQL access to the Hive metastore by copying Hive's hive-site.xml into Spark's conf directory
[hadoop@hadoop001 ~]$ cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf/
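A quick sanity check after the copy can save a confusing startup failure later. The following is a minimal sketch under one assumption: that the metastore connection is configured directly in hive-site.xml through the standard `javax.jdo.option.*` properties (as is typical for a MySQL-backed metastore), with each `<name>` element written on one line without padding. `check_hive_site` is a hypothetical helper name, not something Spark or Hive ships.

```shell
# Hypothetical helper (not part of Spark or Hive): report whether the
# hive-site.xml in a conf directory carries the JDBC metastore settings.
check_hive_site() {
    local conf_dir="$1"
    local site="$conf_dir/hive-site.xml"
    local missing=0
    local prop

    if [ ! -f "$site" ]; then
        echo "missing: $site"
        return 1
    fi

    # Standard Hive property names for a JDBC-backed metastore.
    for prop in javax.jdo.option.ConnectionURL \
                javax.jdo.option.ConnectionDriverName \
                javax.jdo.option.ConnectionUserName \
                javax.jdo.option.ConnectionPassword; do
        # -F treats the pattern as a fixed string, so the dots are literal.
        if ! grep -qF "<name>$prop</name>" "$site"; then
            echo "not set: $prop"
            missing=1
        fi
    done

    if [ "$missing" -eq 0 ]; then
        echo "ok: $site"
    fi
    return "$missing"
}

# Usage: check_hive_site "$SPARK_HOME/conf"
```

The check only looks for the property names; it does not validate the JDBC URL or the credentials themselves.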
# Option 1: start with spark-shell
[hadoop@hadoop001 bin]$ ./spark-shell --master local[2] \
--jars ~/software/mysql-connector-java-5.1.34-bin.jar
scala> spark.sql("use hive_data2").show(false)
scala> spark.sql("select * from emp").show(false)
+-----+------+---------+----+----------+-------+------+------+
|empno|ename |job      |mgr |hiredate  |salary |comm  |deptno|
+-----+------+---------+----+----------+-------+------+------+
|7369 |SMITH |CLERK    |7902|1980-12-17|800.0  |null  |20    |
|7499 |ALLEN |SALESMAN |7698|1981-2-20 |1600.0 |300.0 |30    |
|7521 |WARD  |SALESMAN |7698|1981-2-22 |1250.0 |500.0 |30    |
|7566 |JONES |MANAGER  |7839|1981-4-2  |2975.0 |null  |20    |
|7654 |MARTIN|SALESMAN |7698|1981-9-28 |1250.0 |1400.0|30    |
|7698 |BLAKE |MANAGER  |7839|1981-5-1  |2850.0 |null  |30    |
|7782 |CLARK |MANAGER  |7839|1981-6-9  |2450.0 |null  |10    |
|7788 |SCOTT |ANALYST  |7566|1987-4-19 |3000.0 |null  |20    |
|7839 |KING  |PRESIDENT|null|1981-11-17|5000.0 |null  |10    |
|7844 |TURNER|SALESMAN |7698|1981-9-8  |1500.0 |0.0   |30    |
|7876 |ADAMS |CLERK    |7788|1987-5-23 |1100.0 |null  |20    |
|7900 |JAMES |CLERK    |7698|1981-12-3 |950.0  |null  |30    |
|7902 |FORD  |ANALYST  |7566|1981-12-3 |3000.0 |null  |20    |
|7934 |MILLER|CLERK    |7782|1982-1-23 |1300.0 |null  |10    |
|8888 |HIVE  |PROGRAM  |7839|1988-1-23 |10300.0|null  |null  |
+-----+------+---------+----+----------+-------+------+------+
# Option 2: start with the spark-sql CLI
[hadoop@hadoop001 bin]$ ./spark-sql --master local[2] \
--driver-class-path ~/software/mysql-connector-java-5.1.34-bin.jar
# Switch to the database
spark-sql> use hive_data2;
18/08/30 20:36:52 INFO HiveMetaStore: 0: get_database: hive_data2
18/08/30 20:36:52 INFO audit: ugi=hadoop ip=unknown-ip-addr cmd=get_database: hive_data2
Time taken: 0.114 seconds
# Query the data
spark-sql> select * from emp;
18/08/30 20:37:05 INFO DAGScheduler: Job 0 finished: processCmd at CliDriver.java:376, took 1.292944 s
7369  SMITH   CLERK      7902  1980-12-17  800.0    NULL    20
7499  ALLEN   SALESMAN   7698  1981-2-20   1600.0   300.0   30
7521  WARD    SALESMAN   7698  1981-2-22   1250.0   500.0   30
7566  JONES   MANAGER    7839  1981-4-2    2975.0   NULL    20
7654  MARTIN  SALESMAN   7698  1981-9-28   1250.0   1400.0  30
7698  BLAKE   MANAGER    7839  1981-5-1    2850.0   NULL    30
7782  CLARK   MANAGER    7839  1981-6-9    2450.0   NULL    10
7788  SCOTT   ANALYST    7566  1987-4-19   3000.0   NULL    20
7839  KING    PRESIDENT  NULL  1981-11-17  5000.0   NULL    10
7844  TURNER  SALESMAN   7698  1981-9-8    1500.0   0.0     30
7876  ADAMS   CLERK      7788  1987-5-23   1100.0   NULL    20
7900  JAMES   CLERK      7698  1981-12-3   950.0    NULL    30
7902  FORD    ANALYST    7566  1981-12-3   3000.0   NULL    20
7934  MILLER  CLERK      7782  1982-1-23   1300.0   NULL    10
8888  HIVE    PROGRAM    7839  1988-1-23   10300.0  NULL    NULL
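For scripted use, the same query can be run without opening the interactive prompt, since the spark-sql CLI accepts a statement through its `-e` option. The wrapper below is a rough sketch rather than a recommended setup: `run_spark_sql`, `SPARK_SQL_BIN`, and `MYSQL_JAR` are hypothetical names introduced here for illustration, and the default paths simply mirror the ones used in the session above.

```shell
# Hypothetical wrapper: run one statement through the spark-sql CLI without
# entering the interactive prompt.
# SPARK_SQL_BIN and MYSQL_JAR are illustrative variables, not Spark settings.
run_spark_sql() {
    local query="$1"
    local bin="${SPARK_SQL_BIN:-${SPARK_HOME:-}/bin/spark-sql}"
    local jar="${MYSQL_JAR:-$HOME/software/mysql-connector-java-5.1.34-bin.jar}"

    # Same flags as the interactive session, plus -e to pass the statement.
    "$bin" --master 'local[2]' \
        --driver-class-path "$jar" \
        -e "$query"
}

# Usage: run_spark_sql "select * from hive_data2.emp"
```

Qualifying the table as `hive_data2.emp` avoids the separate `use hive_data2;` step, which keeps the call to a single statement.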