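Data collection is a key stage in the big-data analysis pipeline. Typical data collection tools include ETL tools, log collection tools, and data migration tools.
Flume is a distributed, highly available, and highly reliable system for collecting, aggregating, and transporting large volumes of log data.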
1. Install Flume
Download: http://www.apache.org/dist/flume/
hadoop@dblab:/usr/local$ sudo wget http://www.apache.org/dist/flume/1.7.0/apache-flume-1.7.0-bin.tar.gz
hadoop@dblab:/usr/local$ sudo tar -zxvf apache-flume-1.7.0-bin.tar.gz
hadoop@dblab:/usr/local$ sudo mv apache-flume-1.7.0-bin ./flume
2. Configure environment variables
hadoop@dblab:/usr/local$ vim ~/.bashrc
export FLUME_HOME=/usr/local/flume
export FLUME_CONF_DIR=$FLUME_HOME/conf
export JAVA_HOME=/usr/lib/jvm/default-java
export PATH=$PATH:$FLUME_HOME/bin
hadoop@dblab:/usr/local$ source ~/.bashrc
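After reloading ~/.bashrc, it can be worth confirming that the variables actually took effect in the current shell. The following is a minimal sketch of such a check. `check_flume_env` is a hypothetical helper name introduced here for illustration, not something shipped with Flume, and it assumes the launcher scripts live under `$FLUME_HOME/bin` as in the layout used above.

```shell
# check_flume_env: hypothetical helper (not part of Flume).
# Succeeds only if FLUME_HOME is set and $FLUME_HOME/bin is on PATH.
check_flume_env() {
  local home="${FLUME_HOME:-}"
  if [ -z "$home" ]; then
    echo "FLUME_HOME is not set"
    return 1
  fi
  # Wrap PATH in colons so the first and last entries match too.
  case ":$PATH:" in
    *":$home/bin:"*)
      echo "PATH contains $home/bin"
      ;;
    *)
      echo "PATH is missing $home/bin"
      return 1
      ;;
  esac
}

# Report the state of the current shell without aborting on failure.
check_flume_env || true
```

If the check reports a missing entry, the `export PATH=...` line in ~/.bashrc is the first place to look.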
hadoop@dblab:/usr/local/flume/conf$ mv flume-env.sh.template flume-env.sh
hadoop@dblab:/usr/local/flume/conf$ sudo vim flume-env.sh
# Add the following line at the top of flume-env.sh:
export JAVA_HOME=/usr/lib/jvm/default-java
3. Start Flume
hadoop@dblab:/usr/local/flume$ cd /usr/local/flume
hadoop@dblab:/usr/local/flume$ ./bin/flume-ng version
Error: Could not find or load main class org.apache.flume.tools.GetJavaProperty
Flume 1.7.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 511d868555dd4d16e6ce4fedc72c2d1454546707
Compiled by bessbd on Wed Oct 12 20:51:10 CEST 2016
From source with checksum 0d21b3ffdc55a07e1d08875872c00523
hadoop@dblab:/usr/local/flume$ cd /usr/local/hbase/conf
hadoop@dblab:/usr/local/hbase/conf$ sudo vim hbase-env.sh
#export HBASE_CLASSPATH=/usr/local/hadoop/conf    # Comment out this line to resolve the error above
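The same change can be made without opening an editor. The sketch below shows one way to do it with `sed`. To keep it safe to run anywhere, it works on a scratch copy whose two lines are made up for the demonstration; the real file in this walkthrough is /usr/local/hbase/conf/hbase-env.sh, and the variable names `demo_dir` and `env_file` are assumptions of this example.

```shell
# Build a scratch copy of hbase-env.sh (contents are illustrative only).
demo_dir=$(mktemp -d)
env_file="$demo_dir/hbase-env.sh"
printf '%s\n' \
  'export JAVA_HOME=/usr/lib/jvm/default-java' \
  'export HBASE_CLASSPATH=/usr/local/hadoop/conf' > "$env_file"

# Prefix the active HBASE_CLASSPATH export with '#'.
# Lines that are already commented out do not match the '^export' anchor.
# -i.bak edits in place and keeps the original as hbase-env.sh.bak.
sed -i.bak 's|^export HBASE_CLASSPATH=|#&|' "$env_file"

cat "$env_file"
```

To apply this to the real file, point `env_file` at it and run the `sed` command with `sudo`; the `.bak` copy makes the change easy to revert.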
hadoop@dblab:/usr/local/flume$ ./bin/flume-ng version
Flume 1.7.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 511d868555dd4d16e6ce4fedc72c2d1454546707
Compiled by bessbd on Wed Oct 12 20:51:10 CEST 2016
From source with checksum 0d21b3ffdc55a07e1d08875872c00523
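The version check only confirms that the launcher works. To see Flume process events, an agent has to be started with a configuration file. The sketch below writes a minimal single-node configuration modeled on the introductory example in the Flume User Guide (a netcat source, a memory channel, and a logger sink) and prints the command that would start it. The agent name `a1`, port 44444, and the file location are assumptions chosen for illustration, and the launch command is printed rather than executed so the script can run on a machine without Flume.

```shell
# Location of the example config (an assumption of this sketch).
conf_file="${TMPDIR:-/tmp}/flume-example.conf"

cat > "$conf_file" <<'EOF'
# Name the components of agent a1
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Source: listen for lines of text on a local TCP port
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Sink: write each event to the Flume log
a1.sinks.k1.type = logger

# Channel: buffer events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Wire the source and the sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

echo "Wrote $conf_file"

# Print the launch command instead of running it.
echo "cd /usr/local/flume && ./bin/flume-ng agent --conf ./conf --conf-file $conf_file --name a1 -Dflume.root.logger=INFO,console"
```

With the agent running, text sent to localhost on port 44444 from a second terminal (for example with telnet) should show up as events in the agent's console output.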