This article walks through the process of configuring LZO compression on hadoop-2.6.2. It is a setup many people run into trouble with in practice, so the steps below cover it from dependencies through a test job.
The cluster has three hosts: bi10, bi12, and bi13. All of the following operations are performed on bi10.
Installing LZO requires a few dependency packages; if you have already installed them, you can skip this step. First, switch to the root user:
yum install gcc gcc-c++ kernel-devel
yum install git
Besides the two packages above, you also need a Maven environment. Download it, extract the archive, and configure the environment variables; it is then ready to use.
wget http://apache.fayea.com/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
tar -xzf apache-maven-3.3.9-bin.tar.gz
Configure the Maven environment variables; here the Maven package has been placed in /home/hadoop/work/apache-maven-3.3.9.
[hadoop@bi10 hadoop-2.6.2]$ vim ~/.bash_profile
#init maven environment
export MAVEN_HOME=/home/hadoop/work/apache-maven-3.3.9
export PATH=$PATH:$MAVEN_HOME/bin
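To make the new variables take effect in the current shell and confirm that Maven is on the PATH, a quick check (assuming the paths above) looks like this:

source ~/.bash_profile
mvn -version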
Download the LZO source package:
[hadoop@bi10 apps]$ wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.09.tar.gz
Extract, compile, and install LZO to /usr/local/hadoop/lzo/; switch to the root user for the installation:
[hadoop@bi10 apps]$ tar -xzf lzo-2.09.tar.gz
[hadoop@bi10 apps]$ cd lzo-2.09
[hadoop@bi10 lzo-2.09]$ su root
[root@bi10 lzo-2.09]$ ./configure --enable-shared --prefix=/usr/local/hadoop/lzo/
[root@bi10 lzo-2.09]$ make && make test && make install
Check the installation directory:
[hadoop@bi10 lzo-2.09]$ ls /usr/local/hadoop/lzo/
include lib share
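If you want to make sure the shared library itself was installed (hadoop-lzo links against it in a later step), you can also look inside the lib directory; with a default build of lzo-2.09 you would expect to see liblzo2.so and its versioned files there:

ls /usr/local/hadoop/lzo/lib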
Download hadoop-lzo:
git clone https://github.com/twitter/hadoop-lzo.git
Set the environment variables and build with Maven:
[hadoop@bi10 hadoop-lzo]$ export CFLAGS=-m64
[hadoop@bi10 hadoop-lzo]$ export CXXFLAGS=-m64
[hadoop@bi10 hadoop-lzo]$ export C_INCLUDE_PATH=/usr/local/hadoop/lzo/include
[hadoop@bi10 hadoop-lzo]$ export LIBRARY_PATH=/usr/local/hadoop/lzo/lib
[hadoop@bi10 hadoop-lzo]$ mvn clean package -Dmaven.test.skip=true
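Before copying anything, it is worth checking that the build produced both the jar and the native library. Based on the paths used in the copy step below, a quick look would be:

ls target/hadoop-lzo-0.4.20-SNAPSHOT.jar
ls target/native/Linux-amd64-64/lib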
Copy the built files into the Hadoop installation directory:
[hadoop@bi10 hadoop-lzo]$ tar -cBf - -C target/native/Linux-amd64-64/lib . | tar -xBvf - -C $HADOOP_HOME/lib/native/
[hadoop@bi10 hadoop-lzo]$ cp target/hadoop-lzo-0.4.20-SNAPSHOT.jar $HADOOP_HOME/share/hadoop/common/
[hadoop@bi10 hadoop-lzo]$ scp target/hadoop-lzo-0.4.20-SNAPSHOT.jar bi12:$HADOOP_HOME/share/hadoop/common/
[hadoop@bi10 hadoop-lzo]$ scp target/hadoop-lzo-0.4.20-SNAPSHOT.jar bi13:$HADOOP_HOME/share/hadoop/common/
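The native part of hadoop-lzo is the libgplcompression library; to confirm it landed next to Hadoop's own native libraries, you can check (assuming the copy above succeeded):

ls $HADOOP_HOME/lib/native/ | grep gplcompression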
Copy the built files to the corresponding directories on the other machines in the cluster. The native directory needs to be packed into an archive first, copied to the other machines, and then extracted there (the extraction step is sketched after the commands below).
tar -czf hadoop-native.tar.gz $HADOOP_HOME/lib/native/
scp hadoop-native.tar.gz bi12:$HADOOP_HOME/lib
scp hadoop-native.tar.gz bi13:$HADOOP_HOME/lib
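The extraction on bi12 and bi13 is not shown above; since tar stores the archived path with the leading slash stripped, one way to unpack the archive back into place from bi10 is the following sketch (it assumes the same $HADOOP_HOME layout on every node and passwordless ssh):

ssh bi12 "tar -xzf $HADOOP_HOME/lib/hadoop-native.tar.gz -C /"
ssh bi13 "tar -xzf $HADOOP_HOME/lib/hadoop-native.tar.gz -C /"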
Edit hadoop-env.sh and add the following:
# The lzo library
export LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib
Edit core-site.xml:
<property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
<property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
Edit mapred-site.xml:
<!-- lzo compression -->
<property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
</property>
<property>
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
    <name>mapred.child.env</name>
    <value>LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib</value>
</property>
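Note that the mapred.* names above are the old-style property names; they still work on Hadoop 2.6.2 but are reported as deprecated. If you prefer the newer names, the equivalent map-output compression settings would look roughly like this (a sketch; check your distribution's deprecation table):

<property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
</property>
<property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>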
Copy the three configuration files to the other machines in the cluster:
scp etc/hadoop/hadoop-env.sh bi12:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/hadoop-env.sh bi13:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/core-site.xml bi12:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/core-site.xml bi13:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/mapred-site.xml bi12:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/mapred-site.xml bi13:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
Install lzop; switch to the root user for this:
yum install lzop
Go into the Hadoop installation directory and compress LICENSE.txt with lzop; this generates an LZO-compressed file named LICENSE.txt.lzo:
lzop LICENSE.txt
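Unlike gzip, lzop keeps the original file by default, so after this step both the plain and the compressed file should be present:

ls LICENSE.txt LICENSE.txt.lzo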
Upload the compressed file to HDFS:
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -mkdir /user/hadoop/wordcount/lzoinput
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -put LICENSE.txt.lzo /user/hadoop/wordcount/lzoinput
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -ls /user/hadoop/wordcount/lzoinput
Found 1 items
-rw-r--r--   2 hadoop supergroup       7773 2016-02-16 20:59 /user/hadoop/wordcount/lzoinput/LICENSE.txt.lzo
Build an index for the LZO-compressed file:
hadoop jar ./share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar com.hadoop.compression.lzo.DistributedLzoIndexer /user/hadoop/wordcount/lzoinput/
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -ls /user/hadoop/wordcount/lzoinput/
Found 2 items
-rw-r--r--   2 hadoop supergroup       7773 2016-02-16 20:59 /user/hadoop/wordcount/lzoinput/LICENSE.txt.lzo
-rw-r--r--   2 hadoop supergroup          8 2016-02-16 21:02 /user/hadoop/wordcount/lzoinput/LICENSE.txt.lzo.index
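DistributedLzoIndexer builds the index with a MapReduce job. The same jar also contains a single-process indexer, com.hadoop.compression.lzo.LzoIndexer, which can be handier for one small file; a possible invocation with the same jar and path would be:

hadoop jar ./share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar com.hadoop.compression.lzo.LzoIndexer /user/hadoop/wordcount/lzoinput/LICENSE.txt.lzo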
Run wordcount on the LZO-compressed file:
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount /user/hadoop/wordcount/lzoinput/ /user/hadoop/wordcount/output2
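When the job finishes you can inspect the result; assuming the default single reducer, the counts end up in a part-r-00000 file:

hdfs dfs -ls /user/hadoop/wordcount/output2
hdfs dfs -cat /user/hadoop/wordcount/output2/part-r-00000 | head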
That wraps up the configuration of LZO for hadoop-2.6.2. Thanks for reading.