This article explains how a Hadoop MapReduce program can read its command-line arguments, using the `Tool`/`ToolRunner` pattern. The WordCount example below takes its input and output paths from the command line; walk through it carefully and the approach should be easy to reuse.
package cmd;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import mapreduce.MyMapper;
import mapreduce.MyReducer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
/**
 * Word count
 * @author Xr
 */
public class WordCountApp extends Configured implements Tool{
public static String INPUT_PATH = "";
public static String OUTPUT_PATH = "";
@Override
public int run(String[] args) throws Exception {
    INPUT_PATH = args[0];
    OUTPUT_PATH = args[1];
    // Use the Configuration injected by ToolRunner so that generic
    // options such as -D key=value are picked up automatically
    Configuration conf = getConf();
    // Delete the output directory if it already exists
    existsFile(conf);
    Job job = new Job(conf, WordCountApp.class.getName());
    // Needed so the job jar can be located when the job is submitted
    job.setJarByClass(WordCountApp.class);
    // 1.1 Where to read the input data from
    FileInputFormat.setInputPaths(job, INPUT_PATH);
    // Parse each line of the input text into a key/value pair
    job.setInputFormatClass(TextInputFormat.class);
    // 1.2 Set the custom map function
    job.setMapperClass(MyMapper.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(LongWritable.class);
    // 1.3 Partitioning
    job.setPartitionerClass(HashPartitioner.class);
    job.setNumReduceTasks(1);
    // 1.4 TODO sorting and grouping
    // 1.5 TODO combiner
    // 2.1 Shuffle is handled by the framework; no manual intervention needed
    // 2.2 Set the custom reduce function
    job.setReducerClass(MyReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    // 2.3 Write the results to HDFS
    FileOutputFormat.setOutputPath(job, new Path(OUTPUT_PATH));
    // Output format class
    job.setOutputFormatClass(TextOutputFormat.class);
    // Submit to the JobTracker and wait for completion
    return job.waitForCompletion(true) ? 0 : 1;
}
public static void main(String[] args) throws Exception {
    // ToolRunner parses generic options, then calls run() with the rest
    int exitCode = ToolRunner.run(new WordCountApp(), args);
    System.exit(exitCode);
}
// Delete the output directory if it already exists, so the job does not fail
private static void existsFile(Configuration conf) throws IOException,
        URISyntaxException {
    FileSystem fs = FileSystem.get(new URI(OUTPUT_PATH), conf);
    if (fs.exists(new Path(OUTPUT_PATH))) {
        fs.delete(new Path(OUTPUT_PATH), true);
    }
}
}
Run: hadoop jar WordCount.jar hdfs://hadoop:9000/hello hdfs://hadoop:9000/h2
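What makes the two positional arguments above arrive cleanly in `run(String[] args)` is that `ToolRunner` first passes the command line through `GenericOptionsParser`, which consumes generic options such as `-D key=value` into the `Configuration` and hands only the remaining arguments to `run()`. Below is a minimal, self-contained sketch of that splitting step, simplified to handle only `-D`; the class `ArgSplitDemo` and its members are illustrative names, not Hadoop API:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified sketch of how "-D key=value" options are separated from positional args. */
public class ArgSplitDemo {
    /** Collected -D properties, analogous to values merged into the job Configuration. */
    public static Map<String, String> props = new LinkedHashMap<>();

    /** Returns the arguments left over after -D options are consumed. */
    public static String[] getRemainingArgs(String[] args) {
        props.clear();
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-D".equals(args[i]) && i + 1 < args.length) {
                // Split "key=value" on the first '=' only
                String[] kv = args[++i].split("=", 2);
                props.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else {
                remaining.add(args[i]);
            }
        }
        return remaining.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] left = getRemainingArgs(new String[]{
                "-D", "mapred.reduce.tasks=2",
                "hdfs://hadoop:9000/hello", "hdfs://hadoop:9000/h2"});
        // left[0] and left[1] are what run(String[] args) would receive
        System.out.println(left[0] + " " + left[1]);
        System.out.println(props.get("mapred.reduce.tasks"));
    }
}
```

With this behavior, running `hadoop jar WordCount.jar -D mapred.reduce.tasks=2 hdfs://hadoop:9000/hello hdfs://hadoop:9000/h2` still leaves `args[0]` and `args[1]` as the input and output paths inside `run()`.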
Name : Xr
Date : 2014-03-02 21:47
That concludes this look at reading command-line parameters in Hadoop MapReduce. Thanks for reading.
Original article: https://my.oschina.net/Xiao629/blog/204439