This article explains how to resolve an error that can appear when testing kmeans in Mahout. The cause is easy to overlook, so the failing command, the full error output, and the fix are recorded below.
While testing the kmeans example from Mahout in Action, the following command was entered:
bin/mahout kmeans -i reuters-vectors/tfidf-vectors/ \
-c reuters-initial-clusters \
-o reuters-kmeans-clusters \
-dm org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure \
-cd 1.0 -k 20 -x 20 -cl
It produced the following error:
mahout kmeans -i tfidf-vectors/ -c reuters-initial-clusters o reuters-kmeans-clusters dm org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure cd 1.0 -k 20 -x 20 -cl
Running on hadoop, using HADOOP_HOME=/usr/local/hadoop
HADOOP_CONF_DIR=/usr/local/hadoop/conf
14/01/23 12:43:34 ERROR common.AbstractJob: Unexpected o while processing Job-Specific Options:
usage: <command> [Generic Options] [Job-Specific Options]
Generic Options:
-archives <paths> comma separated archives to be unarchived
on the compute machines.
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-files <paths> comma separated files to be copied to the
map reduce cluster
-fs <local|namenode:port> specify a namenode
-jt <local|jobtracker:port> specify a job tracker
-libjars <paths> comma separated jar files to include in the
classpath.
Job-Specific Options:
--input (-i) input Path to job input directory.
--output (-o) output The directory pathname for
output.
--distanceMeasure (-dm) distanceMeasure The classname of the
DistanceMeasure. Default is
SquaredEuclidean
--clusters (-c) clusters The input centroids, as Vectors.
Must be a SequenceFile of
Writable, Cluster/Canopy. If k
is also specified, then a random
set of vectors will be selected
and written out to this path
first
--numClusters (-k) k The k in k-Means. If specified,
then a random selection of k
Vectors will be chosen as the
Centroid and written to the
clusters input path.
--convergenceDelta (-cd) convergenceDelta The convergence delta value.
Default is 0.5
--maxIter (-x) maxIter The maximum number of
iterations.
--overwrite (-ow) If present, overwrite the output
directory before running job
--clustering (-cl) If present, run clustering after
the iterations have taken place
--method (-xm) method The execution method to use:
sequential or mapreduce. Default
is mapreduce
--help (-h) Print out help
--tempDir tempDir Intermediate output directory
--startPhase startPhase First phase to run
--endPhase endPhase Last phase to run
14/01/23 12:43:35 INFO driver.MahoutDriver: Program took 149 ms
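Solution: in the command that was actually executed (the first line of the output above), the o option is missing its leading hyphen, and so are dm and cd. Mahout stops at the first token it cannot parse, which is why the log reports "Unexpected o while processing Job-Specific Options" and then prints the usage text. Re-enter the command with -o, -dm, and -cd written as they are in the command at the top of this article, and the arguments parse correctly.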
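To catch this kind of typo before Mahout runs, a small pre-flight check can help. The following is a minimal sketch in POSIX shell and is not part of Mahout: the function name check_kmeans_args is hypothetical, and the list of short option names is copied from the Job-Specific Options in the usage output above. It is only a heuristic, since a path literally named o or k would also be flagged.

```shell
# Hypothetical helper (not part of Mahout): flag arguments that look like
# kmeans short options typed without their leading hyphen.
# Short option names taken from the usage text: i o dm c k cd x ow cl xm h
check_kmeans_args() {
    rc=0
    for arg in "$@"; do
        case "$arg" in
            i|o|dm|c|k|cd|x|ow|cl|xm|h)
                echo "suspicious argument '$arg': did you mean '-$arg'?"
                rc=1
                ;;
        esac
    done
    # Returns 0 when nothing suspicious was found, 1 otherwise.
    return "$rc"
}

# Example, using the arguments from the failing command:
#   check_kmeans_args -i tfidf-vectors/ -c reuters-initial-clusters \
#       o reuters-kmeans-clusters \
#       dm org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure \
#       cd 1.0 -k 20 -x 20 -cl
# prints:
#   suspicious argument 'o': did you mean '-o'?
#   suspicious argument 'dm': did you mean '-dm'?
#   suspicious argument 'cd': did you mean '-cd'?
```

Because the function returns a non-zero status when it finds a suspicious token, it can be chained in front of the real call, for example `check_kmeans_args "$@" && bin/mahout kmeans "$@"` inside a wrapper script.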
Original link: https://my.oschina.net/winHerson/blog/195258