When writing Spark code you will often find that an RDD apparently has no reduceByKey method. This happens on Spark 1.2 and earlier, because RDD itself does not define reduceByKey: the RDD must first be implicitly converted to PairRDDFunctions before the method can be called, which requires adding import org.apache.spark.SparkContext._.
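A minimal sketch of what this looks like on Spark 1.2 and earlier (the object name and sample data are made up for illustration):

// Spark <= 1.2: without the SparkContext._ import, rdd.reduceByKey(...) does not compile,
// because the implicit conversion RDD[(K, V)] => PairRDDFunctions[K, V] lives in SparkContext.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._

object WordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("WordCount").setMaster("local[*]"))
    val counts = sc.parallelize(Seq("a", "b", "a"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)   // resolved via the implicit conversion to PairRDDFunctions
    counts.collect().foreach(println)
    sc.stop()
  }
}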
From Spark 1.3 onward, however, these implicit conversions were moved into the RDD companion object, where the compiler picks them up automatically, so no explicit import is needed. The relevant source in org.apache.spark.rdd.RDD looks like this:
/**
 * Defines implicit functions that provide extra functionalities on RDDs of specific types.
 * For example, `RDD.rddToPairRDDFunctions` converts an RDD into a `PairRDDFunctions` for
 * key-value-pair RDDs, and enabling extra functionalities such as `PairRDDFunctions.reduceByKey`.
 */
object RDD {

  // The following implicit functions were in SparkContext before 1.3 and users had to
  // `import SparkContext._` to enable them. Now we move them here to make the compiler find
  // them automatically. However, we still keep the old functions in SparkContext for backward
  // compatibility and forward to the following functions directly.

  implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
    (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null): PairRDDFunctions[K, V] = {
    new PairRDDFunctions(rdd)
  }

  // ... other implicit conversions elided ...
}
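Thanks to this, the word count above compiles on Spark 1.3+ with the import line simply deleted; nothing else changes (a sketch suitable for pasting into spark-shell on a fresh session, with made-up sample data):

// Spark 1.3+: the only change from the 1.2 version above is that this line is gone:
// import org.apache.spark.SparkContext._
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("WordCount13").setMaster("local[*]"))
sc.parallelize(Seq("a", "b", "a"))
  .map(word => (word, 1))
  .reduceByKey(_ + _)   // found via the implicit in object RDD, no import required
  .collect()
  .foreach(println)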
As for what an implicit conversion is: simply put, Scala quietly swaps one object for another behind your back. When the compiler sees a method call that the receiver's type does not support, it searches for an in-scope implicit conversion to a type that does, and rewrites the call so that the stand-in type does the work yours cannot.
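As a toy illustration of the same trick RDD uses (RichCount, doubled, and intToRichCount are names invented for this example, not part of any library):

import scala.language.implicitConversions

object ImplicitDemo {
  // A wrapper type that adds a method Int does not have.
  class RichCount(n: Int) {
    def doubled: Int = n * 2
  }

  // In-scope implicit conversion: since Int has no `doubled` method,
  // the compiler rewrites `3.doubled` into `intToRichCount(3).doubled`.
  implicit def intToRichCount(n: Int): RichCount = new RichCount(n)

  def main(args: Array[String]): Unit = {
    println(3.doubled) // prints 6
  }
}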