This article explains how to use the scale() function in R, with worked examples on a numeric vector, on a numeric matrix, and on row-wise scaling with apply().
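R's scale() function standardizes (normalizes) data. With its default arguments, each column is centered to mean 0 and scaled to standard deviation 1. This does not confine the values to a fixed range, but it puts variables whose magnitudes differ widely onto a comparable scale, which makes trends in the data easier to observe.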
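The scale() function
scale(x, center = TRUE, scale = TRUE)
x is usually a numeric matrix; it can also be a numeric vector
center -- whether to center the data (subtract each column's mean)
scale -- whether to scale the data (divide each column by its standard deviation if it was centered, otherwise by its root mean square)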
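To make the two arguments concrete, here is a minimal sketch that reproduces the default behaviour by hand. It reuses the vector A from the numeric-vector example in this article; the names z_builtin, z_manual and centered_only are made up for illustration.

```r
# Vector A, copied from the numeric-vector example in this article
A <- c(3.74149, 7.36180, 5.81734, 5.71131, 7.97054, 10.37620,
       6.29949, 5.55062, 5.84779, 15.58810, 14.76360, 17.74670)

z_builtin <- scale(A)                # center = TRUE, scale = TRUE
z_manual  <- (A - mean(A)) / sd(A)   # subtract the mean, divide by the sample sd

# scale() returns a one-column matrix with attributes, so compare plain numbers
print(all.equal(as.numeric(z_builtin), z_manual))

# With scale = FALSE only the mean is subtracted
centered_only <- scale(A, scale = FALSE)
print(all.equal(as.numeric(centered_only), A - mean(A)))
```

Both comparisons should report TRUE, which suggests that the default call is an ordinary z-score transformation.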
1. A numeric vector as an example:
> A
 [1]  3.74149  7.36180  5.81734  5.71131  7.97054 10.37620  6.29949  5.55062  5.84779
[10] 15.58810 14.76360 17.74670
The vector has length 12. With the defaults scale = TRUE and center = TRUE:
> scaleA <- scale(A)
> scaleA
            [,1]
 [1,] -1.1123845
 [2,] -0.3313828
 [3,] -0.6645658
 [4,] -0.6874395
 [5,] -0.2000606
 [6,]  0.3189073
 [7,] -0.5605527
 [8,] -0.7221048
 [9,] -0.6579969
[10,]  1.4432593
[11,]  1.2653916
[12,]  1.9089294
attr(,"scaled:center")
[1] 8.897915
attr(,"scaled:scale")
[1] 4.63547
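The same vector with scale = TRUE and center = FALSE. Nothing is subtracted and the inputs are all positive, so the results are all positive: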
> scaleA <- scale(A, center = FALSE)
> scaleA
           [,1]
 [1,] 0.3602619
 [2,] 0.7088557
 [3,] 0.5601421
 [4,] 0.5499327
 [5,] 0.7674702
 [6,] 0.9991073
 [7,] 0.6065676
 [8,] 0.5344601
 [9,] 0.5630741
[10,] 1.5009526
[11,] 1.4215629
[12,] 1.7088007
attr(,"scaled:scale")
[1] 10.38547
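The divisor 10.38547 reported above is not the standard deviation (4.63547). When center = FALSE, scale() divides by the root mean square, sqrt(sum(x^2) / (n - 1)). The sketch below checks this on the same vector; the names rms, s and s_sd are made up for illustration.

```r
A <- c(3.74149, 7.36180, 5.81734, 5.71131, 7.97054, 10.37620,
       6.29949, 5.55062, 5.84779, 15.58810, 14.76360, 17.74670)

# Root mean square with an n - 1 denominator: the divisor used when center = FALSE
rms <- sqrt(sum(A^2) / (length(A) - 1))

s <- scale(A, center = FALSE)
print(all.equal(attr(s, "scaled:scale"), rms))
print(all.equal(as.numeric(s), A / rms))

# To divide by the standard deviation without centering, pass it explicitly
s_sd <- scale(A, center = FALSE, scale = sd(A))
print(attr(s_sd, "scaled:scale"))
```

Passing a numeric value as the scale argument, as in the last call, is one way to choose the divisor yourself rather than rely on the default.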
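Note: the values must not all be identical. If they are, the standard deviation is 0, every entry becomes 0/0, and scale() returns NaN: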
> scale(c(1,1,1,1,1,1))
     [,1]
[1,]  NaN
[2,]  NaN
[3,]  NaN
[4,]  NaN
[5,]  NaN
[6,]  NaN
attr(,"scaled:center")
[1] 1
attr(,"scaled:scale")
[1] 0
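One possible way to guard against this is to standardize only the columns that actually vary. The helper below, scale_nonconstant, is hypothetical (it is not part of base R), and mapping constant columns to 0 is an assumption that may not suit every analysis.

```r
# Hypothetical helper (not part of base R): scale only the columns that vary
scale_nonconstant <- function(m) {
  m <- as.matrix(m)
  sds <- apply(m, 2, sd)
  varying <- !is.na(sds) & sds > 0
  out <- m
  if (any(varying)) {
    out[, varying] <- scale(m[, varying, drop = FALSE])
  }
  # Assumption: a constant column carries no information, so set it to 0
  out[, !varying] <- 0
  out
}

# Toy data made up for illustration
toy <- cbind(constant = c(1, 1, 1, 1), varying = c(1, 2, 3, 4))
print(scale(toy))               # the "constant" column is NaN
print(scale_nonconstant(toy))   # the "constant" column is 0
```

Dropping the constant columns before scaling would be an equally reasonable choice; which one fits depends on what the downstream code expects.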
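2. A numeric matrix as an example: scale() is computed separately for each column (column A holds exactly the data used in the previous step, so the results can be compared)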
> dat1
                 A       B        C        D          E         F        G        H         I        J         K       L
CK-WT-1    3.74149 5.23528 2.821317 118.6600  1.8737693 1.7103460 30.26110  86.6405 1448.6278 173.9960  77.06166 3.19210
CK-WT-2    7.36180 2.77070 1.563395 140.1430 16.9090246 0.7802436 33.65711 116.4700 1634.0417  51.0019  98.30970 4.69276
CK-WT-3    5.81734 2.66859 1.931628 123.3830  0.9559375 2.7996091 31.46691 111.7380 1566.5626  52.3322 101.42702 3.58136
CK-tdr1-1  5.71131 3.22632 3.194809  97.2229  0.4774184 4.7297117 30.96890  82.8809  648.4734  66.9486  46.86340 3.03234
CK-tdr1-2  7.97054 1.32105 2.600854  95.2539  0.5273923 4.3637146 28.03340  85.7292  683.4113  41.1148  70.29293 2.11160
CK-tdr1-3 10.37620 1.96726 2.301278  91.8525  0.4333881 3.3732144 27.62150  79.6027  647.2750  49.7169  57.09809 3.53808
NaWT-1     6.29949 2.40259 2.044360 121.8080 39.1065780 2.2783575 35.59571 106.4650 1248.4062 192.7300 151.37454 4.79151
NaWT-2     5.55062 3.23077 2.104095 125.1350 36.5302500 2.8043996 32.99440 111.3370 1117.6042 183.2700 160.54078 4.16132
NaWT-3     5.84779 4.80378 2.630611 106.5070 19.4561309 2.9542534 32.77111  98.1677 1191.6926 111.2120 137.35694 3.40994
Natdr1-1  15.58810 2.04301 2.289544  81.6997 13.2227038 3.1700429 19.02370  69.4519  501.2779  78.8024 101.08433 6.01932
Natdr1-2  14.76360 2.29524 2.801336  84.8495 10.8897780 4.6643058 18.14860  69.7807  395.9033  96.2520  82.21420 5.59169
Natdr1-3  17.74670 1.95286 2.450605  80.3895 12.2580100 4.0243357 15.79980  68.8929  468.8953  66.7984 108.79391 8.12127
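Each column has length 12. With scale = TRUE and center = TRUE, the scaled matrix is returned, together with the center and scale used for each column: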
> scaleDat1
                   A           B          C           D           E           F             G          H          I           J           K
CK-WT-1   -1.1123845  2.06922600  0.9498394  0.65959663 -0.79734415 -1.19085395  0.3345230824 -0.2241247  1.0711933  1.37750741 -0.62155046
CK-WT-2   -0.3313828 -0.04789386 -1.8494507  1.74255232  0.30794653 -1.96684043  0.8433687097  1.4659006  1.4799090 -0.82335259 -0.02949209
CK-WT-3   -0.6645658 -0.13560824 -1.0300104  0.89768254 -0.86481696 -0.28207930  0.5151965742  1.1978036  1.3311618 -0.79954817  0.05736930
CK-tdr1-1 -0.6874395  0.34349216  1.7809813 -0.42104526 -0.89999446  1.32820957  0.4405772141 -0.4371293 -0.6926209 -0.53800188 -1.46299915
CK-tdr1-2 -0.2000606 -1.29317006  0.4592366 -0.52030233 -0.89632071  1.02285734  0.0007314704 -0.2757555 -0.6156058 -1.00027265 -0.81015531
CK-tdr1-3  0.3189073 -0.73806370 -0.2074192 -0.69176654 -0.90323127  0.19648085 -0.0609863755 -0.6228596 -0.6952627 -0.84634642 -1.17781819
NaWT-1    -0.5605527 -0.36410717 -0.7791451  0.81828696  1.93976112 -0.71696069  1.1338423951  0.8990556  0.6298363  1.71273416  1.44911408
NaWT-2    -0.7221048  0.34731479 -0.6462152  0.98600067  1.75036684 -0.27808262  0.7440715578  1.1750845  0.3415040  1.54345664  1.70452348
NaWT-3    -0.6579969  1.69855951  0.5254554  0.04696522  0.49519263 -0.15305922  0.7106135334  0.4289623  0.5048200  0.25404869  1.05852570
Natdr1-1   1.4432593 -0.67299305 -0.2335303 -1.20356808  0.03695307  0.02697441 -1.3492524881 -1.1979651 -1.0170902 -0.32588964  0.04782059
Natdr1-2   1.2653916 -0.45632281  0.9053748 -1.04478701 -0.13454792  1.27364125 -1.4803744502 -1.1793365 -1.2493721 -0.01364599 -0.47797937
Natdr1-3   1.9089294 -0.75043357  0.1248833 -1.26961512 -0.03396473  0.73971279 -1.8323112232 -1.2296359 -1.0884727 -0.54068956  0.26264141
                   L
CK-WT-1   -0.7138772
CK-WT-2    0.2084474
CK-WT-3   -0.4746331
CK-tdr1-1 -0.8120677
CK-tdr1-2 -1.3779661
CK-tdr1-3 -0.5012335
NaWT-1     0.2691404
NaWT-2    -0.1181823
NaWT-3    -0.5799900
Natdr1-1   1.0237679
Natdr1-2   0.7609411
Natdr1-3   2.3156530
attr(,"scaled:center")
         A          B          C          D          E          F          G          H          I          J          K          L
  8.897915   2.826454   2.394486 105.575333  12.720032   3.137711  28.028521  90.596375 962.680951  97.014600  99.368125   4.353607
attr(,"scaled:scale")
          A           B           C           D           E           F           G           H           I           J           K           L
  4.6354700   1.1641193   0.4493719  19.8373766  13.6029875   1.1986064   6.6739314  17.6503265 453.6500351  55.8845631  35.8884205   1.6270411
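The "scaled:center" and "scaled:scale" attributes printed at the end of the output record the mean and standard deviation that were used for each column. The sketch below, on a small toy matrix made up for illustration, checks that every scaled column has mean 0 and standard deviation 1, and shows one way to use the attributes to recover the original values.

```r
# Toy data made up for illustration: 4 rows x 2 columns
m <- cbind(a = c(1, 2, 3, 6), b = c(10, 20, 40, 50))
z <- scale(m)

print(colMeans(z))       # approximately 0 for every column
print(apply(z, 2, sd))   # 1 for every column

ctr <- attr(z, "scaled:center")   # column means used for centering
scl <- attr(z, "scaled:scale")    # column standard deviations used for scaling

# Undo the transformation: multiply by the scale, then add the center back
restored <- sweep(sweep(z, 2, scl, "*"), 2, ctr, "+")
print(all.equal(restored, m, check.attributes = FALSE))
```

Keeping these attributes is useful when new data must later be transformed with the same center and scale as the original data.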
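3. When the matrix is very large, or when specific rows or columns need to be standardized, apply() can do the work in bulk. For example, take a 12 x 2000 matrix (with a structure similar to the previous step):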
> dim(dat2)
[1]   12 2000
By default, calling scale(dat2) directly gives the column-wise result, just as in step 2. But what if the result is wanted by row?
             CK-WT-1    CK-WT-2    CK-WT-3  CK-tdr1-1  CK-tdr1-2  CK-tdr1-3     NaWT-1     NaWT-2     NaWT-3   Natdr1-1   Natdr1-2   Natdr1-3
AT1G01010 -0.2386968 -0.2245197 -0.2270909 -0.2677180 -0.2599348 -0.2392684 -0.2021240 -0.2093897 -0.2163308 -0.1800695 -0.1880828 -0.1720898
AT1G01030 -0.2322447 -0.2436961 -0.2411622 -0.2852821 -0.3091559 -0.2995012 -0.2159012 -0.2178163 -0.2205802 -0.2509797 -0.2544241 -0.2586433

 AT1G01010  AT1G01030  AT1G01040  AT1G01050  AT1G01060  AT1G01070  AT1G01080  AT1G01090  AT1G01100  AT1G01120
-0.2386968 -0.2322447 -0.2426714  0.2576744 -0.2467641 -0.2474700 -0.1241498  0.1193716  6.0022478  0.4966890
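Note that rows are selected by the 1 in apply(dat2, 1, scale). The return value is a large matrix: scaling by row means standardizing the 2000 values of each row, once per row. The returned matrix is transposed relative to the original, that is, the result for each row appears in a column of the return value.
For example, after the processing above, the result for the first row is the first column of ScaleDat2_row, and the output above shows its first ten values. The first 10 values obtained by processing dat2[1,] on its own are as follows (the two agree):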
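The transposition is easy to see on a small case. The sketch below uses a toy 3 x 5 matrix made up for illustration and shows two ways to get the row-scaled result back in the original orientation; the names by_row, row_scaled_a and row_scaled_b are made up as well.

```r
# Toy matrix made up for illustration: 3 rows x 5 columns
m <- matrix(c( 1,  2,  3,  4,  5,
              10, 30, 20, 50, 40,
               7,  7,  8,  9,  9),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2", "r3"), paste0("c", 1:5)))

by_row <- apply(m, 1, scale)
print(dim(by_row))   # 5 3: the rows of m have become columns

# Two ways to return to the original orientation
row_scaled_a <- t(by_row)
row_scaled_b <- t(scale(t(m)))

print(all.equal(row_scaled_a, row_scaled_b, check.attributes = FALSE))
```

The second form, t(scale(t(m))), avoids apply() altogether and keeps both row and column names, which may be convenient for matrices with many rows.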
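Note: taking the first row and calling scale(dat2[1,]) directly returns NaN; the row must first be converted to a plain numeric vector with as.numeric(). This behaviour is consistent with dat2 being a data frame: dat2[1,] is then a one-row data frame, and scale() treats each of its 2000 columns as a column holding a single value.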
> Row1 <- dat2[1, ]
> scale(as.numeric(Row1))[1:10]
 [1] -0.2386968 -0.2322447 -0.2426714  0.2576744 -0.2467641 -0.2474700 -0.1241498  0.1193716  6.0022478  0.4966890
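The NaN result can be reproduced on a small data frame. The sketch below assumes the data are stored in a data frame (the article does not state the class of dat2); the toy values and the names df and row1 are made up for illustration.

```r
# Toy data frame made up for illustration: 2 rows x 4 numeric columns
df <- data.frame(g1 = c(1, 5), g2 = c(2, 6), g3 = c(4, 7), g4 = c(9, 8),
                 row.names = c("s1", "s2"))

row1 <- df[1, ]            # still a data frame: 1 row x 4 columns
print(scale(row1))         # each column has one value, so every entry is NaN

# Convert the row to a plain numeric vector before scaling
print(scale(as.numeric(row1)))   # one column of 4 standardized values
print(scale(unlist(row1)))       # same values, column names kept as row names
```

If the data are held in a matrix instead, indexing a single row already yields a plain vector, so the conversion is harmless but not needed.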