温馨提示×

温馨提示×

您好,登录后才能下订单哦!

密码登录×
登录注册×
其他方式登录
点击 登录注册 即表示同意《亿速云用户服务条款》

python怎么实现随机森林

发布时间:2022-01-12 17:30:14 阅读:160 作者:iii 栏目:大数据
Python开发者专用服务器限时活动,0元免费领,库存有限,领完即止! 点击查看>>

这篇文章主要介绍了python怎么实现随机森林的相关知识,内容详细易懂,操作简单快捷,具有一定借鉴价值,相信大家阅读完这篇python怎么实现随机森林文章都会有所收获,下面我们一起来看看吧。

背景介绍

随机森林是一组决策树的商标术语。在随机森林中,我们收集了决策树(也称为“森林”)。为了基于属性对新对象进行分类,每棵树都有一个分类,我们称该树对该类“投票”。森林选择投票最多的类别(在森林中的所有树木上)。

每棵树的种植和生长如下:

  1. 如果训练集中的案例数为N,则随机抽取N个案例样本,但要进行替换。 该样本将成为树木生长的训练集。

  2. 如果有M个输入变量,则指定数字m << M,以便在每个节点上从M个中随机选择m个变量,并使用对这m个变量的最佳分割来分割节点。在森林生长期间,m的值保持恒定。

  3. 每棵树都尽可能地生长。没有修剪。

python怎么实现随机森林

 
入门示例

python代码实现:

'''The following code is for the Random ForestCreated by - ANALYTICS VIDHYA'''# importing required librariesimport pandas as pdfrom sklearn.ensemble import RandomForestClassifierfrom sklearn.metrics import accuracy_score# read the train and test datasettrain_data = pd.read_csv('train-data.csv')test_data = pd.read_csv('test-data.csv')# view the top 3 rows of the datasetprint(train_data.head(3))# shape of the datasetprint('\nShape of training data :',train_data.shape)print('\nShape of testing data :',test_data.shape)# Now, we need to predict the missing#  target variable in the test data# target variable - Survived# seperate the independent and target variable on training datatrain_x = train_data.drop(columns=['Survived'],axis=1)train_y = train_data['Survived']# seperate the independent and target variable on testing datatest_x = test_data.drop(columns=['Survived'],axis=1)test_y = test_data['Survived']'''Create the object of the Random Forest modelYou can also add other parameters and test your code hereSome parameters are : n_estimators and max_depthDocumentation of sklearn RandomForestClassifier: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html'''model = RandomForestClassifier()# fit the model with the training datamodel.fit(train_x,train_y)# number of trees usedprint('Number of Trees used : ', model.n_estimators)# predict the target on the train datasetpredict_train = model.predict(train_x)print('\nTarget on train data',predict_train) # Accuray Score on train datasetaccuracy_train = accuracy_score(train_y,predict_train)print('\naccuracy_score on train dataset : ', accuracy_train)# predict the target on the test datasetpredict_test = model.predict(test_x)print('\nTarget on test data',predict_test) # Accuracy Score on test datasetaccuracy_test = accuracy_score(test_y,predict_test)print('\naccuracy_score on test dataset : ', accuracy_test)
 

运行结果:

Shape of training data : (712, 25)Shape of testing data : (179, 25)Number of Trees used :  10Target on train data [0 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 1 0 0 0 1 0 1 0 1 1 0 1 0 1 0 0 0 0 0 0 0 1 0 1 1 1 0 0 1 0 01 0 0 0 0 0 0 1 1 0 0 1 0 0 1 1 1 0 0 0 1 0 1 0 0 1 0 0 0 1 1 0 0 1 0 1 11 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 0 0 1 0 1 0 1 0 0 00 1 0 1 1 0 0 0 0 1 1 0 0 1 0 0 0 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0 0 1 0 0 00 0 1 0 0 1 0 1 1 1 1 0 0 1 0 1 0 0 1 1 1 1 0 0 0 1 1 1 1 0 0 0 0 1 1 0 00 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0 1 1 1 01 0 0 0 1 0 0 1 1 0 1 1 1 0 1 1 0 0 1 0 0 1 1 1 1 0 0 1 0 0 0 1 1 0 0 1 10 0 0 0 0 0 0 0 1 1 0 1 1 0 1 0 1 1 1 0 0 0 1 0 1 1 0 0 0 0 1 0 1 0 0 0 00 0 0 0 0 1 1 0 0 1 1 0 1 0 0 0 1 0 1 0 0 0 1 0 1 0 0 1 0 1 0 0 0 1 0 1 00 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 1 0 0 0 0 1 0 0 1 1 1 1 01 1 0 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 0 0 1 0 1 1 0 0 1 0 1 0 1 1 11 0 0 1 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 1 1 10 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 0 1 0 0 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 00 0 0 1 0 1 1 0 0 0 0 1 0 0 0 1 0 1 0 1 1 1 0 0 0 0 0 0 1 1 1 0 0 1 1 1 01 0 1 0 0 1 0 0 0 1 1 0 0 1 0 0 1 0 1 0 0 1 0 0 1 1 0 0 1 1 0 1 0 0 0 0 11 0 1 1 1 0 1 0 1 0 1 1 0 1 0 1 0 0 1 0 0 1 0 1 1 0 1 0 0 0 1 0 1 0 0 0 00 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 1 0 0 1 0 1 0 0 1 0 0 1 1 1 1 0 1 0 0 01 0 1 1 1 0 1 0 0 0 1 0 0 1 0 0 1 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 00 0 1 0 1 0 1 0 1 1 1 0 0 1 0]accuracy_score on train dataset :  0.973314606741573Target on test data [0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 01 1 1 1 0 0 1 0 1 1 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 1 0 1 1 0 0 1 1 1 0 0 0 00 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 1 0 1 0 0 1 0 1 0 0 0 1 0 0 0 0 0 11 0 1 1 0 1 0 1 0 0 0 1 1 1 1 1 0 1 1 0 1 1 0 0 1 1 0 0 1 1 0 0 0 1 0 1 00 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 0 1 1 0 1 0 0 0 0 0]accuracy_score on test dataset :  0.8156424581005587

关于“python怎么实现随机森林”这篇文章的内容就介绍到这里,感谢各位的阅读!相信大家对“python怎么实现随机森林”知识都有一定的了解,大家如果还想学习更多知识,欢迎关注亿速云行业资讯频道。

亿速云「云服务器」,即开即用、新一代英特尔至强铂金CPU、三副本存储NVMe SSD云盘,价格低至29元/月。点击查看>>

向AI问一下细节

免责声明:本站发布的内容(图片、视频和文字)以原创、转载和分享为主,文章观点不代表本网站立场,如果涉及侵权请联系站长邮箱:is@yisu.com进行举报,并提供相关证据,一经查实,将立刻删除涉嫌侵权内容。

原文链接:https://my.oschina.net/u/4592685/blog/4416257

AI

开发者交流群×