This article explains how to implement a decision tree in Python: first a short background on how the algorithm works, then a worked example with scikit-learn.
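Background
This is one of my favorite algorithms, and I use it often. It is a supervised learning algorithm used mostly for classification problems. Surprisingly, it works for both categorical and continuous dependent variables. The algorithm splits the population into two or more homogeneous sets. The split is made on the most significant attributes (independent variables), so that the resulting groups are as distinct from one another as possible.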
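To illustrate the point that a tree can handle both categorical and continuous dependent variables, here is a minimal sketch using scikit-learn's DecisionTreeClassifier and DecisionTreeRegressor. The toy data below is made up for illustration and is not from the article's dataset.

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Hypothetical toy data: one feature, two clearly separated clusters.
X = [[1], [2], [3], [10], [11], [12]]
y_class = [0, 0, 0, 1, 1, 1]              # categorical target
y_value = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]  # continuous target

# Classification tree: each leaf predicts a class label.
clf = DecisionTreeClassifier(random_state=0).fit(X, y_class)

# Regression tree: each leaf predicts the mean target of its training rows.
# max_depth=1 keeps it to a single split for readability.
reg = DecisionTreeRegressor(max_depth=1, random_state=0).fit(X, y_value)

print('Predicted classes :', clf.predict([[2.5], [10.5]]))
print('Predicted values  :', reg.predict([[2.5], [10.5]]))
```

With a single split, the regressor simply returns the average target value on each side of the split, which is the continuous counterpart of predicting the majority class.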
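In the figure above, you can see the population divided into four different groups based on multiple attributes, in order to identify "whether they will play". To split the population into distinct groups, the algorithm uses techniques such as Gini, information gain, chi-square, and entropy.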
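Two of the criteria named above, Gini and entropy (with the information gain derived from it), can be sketched in a few lines of plain Python. The function names and the toy "will play" labels are assumptions made for this illustration, not part of the original article.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 - sum(p_k^2). A value of 0 means the group is pure."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy labels: 1 = "will play", 0 = "will not play".
parent = [1, 1, 1, 1, 0, 0, 0, 0]
pure_split = [[1, 1, 1, 1], [0, 0, 0, 0]]   # each group is homogeneous
mixed_split = [[1, 1, 0, 0], [1, 1, 0, 0]]  # each group looks like the parent

print('Gini of parent      :', gini(parent))
print('Entropy of parent   :', entropy(parent))
print('Gain of pure split  :', information_gain(parent, pure_split))
print('Gain of mixed split :', information_gain(parent, mixed_split))
```

A split that produces pure groups removes all of the parent's entropy, while a split whose groups mirror the parent gains nothing. A tree prefers the former.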
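The best way to understand how a decision tree works is to play Jezzball, a classic game from Microsoft (shown in the figure below). Essentially, you have a room with moving walls, and you need to create walls so that the maximum area is cleared of balls.
So every time you split the room with a wall, you are trying to create 2 different populations within the same room. A decision tree works in a very similar way, dividing the population into groups that are as different from one another as possible.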
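The "where to put the wall" idea can be sketched as a search for the best threshold on a single numeric feature. This is a simplified illustration under stated assumptions (binary 0/1 labels, Gini impurity as the score, made-up ages and labels), not the exact procedure scikit-learn uses internally.

```python
def gini(labels):
    """Gini impurity for binary 0/1 labels: 2 * p * (1 - p)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # share of class 1
    return 2 * p * (1 - p)


def best_threshold(values, labels):
    """Try the midpoint between each pair of adjacent distinct values and
    return (threshold, weighted_gini) for the split with the lowest impurity."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float('inf'))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no wall can be placed between equal values
        threshold = (lo + hi) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: age of each person and whether they play (1) or not (0).
ages = [22, 25, 30, 35, 40, 52, 60, 65]
plays = [1, 1, 1, 1, 0, 0, 0, 0]

threshold, impurity = best_threshold(ages, plays)
print('Best threshold :', threshold)
print('Weighted Gini  :', impurity)
```

A full tree repeats this search over every feature, keeps the best split, and then recurses into each of the two resulting groups.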
Next, a decision tree example using Python and scikit-learn:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# read the train and test datasets
train_data = pd.read_csv('train-data.csv')
test_data = pd.read_csv('test-data.csv')

# shape of the datasets
print('Shape of training data :', train_data.shape)
print('Shape of testing data :', test_data.shape)

# separate the features from the target column 'Survived'
train_x = train_data.drop(columns=['Survived'])
train_y = train_data['Survived']
test_x = test_data.drop(columns=['Survived'])
test_y = test_data['Survived']

# fit a decision tree with default settings (no depth limit)
model = DecisionTreeClassifier()
model.fit(train_x, train_y)

# depth of the decision tree
print('Depth of the Decision Tree :', model.get_depth())

# predict the target on the train dataset
predict_train = model.predict(train_x)
print('Target on train data', predict_train)

# accuracy score on the train dataset
accuracy_train = accuracy_score(train_y, predict_train)
print('accuracy_score on train dataset : ', accuracy_train)

# predict the target on the test dataset
predict_test = model.predict(test_x)
print('Target on test data', predict_test)

# accuracy score on the test dataset
accuracy_test = accuracy_score(test_y, predict_test)
print('accuracy_score on test dataset : ', accuracy_test)
Output of the code above:
Shape of training data : (712, 25)
Shape of testing data : (179, 25)
Depth of the Decision Tree : 19
Target on train data [0 1 1 0 0 0 0 0 0 0 ... 0 1 0 1 1 1 0 0 1 0] (712 predicted labels, abbreviated here)
accuracy_score on train dataset : 0.9859550561797753
Target on test data [0 0 0 1 1 0 0 0 0 0 ... 0 1 1 0 1 0 0 0 0 0] (179 predicted labels, abbreviated here)
accuracy_score on test dataset : 0.770949720670391
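The output above reports a training accuracy of about 0.986 against a test accuracy of about 0.771, with a tree of depth 19. A gap like this is commonly read as a sign of overfitting, and one usual response is to limit the depth of the tree. The sketch below compares an unrestricted tree with a depth-limited one. It uses synthetic data from make_classification as a stand-in, because the article's CSV files are not available here, and max_depth=3 is an arbitrary choice, so the numbers will not match those above.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption for this sketch): 891 rows,
# 24 features, with 20% of the labels flipped at random to add noise.
X, y = make_classification(n_samples=891, n_features=24, n_informative=5,
                           flip_y=0.2, random_state=0)
train_x, test_x, train_y, test_y = train_test_split(
    X, y, test_size=0.2, random_state=0)

results = {}
for max_depth in (None, 3):  # None = grow until every leaf is pure
    model = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    model.fit(train_x, train_y)
    results[max_depth] = (
        model.get_depth(),
        accuracy_score(train_y, model.predict(train_x)),
        accuracy_score(test_y, model.predict(test_x)),
    )
    print('max_depth =', max_depth,
          '| depth :', results[max_depth][0],
          '| train accuracy :', results[max_depth][1],
          '| test accuracy :', results[max_depth][2])
```

The unrestricted tree memorizes the noisy training labels, while the shallow tree cannot. Whether the shallow tree actually scores better on the test set depends on the data, so a value such as max_depth is best chosen by validation rather than fixed in advance.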
Original link: https://my.oschina.net/u/4592685/blog/4439970