I have come around a strange thing when plotting a decision tree in sklearn.
I just wanted to compare a Random Forest model consisting of one estimator using bootstrapping and one without bootstrapping.
I am using a toy dataset with sample size n = 5.
My code looks like the following:

import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import plot_tree

tree_model = RandomForestRegressor(
    n_estimators=1,
    max_features=2,
    max_depth=None,
    min_samples_split=2,
    min_samples_leaf=1,
    random_state=42,
    bootstrap=True,  # (or False)
)
tree_model.fit(X_train, y_train)
single_tree = tree_model.estimators_[0]

fig = plt.figure(figsize=(14, 14), facecolor="white", dpi=300)
ax = fig.add_subplot(111)
plot_tree(single_tree, feature_names=["X1", "X2"], filled=True, impurity=False, rounded=False, ax=ax)
Now, when plotting the tree without bootstrapping, everything looks as expected: the root node starts with 5 samples. But when bootstrapping is used, the root node starts with only 4 samples. How is that even possible? Regardless of whether bootstrapping is used, the training set consists of 5 samples, so the root node should start with 5 samples.
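For completeness, here is a self-contained sketch of the comparison I am running. The X_train/y_train values below are placeholders I made up (my actual toy data isn't important for the effect); the root-node count that plot_tree displays corresponds to tree_.n_node_samples[0], and tree_.weighted_n_node_samples[0] gives the sum of the sample weights at the root.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Placeholder 5-point toy dataset (made up; any 5 rows show the effect).
X_train = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]], dtype=float)
y_train = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

for bootstrap in (False, True):
    model = RandomForestRegressor(
        n_estimators=1,
        max_features=2,
        random_state=42,
        bootstrap=bootstrap,
    )
    model.fit(X_train, y_train)
    tree = model.estimators_[0].tree_
    # n_node_samples[0] is the count shown as "samples" in the root node of
    # plot_tree; weighted_n_node_samples[0] is the total weight at the root.
    print(bootstrap, tree.n_node_samples[0], tree.weighted_n_node_samples[0])
```

With bootstrap=False the root reports all 5 training samples, while with bootstrap=True the reported count can be smaller even though the total weight at the root still sums to 5.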
Title: scikit learn - Plotting one Decision Tree of a Random Forest in sklearn - Stack Overflow