Plotting with Scikit-plot
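Plotting is an important step in a machine learning workflow. In a classification task, for example, you typically need a precision-recall curve, a ROC curve (with its AUC), and a confusion matrix. Drawing these with matplotlib or Seaborn takes several extra lines of code for settings such as title, xlim, ylim, and legend. A library that wraps these operations saves time and improves development efficiency, so you can focus on improving the algorithm or the business logic.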
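To make the boilerplate concrete, here is a minimal sketch of a binary ROC curve drawn with plain matplotlib and scikit-learn. The helper name `plot_roc_by_hand`, the choice of dataset, and the `Agg` backend line are assumptions made for illustration; they are not part of scikit-plot.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split


def plot_roc_by_hand(y_true, scores):
    """Draw a binary ROC curve with plain matplotlib and return the Axes."""
    fpr, tpr, _ = roc_curve(y_true, scores)
    fig, ax = plt.subplots()
    ax.plot(fpr, tpr, label=f"ROC (AUC = {auc(fpr, tpr):.2f})")
    ax.plot([0, 1], [0, 1], linestyle="--", label="chance")
    # The settings below are the kind of boilerplate a wrapper library hides.
    ax.set_title("ROC Curve")
    ax.set_xlim(0.0, 1.0)
    ax.set_ylim(0.0, 1.05)
    ax.set_xlabel("False Positive Rate")
    ax.set_ylabel("True Positive Rate")
    ax.legend(loc="lower right")
    return ax


X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)
clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)
ax = plot_roc_by_hand(y_test, clf.predict_proba(X_test)[:, 1])
```

Roughly half of the function body is axis and legend setup, which is the part a one-line plotting call saves.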
Installation
conda install -c conda-forge scikit-plot
# or
pip install scikit-plot
Imports
from sklearn.datasets import load_digits, load_breast_cancer, load_iris, make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
import scikitplot as skplt
import warnings
warnings.filterwarnings('ignore')
Confusion Matrix
X, y = load_digits(return_X_y=True)
rf = RandomForestClassifier()
rf.fit(X, y)
preds = rf.predict(X)
skplt.metrics.plot_confusion_matrix(y_true=y, y_pred=preds)
plt.show()
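The listing above predicts on the same data the model was trained on, so the matrix will look close to perfect. As a hedged sketch, the underlying numbers can be computed on a held-out split with scikit-learn's `confusion_matrix`; the split ratio and `random_state` values here are arbitrary choices for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0, stratify=y
)
rf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, rf.predict(X_test))

# Dividing each row by its total gives per-class recall on the diagonal.
row_normalized = cm / cm.sum(axis=1, keepdims=True)
```

Passing `y_test` and the held-out predictions to `skplt.metrics.plot_confusion_matrix` instead of the training data should give a more honest picture of the classifier.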
ROC Curve
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
nb = GaussianNB()
nb.fit(X_train, y_train)
predicted_probas = nb.predict_proba(X_test)
skplt.metrics.plot_roc(y_test, predicted_probas)
plt.show()
KS Statistic
X, y = load_breast_cancer(return_X_y=True)
lr = LogisticRegression()
lr.fit(X, y)
probas = lr.predict_proba(X)
skplt.metrics.plot_ks_statistic(y_true=y, y_probas=probas)
plt.show()
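The KS (Kolmogorov-Smirnov) statistic is the largest gap between the cumulative score distributions of the two classes. The following is a minimal NumPy sketch of that definition, assuming binary labels encoded as 0 and 1; the function name `ks_statistic` is hypothetical and is not scikit-plot's own implementation.

```python
import numpy as np


def ks_statistic(y_true, scores):
    """Largest gap between the empirical score CDFs of class 1 and class 0."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    thresholds = np.unique(scores)
    pos = np.sort(scores[y_true == 1])
    neg = np.sort(scores[y_true == 0])
    # Fraction of each class with a score <= each threshold.
    cdf_pos = np.searchsorted(pos, thresholds, side="right") / pos.size
    cdf_neg = np.searchsorted(neg, thresholds, side="right") / neg.size
    return float(np.max(np.abs(cdf_pos - cdf_neg)))


# Perfectly separated scores give the maximum value of 1.0.
separated = ks_statistic([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
# Identical score distributions give 0.0.
overlapping = ks_statistic([0, 1, 0, 1], [0.3, 0.3, 0.7, 0.7])
```

A value near 1 means the model's scores separate the classes well; a value near 0 means the two score distributions are almost indistinguishable.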
Precision-Recall
X, y = load_digits(return_X_y=True)
nb = GaussianNB()
nb.fit(X, y)
probas = nb.predict_proba(X)
skplt.metrics.plot_precision_recall(y_true=y, y_probas=probas)
plt.show()
Clustering (Silhouette Plot)
X, y = load_iris(return_X_y=True)
kmeans = KMeans(n_clusters=4, random_state=1)
cluster_labels = kmeans.fit_predict(X)
skplt.metrics.plot_silhouette(X, cluster_labels)
plt.show()
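A silhouette plot shows one cluster count at a time. To compare several candidates numerically, the mean silhouette coefficient can be computed with scikit-learn's `silhouette_score`. This is a sketch; the range of `k` values and `n_init=10` are assumptions chosen for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score

X, _ = load_iris(return_X_y=True)

# Mean silhouette coefficient for each candidate number of clusters.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=1).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The k with the highest mean silhouette is one reasonable candidate.
best_k = max(scores, key=scores.get)
```

The silhouette coefficient lies between -1 and 1; higher values indicate tighter, better separated clusters.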
Calibration Curve
X, y = make_classification(n_samples=100000, n_features=20, n_informative=2, n_redundant=2, random_state=20)
X_train, y_train, X_test, y_test = X[:1000], y[:1000], X[1000:], y[1000:]
rf_probas = RandomForestClassifier().fit(X_train, y_train).predict_proba(X_test)
lr_probas = LogisticRegression().fit(X_train, y_train).predict_proba(X_test)
nb_probas = GaussianNB().fit(X_train, y_train).predict_proba(X_test)
sv_scores = LinearSVC().fit(X_train, y_train).decision_function(X_test)
probas_list = [rf_probas, lr_probas, nb_probas, sv_scores]
clf_names = ['Random Forest', 'Logistic Regression', 'Gaussian Naive Bayes', 'Support Vector Machine']
skplt.metrics.plot_calibration_curve(y_test, probas_list=probas_list, clf_names=clf_names, n_bins=10)
plt.show()
Cumulative Gain
Generates a cumulative gain chart from labels and scores/probabilities.
X, y = load_breast_cancer(return_X_y=True)
lr = LogisticRegression()
lr.fit(X, y)
probas = lr.predict_proba(X)
skplt.metrics.plot_cumulative_gain(y_true=y, y_probas=probas)
plt.show()
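A cumulative gain curve answers the question: if the samples are ranked by score, what share of all positives is captured within the top x% of samples? The following NumPy sketch computes the curve under that definition, assuming the positive class is labeled 1. The function name `cumulative_gain` is hypothetical, and tied scores are ordered arbitrarily here.

```python
import numpy as np


def cumulative_gain(y_true, scores):
    """Return (fraction of samples seen, fraction of positives captured)."""
    y_true = np.asarray(y_true)
    order = np.argsort(scores)[::-1]  # highest score first
    hits = np.cumsum(y_true[order] == 1)
    pct_samples = np.arange(1, y_true.size + 1) / y_true.size
    gain = hits / hits[-1]
    return pct_samples, gain


# Two positives ranked first: the top half of the samples captures all of them.
pct, gain = cumulative_gain([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2])
```

For a model that ranks at random, the curve stays close to the diagonal, which is why charts of this kind usually draw a diagonal baseline.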
Lift Curve
Generates a lift curve from labels and scores/probabilities.
X, y = load_breast_cancer(return_X_y=True)
lr = LogisticRegression()
lr.fit(X, y)
probas = lr.predict_proba(X)
skplt.metrics.plot_lift_curve(y_true=y, y_probas=probas)
plt.show()
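Lift is the ratio between the share of positives captured in the top-ranked samples and the share of samples examined, so a lift of 2 means the model finds positives twice as fast as random selection. This is a minimal NumPy sketch of that ratio, assuming the positive class is labeled 1; the function name `lift_curve` is hypothetical.

```python
import numpy as np


def lift_curve(y_true, scores):
    """Return (fraction of samples seen, lift over random selection)."""
    y_true = np.asarray(y_true)
    order = np.argsort(scores)[::-1]  # highest score first
    hits = np.cumsum(y_true[order] == 1)
    pct_samples = np.arange(1, y_true.size + 1) / y_true.size
    gain = hits / hits[-1]  # share of all positives captured so far
    return pct_samples, gain / pct_samples


pct, lift = lift_curve([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2])
```

The lift always ends at 1 once every sample has been examined, because at that point the model and random selection have captured the same positives.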
Learning Curve
X, y = load_breast_cancer(return_X_y=True)
rf = RandomForestClassifier()
skplt.estimators.plot_learning_curve(rf, X, y)
plt.show()
Feature Importances
X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier()
rf.fit(X, y)
# load_iris orders its columns as sepal length, sepal width, petal length, petal width
skplt.estimators.plot_feature_importances(rf, feature_names=['sepal length', 'sepal width', 'petal length', 'petal width'])
plt.show()
Elbow Curve
Plots the elbow curve for KMeans clustering over different values of K.
X, y = load_iris(return_X_y=True)
kmeans = KMeans(random_state=1)
skplt.cluster.plot_elbow_curve(kmeans, X, cluster_ranges=range(1, 11))
plt.show()
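The quantity on the vertical axis of an elbow curve is typically the within-cluster sum of squared distances, which scikit-learn exposes as `inertia_`. As a sketch, the values can be collected directly; `n_init=10` is an assumption chosen for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)

# Within-cluster sum of squared distances for K = 1..10.
inertias = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=1).fit(X)
    inertias.append(km.inertia_)

# The drop from one K to the next; the "elbow" is where these drops flatten.
drops = [a - b for a, b in zip(inertias, inertias[1:])]
```

Inertia keeps shrinking as K grows, so the value of K is read off where the improvement levels out rather than where inertia is smallest.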
PCA Component Variance
Plots the explained variance ratio of the PCA components.
X, y = load_digits(return_X_y=True)
pca = PCA(random_state=1)
pca.fit(X)
skplt.decomposition.plot_pca_component_variance(pca)
plt.show()
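The same information is available numerically from the fitted estimator's `explained_variance_ratio_` attribute. This sketch accumulates the ratios and finds how many components are needed to reach a target; the 95% threshold is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)
pca = PCA(random_state=1).fit(X)

# Running total of the variance explained by the first 1, 2, 3, ... components.
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio reaches the target.
target = 0.95
n_components_needed = int(np.searchsorted(cumulative, target) + 1)
```

This number can then be passed as `n_components` when refitting PCA for dimensionality reduction.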
PCA 2d Projection
Plots the two-dimensional PCA projection of a given dataset.
X, y = load_digits(return_X_y=True)
pca = PCA(random_state=1)
pca.fit(X)
skplt.decomposition.plot_pca_2d_projection(pca, X, y)
plt.show()