I know there are several out-of-the-box methods for saving the model.
However, I want to save the model as a database blob.
I've seen examples where people extract coef_ etc. from the estimator's dict, but when I try this with RandomForestRegressor (for example) it says there are no coefficients.
I've seen other examples that claim all estimators have a 'save' method somewhere, but I can't get that to work.
Is there any way of getting the model data from a fitted estimator that will work universally for all estimators?
I would then base64 the model data and persist it to my database for later use. There seems to be little documentation on this beyond 'use pickle or joblib'.
1 Answer
This is a good summary of some methods of serializing sklearn models.
pickle
Works on most Python objects, and it works fine for sklearn models too. For example:
import base64
import pickle

def model_to_base64(model):
    # Pickle the fitted estimator and encode the bytes as a base64 text string
    return base64.b64encode(pickle.dumps(model)).decode('utf-8')

def base64_to_model(encoded_model):
    # Decode the base64 string and unpickle it back into an estimator
    return pickle.loads(base64.b64decode(encoded_model))
Tested with several kinds of estimators:
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.svm import SVC, SVR
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier
models = [
    RandomForestClassifier(n_estimators=10, random_state=42).fit([[1, 2], [3, 4], [5, 6]], [0, 1, 0]),
    RandomForestRegressor(n_estimators=10, random_state=42).fit([[1, 2], [3, 4], [5, 6]], [10, 20, 30]),
    LogisticRegression().fit([[1, 2], [3, 4], [5, 6]], [0, 1, 0]),
    LinearRegression().fit([[1, 2], [3, 4], [5, 6]], [10, 20, 30]),
    SVC(probability=True).fit([[1, 2], [3, 4], [5, 6]], [0, 1, 0]),
    SVR().fit([[1, 2], [3, 4], [5, 6]], [10, 20, 30]),
    KMeans(n_clusters=2, random_state=42).fit([[1, 2], [3, 4], [5, 6]]),
    DecisionTreeClassifier().fit([[1, 2], [3, 4], [5, 6]], [0, 1, 0]),
    Pipeline([
        ('scaler', StandardScaler()),
        ('classifier', LogisticRegression())
    ]).fit([[1, 2], [3, 4], [5, 6]], [0, 1, 0]),
]
for i, model in enumerate(models):
    encoded_model = model_to_base64(model)
    print(f"Model {i + 1} Serialized: {encoded_model[:100]}...")
    restored_model = base64_to_model(encoded_model)
    print(f"Model {i + 1} Restored: {restored_model.__class__.__name__}")
    assert isinstance(restored_model, model.__class__)
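To cover the database part of the question, here is a minimal sketch, assuming a SQLite database and reusing the model_to_base64 / base64_to_model helpers and the models list defined above. The models table and its column names are hypothetical, chosen only for illustration.

import sqlite3

# Hypothetical table for storing serialized estimators (names are illustrative only)
conn = sqlite3.connect("models.db")
conn.execute("CREATE TABLE IF NOT EXISTS models (name TEXT PRIMARY KEY, payload TEXT)")

# Persist: store the base64-encoded pickle as text
encoded = model_to_base64(models[0])
conn.execute("INSERT OR REPLACE INTO models (name, payload) VALUES (?, ?)", ("rf_classifier", encoded))
conn.commit()

# Load: read the text back and rebuild the estimator
row = conn.execute("SELECT payload FROM models WHERE name = ?", ("rf_classifier",)).fetchone()
restored = base64_to_model(row[0])
conn.close()

If the column is a real BLOB type, the raw pickle.dumps(model) bytes can be stored directly and the base64 step skipped. As with any pickle-based approach, only unpickle data you trust, and reload with the same scikit-learn version used to save the model, since pickles are not guaranteed to be compatible across versions.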