I want to serialize a class instance in python and keep methods persistent. I have tried with joblib and pickle and am really close with dill, but can't quite get it.

Here is the problem. Say I want to pickle a class instance like so:

import dill

class Test():
    def __init__(self, value=10):
        self.value = value
    def foo(self):
        print(f"Bar! Value is: {self.value}")

t = Test(value=20)
with open('Test.pkl', 'wb+') as fp:
    dill.dump(t, fp)

# Test it
print('Original: ')
t.foo()        # Prints "Bar! Value is: 20"

Later, the definition of Test changes and when I reload my pickled object the method is different:

class Test():
    def __init__(self, value=10):
        self.value = value
    def foo(self):
        print("...not bar?")

with open('Test.pkl', 'rb') as fp:
    t2 = dill.load(fp)

# Test it
print('Reloaded: ')
t2.foo()        # Prints "...not bar?"

Now in the reloaded case, the attribute value is preserved (t2.value is 20). I can get really close to what I want by serializing the class with dill and not the instance, like so:

class Test():
    def __init__(self, value=10):
        self.value = value
    def foo(self):
        print(f"Bar! Value is: {self.value}")

t = Test(value=20)
with open('Test.pkl', 'wb+') as fp:
    dill.dump(Test, fp)

# Test it
print('Original: ')
t.foo()        # Prints "Bar! Value is: 20"

But then when I rebuild it, I get the old method (what I want) but I lose the attributes of the instance t (in this case I get the default value of 10 instead of the instance value of 20):

class Test():
    def __init__(self, value=10):
        self.value = value
    def foo(self):
        print("...not bar?")

with open('Test.pkl', 'rb') as fp:
    test_class = dill.load(fp)
    t2 = test_class()

# Test it
print('Reloaded: ')
t2.foo()        # Prints "Bar! Value is: 10"

In my actual use case, I have a lot of attributes in the class instance. I want to be able to pickle the attributes as well as the methods so that later source code changes don't make that particular object un-recoverable.

Currently to recover these objects I am copying source code files but the imports get very messy--a lot of sys.path manipulations that get confusing to make sure I load the correct old source code. I could also do something where I pickle the class definition with dill and then save all the attributes to json or something and rebuild that way, but I'm wondering if there is an easy way to do this with dill or some other package that I have not yet discovered. Seems like a straightforward use case to me.
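
For reference, the fallback mentioned above (class definition stored with dill, attributes stored separately as JSON) might look roughly like the sketch below. The helper names `dump_attributes` and `rebuild` are hypothetical, the sketch assumes every attribute is JSON-serializable, and it deliberately bypasses `__init__` when rebuilding.

```python
import json

class Test:
    def __init__(self, value=10):
        self.value = value
    def foo(self):
        return f"Bar! Value is: {self.value}"

def dump_attributes(obj) -> str:
    """Serialize the instance's __dict__ (hypothetical helper).

    Assumes every attribute value is JSON-serializable.
    """
    return json.dumps(vars(obj))

def rebuild(cls, payload: str):
    """Create an instance of cls without calling __init__, then restore its attributes."""
    obj = cls.__new__(cls)
    obj.__dict__.update(json.loads(payload))
    return obj

t = Test(value=20)
payload = dump_attributes(t)

# In the approach described above, the class passed here would be the object
# returned by dill.load(...) rather than the one defined in this file.
t2 = rebuild(Test, payload)
print(t2.foo())  # Bar! Value is: 20
```

Attributes that JSON cannot represent (nested objects, arrays, open handles) would need their own encoding, which is where this route gets more involved than a single pickle.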

Asked Nov 22, 2024 at 19:06 by thehumaneraser
  • This will need a reliable system: around 100 LoC using both dill and some other strategy, plus some testing to make sure it is working correctly. Not sure I can give you a proper answer in my spare time right now, but maybe this can put you on the correct track. – jsbueno Commented Nov 22, 2024 at 19:36
  • If this worked, you would have some older versions of classes interacting with newer code in hard-to-predict ways. – Michael Butscher Commented Nov 22, 2024 at 20:00
  • That is exactly what I need. There are frequent small changes to a class definition and I want to be able to load a pickled old version from file. I have saved old source code where appropriate but manipulating the imports is much more confusing than loading the whole thing from a file – thehumaneraser Commented Nov 25, 2024 at 22:13

2 Answers

I'm the dill author. dill serializes the class definition with the instance, so you don't need to do it yourself. However, by default, if the class has been updated since the object was pickled, dill uses the updated definition on load. If you want to ignore the updated definition (and thus use the stored one), pass the ignore keyword when loading.

>>> import dill
>>> 
>>> class Test():
...     def __init__(self, value=10):
...         self.value = value
...     def foo(self):
...         print(f"Bar! Value is: {self.value}")
... 
>>> t = Test(value=20)
>>> s = dill.dumps(t)
>>> t.foo()
Bar! Value is: 20
>>> 
>>> class Test():
...     def __init__(self, value=10):
...         self.value = value
...     def foo(self):
...         print("...not bar?")
... 
>>> t2 = dill.loads(s, ignore=True)
>>> t2.foo()
Bar! Value is: 20

Is this what you are looking for?
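
The same keyword should carry over to the file-based workflow from the question. The sketch below is an assumption-laden adaptation rather than part of the session above: it writes to a temporary directory (the path is purely illustrative) and simulates the later source change by redefining the class in the same script.

```python
import os
import tempfile

import dill

class Test:
    def __init__(self, value=10):
        self.value = value
    def foo(self):
        print(f"Bar! Value is: {self.value}")

# Illustrative location only; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "Test.pkl")

with open(path, "wb") as fp:
    dill.dump(Test(value=20), fp)

# Simulate a later source change by redefining the class.
class Test:
    def __init__(self, value=10):
        self.value = value
    def foo(self):
        print("...not bar?")

with open(path, "rb") as fp:
    current = dill.load(fp)               # default: use the updated definition
with open(path, "rb") as fp:
    stored = dill.load(fp, ignore=True)   # keep the definition saved in the file

current.foo()
stored.foo()
print(current.value, stored.value)
```

When this is run as a script, so that Test lives in `__main__`, the first call is expected to use the new method and the second the stored one; the attribute value is 20 in both cases.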

dill also provides dill.settings (the same options are accessible through dump and load), which changes how objects are stored and loaded. recurse=True gives behavior similar to cloudpickle, while byref=True gives behavior similar to pickle.
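
A minimal sketch of both ways to set these options, per call and globally, is shown below. It assumes a class defined in a script (`__main__`); the size comparison is only indicative and no particular numbers are implied.

```python
import dill

class Test:
    def __init__(self, value=10):
        self.value = value
    def foo(self):
        return f"Bar! Value is: {self.value}"

t = Test(value=20)

# Per call: pass the option as a keyword argument.
by_value = dill.dumps(t)             # default, byref=False
by_ref = dill.dumps(t, byref=True)   # store only a reference to the class, as pickle does

# For a class defined in __main__, the by-value payload carries the method
# code, so it is expected to be the larger of the two.
print(len(by_value), len(by_ref))

# Globally: dill.settings is a dict that is consulted when a keyword is omitted.
previous = dill.settings["recurse"]
dill.settings["recurse"] = True
try:
    recursive = dill.dumps(t)
finally:
    dill.settings["recurse"] = previous   # restore so other code is unaffected

print(dill.loads(recursive).value)
```

Passing the keyword per call keeps the choice local to one dump or load, whereas changing dill.settings affects every later call in the process, which is why the sketch restores the previous value.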

edit: Maybe this would be relevant: How to pickle a python function with its dependencies?

  • You could try making that work with cloudpickle; the example below achieved what I believe you're asking for.
  • I would strongly suggest writing some tests that check the exact functionality you want to achieve before using it.
  • They mention on their GitHub that this can only be used to send objects between the exact same version of Python.

import cloudpickle

# Initial class
class Test:
    def __init__(self, value=10):
        self.value = value
    def foo(self):
        print(f"Bar! Value is: {self.value}")

t = Test(value=20)
with open('test.pkl', 'wb') as f:
    cloudpickle.dump(t, f)

# Re-definition of class
class Test:
    def __init__(self, value=10):
        self.value = value
    def foo(self):
        print("...not bar?")

# Load original instance
with open('test.pkl', 'rb') as f:
    t2 = cloudpickle.load(f)

t2.foo()
# Bar! Value is: 20
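
Given the same-version caveat above, one possible guard is to record the interpreter version next to the payload and check it on load. The sketch below is a hypothetical helper pair (`wrap` and `unwrap` are not part of cloudpickle), it treats "same version" as matching major, minor, and micro numbers, which is an assumption, and it uses a plain pickle payload as a stand-in so that it runs with the standard library alone.

```python
import pickle
import sys

def wrap(payload: bytes) -> bytes:
    """Bundle a serialized payload with the Python version that produced it."""
    return pickle.dumps({"python": tuple(sys.version_info[:3]), "payload": payload})

def unwrap(blob: bytes) -> bytes:
    """Return the payload, refusing to continue if the Python version differs."""
    envelope = pickle.loads(blob)
    saved = tuple(envelope["python"])
    running = tuple(sys.version_info[:3])
    if saved != running:
        raise RuntimeError(f"payload written by Python {saved}, running {running}")
    return envelope["payload"]

# Stand-in payload; in practice this would be the bytes from cloudpickle.dumps(t).
payload = pickle.dumps({"value": 20})
blob = wrap(payload)

restored = pickle.loads(unwrap(blob))
print(restored)  # {'value': 20}
```

The envelope itself contains only a tuple and bytes, so it can be read by any Python 3 interpreter before the version-sensitive payload is touched.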

Tags: How to pickle a class instance with persistent methods in Python, Stack Overflow