Code

Suppose I have:

import numpy
import pickle


class Test():
    def __init__(self):
        self.base = numpy.zeros(6)
        self.view = self.base[-3:]

    def __len__(self):
        return len(self.view)

    def update(self):
        self.view[0] += 1

    def add(self):
        self.view = self.base[-len(self.view) - 1:]
        self.view[0] = 1

    def __repr__(self):
        return str(self.view)


def serialize_data():
    data = Test()
    return pickle.dumps(data)

Note that Test is a class that holds a view of a NumPy array, base. The view is a slice of the last N elements of base (N == 3 on initialization).

Test has a method update() which adds 1 to the value at position 0 of the view, and a method add() which modifies the view size (N = N + 1) and sets the value at position 0 to 1.

The function serialize_data simply creates a Test() instance and then returns the serialized object using pickle.

Behavior

If I create an instance locally, call update() twice and then add() once, everything works as expected:

# Local variable
test = Test()
print(test)    # [ 0.  0.  0.]

test.update()
test.update()
print(test)    # [ 2.  0.  0.]

test.add()
print(test)    # [ 1.  2.  0.  0.]

Now, if I create the instance from serialized data instead, the value 2 (set by calling update() twice) seems to be lost after calling add():

# Serialized variable
data = pickle.loads(serialize_data())
print(data)    # [ 0.  0.  0.]

data.update()
data.update()
print(data)    # [ 2.  0.  0.]

data.add()
print(data)    # [ 1.  0.  0.  0.]  <----  This should be [ 1. 2. 0. 0. ] !!!

Question

Why is this happening and how could I avoid this behavior?

  • The problem is that after pickling/unpickling the view is no longer a view into base but has its own copy of the data. See here; unfortunately, there is no answer there on how to prevent this. Commented Feb 2, 2016 at 10:37
  • @kazemakase: with that information I can work-around the problem for my particular use-case. I will try to implement it and answer my own question with the solution (in case it is valid for others in the future). Thank you! :-) PS: please, consider adding your answer as well so I can accept it. Commented Feb 2, 2016 at 10:59
  • I think my comment is a bit meager to qualify for an answer. However, I found a workaround for your particular problem, which I'll post in a minute :) Commented Feb 2, 2016 at 12:05
  • @kazemakase: I think it was good enough! :-D With the __getstate__ and __setstate__ reference links it is better, of course. However, the actual implementation varies depending on the use-case (the case I posted was not the real case I was working with). I will accept your answer within the next 24 hours. I like to keep questions open for a couple of hours in case someone else jumps in with a different approach. ;-) Thanks again! Commented Feb 2, 2016 at 12:22
  • Don't worry about accepting. This is an interesting problem and I'd like to know if there are other ways to solve it. So, just remember to post your own solution if you have a different one :) Commented Feb 2, 2016 at 12:26

1 Answer

The problem is that after pickling/unpickling the view is no longer a view into base but has its own copy of the data. See here; unfortunately, there is no answer there on how to prevent this.
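To see the effect in isolation, here is a minimal sketch that skips the class and pickles a base array together with a view of it. The names base2 and view2 are only for illustration; numpy.shares_memory is used to check whether two arrays overlap in memory.

```python
import pickle
import numpy as np

base = np.zeros(6)
view = base[-3:]
print(np.shares_memory(base, view))    # True: the view looks into base

# Pickle both together, as happens when they are attributes of one object.
base2, view2 = pickle.loads(pickle.dumps((base, view)))
print(np.shares_memory(base2, view2))  # False: each array got its own data

# Writing through the restored "view" no longer reaches the restored base.
view2[0] = 2
print(base2[-3])                       # 0.0
```

This matches the behavior in the question: update() writes into the detached copy, and add() then takes a fresh slice of a base that never saw those writes.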

This particular problem can be overcome by giving the class __getstate__ and __setstate__ methods, so that the view is rebuilt from base after unpickling.

In addition to the view it is necessary to track which part of the base the view looks into. I've chosen to use a slice object, but there are other ways. It is not necessary to pickle the view itself, since it will be reconstructed from the slice upon unpickling.

class Test():
    def __init__(self):
        self.base = numpy.zeros(6)
        self.slice = slice(-3, self.base.size)
        self.view = self.base[self.slice]

    def __len__(self):
        return len(self.view)

    def update(self):
        self.view[0] += 1

    def add(self):
        self.slice = slice(-len(self.view) - 1, self.base.size)
        self.view = self.base[self.slice]
        self.view[0] = 1

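    def __getstate__(self):
        # Pickle only the base array and the slice; the view is left out.
        return {'base': self.base, 'slice': self.slice}

    def __setstate__(self, state):
        self.base = state['base']
        self.slice = state['slice']
        # Rebuild the view so it points into the unpickled base again.
        self.view = self.base[self.slice]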

    def __repr__(self):
        return str(self.view)
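As one of the "other ways" mentioned above, the sketch below stores only the length of the view instead of a slice object and runs the round trip from the question. TailView and the 'n' state key are hypothetical names chosen for this example, not part of the original code.

```python
import pickle
import numpy as np


class TailView:
    """Hypothetical variant of Test that pickles base plus the view length."""

    def __init__(self):
        self.base = np.zeros(6)
        self.view = self.base[-3:]

    def update(self):
        self.view[0] += 1

    def add(self):
        self.view = self.base[-len(self.view) - 1:]
        self.view[0] = 1

    def __getstate__(self):
        # Leave the view out; its length is enough to rebuild it.
        return {'base': self.base, 'n': len(self.view)}

    def __setstate__(self, state):
        self.base = state['base']
        # Re-slice so the view points into the unpickled base again.
        self.view = self.base[self.base.size - state['n']:]

    def __repr__(self):
        return str(self.view)


data = pickle.loads(pickle.dumps(TailView()))
data.update()
data.update()
data.add()
print(data.view.tolist())                      # [1.0, 2.0, 0.0, 0.0]
print(np.shares_memory(data.base, data.view))  # True
```

Computing the start index as base.size minus the length, rather than using a negative index, keeps the slice correct even if the view were ever empty.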