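It seems pickle keeps track of references for basic Python types.
import pickle
x = [1]
y = [x]
# pickle both in one call so the shared reference is recorded
x,y = pickle.loads(pickle.dumps((x,y)))
x.append(2)
print(y)  # [[1, 2]]: y still refers to the same list as x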
NumPy arrays are different: references are forgotten after
pickle/unpickle, so shared objects do not remain shared. Based on the quote
below, this could be considered a bug in numpy/pickle.
Object sharing (references to the same object in different places): This is
similar to self-referencing objects; pickle stores the object once, and
ensures that all other references point to the master copy. Shared objects
remain shared, which can be very important for mutable objects. link
<https://docs.python.org/2.0/lib/module-pickle.html>
Another example with ndarrays:
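import pickle
import numpy as np
x = np.arange(5)
y = x[::-1]  # y is a view onto x's data
x, y = pickle.loads(pickle.dumps((x, y)))
x[0] = 9
print(y)  # [4 3 2 1 0]: the change to x is not visible through y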
In this case the two arrays share the exact same object for the data buffer
before pickling (although "object" might not be the right word here), but
after the round trip y no longer sees changes made to x.
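A quick way to see the lost link directly is np.shares_memory, checked before and after the round trip. This is only a small sketch of the example above; the names x2, y2, shared_before and shared_after are mine.

```python
import pickle

import numpy as np

x = np.arange(5)
y = x[::-1]                       # y is a reversed view onto x's buffer
shared_before = np.shares_memory(x, y)

# Pickle both arrays in a single call, as in the example above.
x2, y2 = pickle.loads(pickle.dumps((x, y)))
shared_after = np.shares_memory(x2, y2)

x2[0] = 9                         # would change y2[-1] if y2 were still a view
print(shared_before, shared_after)
print(y2)
```

If the sharing were preserved, the last element of y2 would become 9.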
Post by Robert Kern
Post by Stephan Hoyer
base = np.zeros(100000000)
view = base[:10]
# case 1
pickle.dump(view, file)
# case 2
pickle.dump(base, file)
pickle.dump(view, file)
# case 3
pickle.dump(view, file)
pickle.dump(base, file)
?
I see what you're getting at here. We would need a rule for when to
include the base in the pickle and when not to. Otherwise,
pickle.dump(view, file) always contains data from the base pickle, even
when view is much smaller than base.
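For comparison, here is a rough sketch of what happens today with case 1: pickling a small view stores only the view's own elements, not the whole base. The base here is deliberately smaller than in the example above so it runs quickly, and size_view and size_base are names I made up.

```python
import pickle

import numpy as np

base = np.zeros(1_000_000)        # smaller than the thread's example
view = base[:10]

# The pickle of the view holds 10 floats plus a small header,
# while the pickle of the base holds all of its data.
size_view = len(pickle.dumps(view))
size_base = len(pickle.dumps(base))
print(size_view, size_base)
```

Any rule that pulls the base into the pickle of a view would give up this property unless the base is being written anyway.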
Post by Stephan Hoyer
The safe answer is "only use views in the pickle when base is already
being pickled", but that isn't possible to check unless all the arrays are
together in a custom container. So, this isn't really feasible for NumPy.
It would be possible with a custom Pickler/Unpickler since they already
keep track of objects previously (un)pickled. That would handle [base,
view] okay but not [view, base], so it's probably not going to be all that
useful outside of special situations. It would make a neat recipe, but I
probably would not provide it in numpy itself.
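To make the idea concrete, here is a minimal sketch of such a recipe, not something NumPy provides. It assumes Python 3.8+ (for Pickler.reducer_override), plain np.ndarray objects, and a C-contiguous base. ViewAwarePickler, dumps and the _seen dict are hypothetical names. The view is rebuilt on load by calling np.ndarray(shape, dtype, buffer, offset, strides) on the already-unpickled base, so the stock Unpickler is enough.

```python
import io
import pickle

import numpy as np


class ViewAwarePickler(pickle.Pickler):
    """Hypothetical recipe: keep a view tied to its base, but only when
    the base has already been written by this same pickler."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # id(array) -> array; holding the array keeps its id from being reused.
        self._seen = {}

    def reducer_override(self, obj):
        if type(obj) is not np.ndarray:
            return NotImplemented
        base = obj.base
        if (type(base) is np.ndarray and id(base) in self._seen
                and base.flags.c_contiguous):
            # Byte offset of the view's first element inside the base buffer.
            offset = (obj.__array_interface__["data"][0]
                      - base.__array_interface__["data"][0])
            # base is already in the pickler's memo, so only a reference
            # to it is written here, not a second copy of its data.
            return np.ndarray, (obj.shape, obj.dtype, base, offset, obj.strides)
        self._seen[id(obj)] = obj
        return NotImplemented     # fall back to NumPy's normal pickling


def dumps(obj):
    buf = io.BytesIO()
    ViewAwarePickler(buf).dump(obj)
    return buf.getvalue()


base = np.zeros(1000)
view = base[10:20]

# [base, view]: the view is rebuilt on top of the restored base.
b2, v2 = pickle.loads(dumps([base, view]))
print(np.shares_memory(b2, v2))

# [view, base]: the base has not been seen yet, so the view is a plain copy.
v3, b3 = pickle.loads(dumps([view, base]))
print(np.shares_memory(b3, v3))
```

As noted above, only the [base, view] ordering keeps the link. The sketch also ignores array subclasses and the read-only flag of the view.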
--
Robert Kern