Anton Akhmerov
2015-07-27 20:10:39 UTC
Hi everyone,
I have encountered an initially rather confusing problem in a piece of
code that attempted to symmetrize a matrix: `h += h.T`
The problem of course appears due to `h.T` being a view of `h`, and
some elements being overwritten during the __iadd__ call.
What made it nasty is that it appeared in code that was already
covered by tests, but due to the way __iadd__ works internally, the
result only became wrong for matrix sizes > 90. Even worse, it was
actually `h += h.T.conj()`, but h was real so conj() was also
returning a view.
I imagine this was already discussed before (also Charles Harris in
https://github.com/numpy/numpy/issues/6119 indicated the issue is
known), but I would like to still reopen the question since I was
unable to find previous iterations of this discussion.
First of all does anyone know of a simple way to test for this issue?
Specifically I'm concerned about all in-place array modifications that
rely on the data being modified. I considered monkey-patching just to
figure out where the problem appears, but numpy.ndarray isn't written
in Python so I'm not sure it can be monkey-patched.
Secondly, is there no way to automatically check for this behavior in
numpy itself? I'm not sure how reliable it is, but would adding
np.may_share_memory be a reasonably reliable and fast check?
This behavior very much c-like, and in my case it was only discovered
by accident in production code that's been out for a couple of years.
Best,
Anton
I have encountered an initially rather confusing problem in a piece of
code that attempted to symmetrize a matrix: `h += h.T`
The problem of course appears due to `h.T` being a view of `h`, and
some elements being overwritten during the __iadd__ call.
What made it nasty is that it appeared in code that was already
covered by tests, but due to the way __iadd__ works internally, the
result only became wrong for matrix sizes > 90. Even worse, it was
actually `h += h.T.conj()`, but h was real so conj() was also
returning a view.
I imagine this was already discussed before (also Charles Harris in
https://github.com/numpy/numpy/issues/6119 indicated the issue is
known), but I would like to still reopen the question since I was
unable to find previous iterations of this discussion.
First of all does anyone know of a simple way to test for this issue?
Specifically I'm concerned about all in-place array modifications that
rely on the data being modified. I considered monkey-patching just to
figure out where the problem appears, but numpy.ndarray isn't written
in Python so I'm not sure it can be monkey-patched.
Secondly, is there no way to automatically check for this behavior in
numpy itself? I'm not sure how reliable it is, but would adding
np.may_share_memory be a reasonably reliable and fast check?
This behavior very much c-like, and in my case it was only discovered
by accident in production code that's been out for a couple of years.
Best,
Anton