[Numpy-discussion] Behavior of numpy.copy with sub-classes

Discussion:

Jonathan Helmus

2015-10-20 02:23:18 UTC

In GitHub issue #3474, a number of us have started a conversation on how
NumPy's copy function should behave when passed an instance which is a
sub-class of the array class. Specifically, the issue began by noting
that when a MaskedArray is passed to np.copy, the sub-class is not
passed through; a plain ndarray is returned instead.

I suggested adding a "subok" parameter which controls how sub-classes
are handled and others suggested having the function call a copy method
on duck arrays. The "subok" parameter is implemented in PR #6509 as an
example. Both of these options would change the API of numpy.copy and
possibly break backwards compatibility. Do others have an opinion on
how np.copy should handle sub-classes?

For a concrete example of this behavior and possible changes, what type
should copy_x be in the following snippet:

import numpy as np
x = np.ma.array([1,2,3])
copy_x = np.copy(x)
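
To make the question concrete, here is a minimal sketch comparing the
function form, the method form, and a sub-class-preserving copy. It uses
np.array(..., subok=True) as a stand-in for the proposed "subok" behavior;
that substitution and the mask values are assumptions for illustration,
not the code from PR #6509.

```python
import numpy as np

# A masked array with one masked element (mask chosen for illustration).
x = np.ma.array([1, 2, 3], mask=[False, True, False])

func_copy = np.copy(x)   # function form: sub-class is dropped
meth_copy = x.copy()     # method form: sub-class is kept

# Stand-in for the proposed "subok" option: np.array already accepts
# subok=True and passes sub-classes through when copying.
sub_copy = np.array(x, copy=True, subok=True)

print(type(func_copy).__name__)  # ndarray
print(type(meth_copy).__name__)  # MaskedArray
print(type(sub_copy).__name__)   # MaskedArray
```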

Cheers,

- Jonathan Helmus

Nathan Goldbaum

2015-10-20 02:28:26 UTC

FWIW, it looks like np.copy() is never used in our code to work with the
ndarray subclass we maintain in yt. Instead we use the copy() method much
more often, and that returns the appropriate type. I guess it makes sense
to have the type of the return value of np.copy() agree with the type of
the copy() member function.
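
As a rough illustration of that mismatch for a user-defined sub-class,
here is a sketch with a toy class. UnitArray and its units attribute are
hypothetical names made up for this example; this is not the yt
implementation.

```python
import numpy as np


class UnitArray(np.ndarray):
    """Hypothetical ndarray sub-class that carries a `units` attribute."""

    def __new__(cls, data, units=""):
        obj = np.asarray(data).view(cls)
        obj.units = units
        return obj

    def __array_finalize__(self, obj):
        # Called for views and copies; inherit units from the source array.
        self.units = getattr(obj, "units", "")


d = UnitArray([1.0, 2.0], units="cm")

m = d.copy()     # method: returns a UnitArray, units carried along
f = np.copy(d)   # function: returns a plain ndarray, units are gone

print(type(m).__name__, m.units)              # UnitArray cm
print(type(f).__name__, hasattr(f, "units"))  # ndarray False
```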

That said, breaking backwards compatibility here before numpy 2.0 might
very well break real code. It might be worth searching e.g. GitHub for all
instances of np.copy() to see if they're dealing with subclasses.
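
For checking a local code base rather than GitHub, something along these
lines could work. It is only a sketch: the function name
find_np_copy_calls and the assumed module aliases ("np", "numpy") are made
up for this example, and it finds call sites only. It cannot tell whether
the argument is a subclass.

```python
import ast


def find_np_copy_calls(source, module_aliases=("np", "numpy")):
    """Return the line numbers of calls such as np.copy(...) in `source`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "copy"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id in module_aliases
        ):
            hits.append(node.lineno)
    return sorted(hits)


sample = """import numpy as np
a = np.arange(3)
b = np.copy(a)
c = a.copy()
"""

# Only the function form on line 3 is reported; a.copy() is a method call.
print(find_np_copy_calls(sample))  # [3]
```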

_______________________________________________
NumPy-Discussion mailing list
https://mail.scipy.org/mailman/listinfo/numpy-discussion

Charles R Harris

2015-10-20 02:40:15 UTC

Post by Nathan Goldbaum

FWIW, it looks like np.copy() is never used in our code to work with the
ndarray subclass we maintain in yt. Instead we use the copy() method much
more often, and that returns the appropriate type. I guess it makes sense
to have the type of the return value of np.copy() agree with the type of
the copy() member function.
That said, breaking backwards compatibility here before numpy 2.0 might
very well break real code. It might be worth searching e.g. GitHub for all
instances of np.copy() to see if they're dealing with subclasses.

Benjamin Root

2015-10-20 13:28:32 UTC

In many other parts of numpy, calling a numpy function that has an
equivalent array method results in the method being called. I would
certainly be surprised if the copy() method behaved differently from the
np.copy() function.
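
A small sketch of the contrast, using np.mean as one example of a function
that defers to the method on a MaskedArray. The choice of np.mean and the
data values are assumptions picked for illustration.

```python
import numpy as np

# The last element is masked, so it should not contribute to the mean.
x = np.ma.array([1.0, 2.0, 30.0], mask=[False, False, True])

func_mean = np.mean(x)   # function form defers to MaskedArray.mean
meth_mean = x.mean()     # method form
print(func_mean, meth_mean)   # 1.5 1.5

# np.copy drops the sub-class, so the mask is lost and the masked
# value 30.0 is included again.
plain_mean = np.mean(np.copy(x))
print(plain_mean)             # 11.0
```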

Now it is time for me to do some grepping of my code-bases...

On Mon, Oct 19, 2015 at 10:40 PM, Charles R Harris wrote:

The problem with GitHub searches is that there are a ton of numpy forks.
ISTR once finding a method to avoid them, but can't remember what it was.
If anyone knows how to do that, I'd appreciate learning.
Chuck