Discussion:
[Numpy-discussion] Numpy helper function for __getitem__?
Fabien
2015-08-23 17:54:29 UTC
Folks,

My search engine was not able to help me on this one, possibly because I
don't know exactly *what* I am looking for.

I need to override __getitem__ for a class that wraps a numpy array. I
know the dimensions of my array (which can vary from instance to
instance), and I know what I want to do: for one preselected dimension,
I need to select a different slice from the one the user requested, do
something with the data, and return the variable.

I am looking for a function that helps me to "clean" the input of
__getitem__. There are so many possible cases, when the user uses [:] or
[..., 1:2] or [0, ..., :] and so forth. But all these cases have an
equivalent index array of len(ndimensions) with only valid slice()
objects in it. This array would be much easier for me to work with.

in pseudo code:

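    def __getitem__(self, item):
        # clean input (np.clean_item is the hypothetical helper I am looking for)
        item = np.clean_item(item, ndimensions=4)
        # OK, now item is guaranteed to be of len 4
        item[2] = slice()  # placeholder for the slice I actually need
        # Continue
        # etc.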

Is there such a function in numpy?

I hope I have been clear enough... Thanks a lot!

Fabien
Stephan Hoyer
2015-08-23 18:08:02 UTC
I don't think NumPy has a function like this (at least, not exposed to Python), but I wrote one for xray, "expanded_indexer", that you are welcome to borrow:
https://github.com/xray/xray/blob/v0.6.0/xray/core/indexing.py#L10
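
For readers who cannot open the linked file, the general idea of such a helper can be sketched as below. This is only an illustration written for this thread, not the xray implementation: the name expand_index is hypothetical, and the sketch assumes the key contains only integers, slices and at most one Ellipsis.

```python
def expand_index(key, ndim):
    """Expand `key` into a tuple with exactly `ndim` entries.

    Hypothetical helper: supports integers, slices and a single Ellipsis.
    """
    if not isinstance(key, tuple):
        key = (key,)
    if sum(k is Ellipsis for k in key) > 1:
        raise IndexError("only one Ellipsis is supported")
    # Number of entries that address an actual dimension.
    n_given = sum(k is not Ellipsis for k in key)
    if n_given > ndim:
        raise IndexError("too many indices for %d dimensions" % ndim)
    full = slice(None)
    out = []
    for k in key:
        if k is Ellipsis:
            # The Ellipsis stands for all dimensions not named explicitly.
            out.extend([full] * (ndim - n_given))
        else:
            out.append(k)
    # Without an Ellipsis, missing trailing dimensions are taken in full.
    out.extend([full] * (ndim - len(out)))
    return tuple(out)
```

With ndim=4, both [:] and [..., 1:2] and [0, ..., :] then come back as tuples of length 4, which can be turned into a list and modified one position at a time.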

Stephan

_______________________________________________
NumPy-Discussion mailing list
NumPy-***@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Fabien
2015-08-23 21:24:06 UTC
Post by Stephan Hoyer
I don't think NumPy has a function like this (at least, not exposed to
Python), but I wrote one for xray, "expanded_indexer", that you are
welcome to borrow:
Hi Stephan,

that's perfect, thanks!

Fabien
Sebastian Berg
2015-08-24 08:23:22 UTC
Post by Stephan Hoyer
I don't think NumPy has a function like this (at least, not exposed to
Python), but I wrote one for xray, "expanded_indexer", that you are
welcome to borrow:
https://github.com/xray/xray/blob/v0.6.0/xray/core/indexing.py#L10
Yeah, we have no such functionality. We do have a function which does
all of this in C, but it is somewhat more complex and not exposed in any
case.
That function seems nice, though on its own not complete? It does not
seem to handle `np.newaxis`/`None` or boolean indexing arrays well.
One other thing, which is not really important: we are deprecating the
use of multiple ellipses.

Fabien, just to make sure you are aware. If you are overriding
`__getitem__`, you should also implement `__setitem__`. NumPy does some
magic if you do not. That will seem to make `__setitem__` work fine, but
breaks down if you have advanced indexing involved (or if you return
copies, though it spits warnings in that case).

- Sebastian
Fabien
2015-08-25 15:41:36 UTC
Post by Sebastian Berg
Fabien, just to make sure you are aware. If you are overriding
`__getitem__`, you should also implement `__setitem__`. NumPy does some
magic if you do not. That will seem to make `__setitem__` work fine, but
breaks down if you have advanced indexing involved (or if you return
copies, though it spits warnings in that case).
Hi Sebastian,

thanks for the info. I am writing a duck-typed NetCDF4 Variable object,
and therefore I am not trying to override NumPy arrays.

I think that Stephan's function for xray is very useful. A possible
improvement (probably at a certain performance cost) would be to be able
to provide a shape instead of a number of dimensions. The output would
then be slices with valid starts and ends.

Current behavior:
In[9]: expanded_indexer(slice(None), 2)
Out[9]: (slice(None, None, None), slice(None, None, None))

With shape (proposed):
In[10]: expanded_indexer(slice(None), (3, 4))
Out[10]: (slice(0, 3, 1), slice(0, 4, 1))

But if nobody needed something like this before me, I think that I might
have a design problem in my code (still quite new to Python).

Cheers and thanks,

Fabien
Stephan Hoyer
2015-08-26 17:59:32 UTC
Indeed, the helper function I wrote for xray was not designed to handle
None/np.newaxis or non-1d Boolean indexers, because those are not valid
indexers for xray objects. I think it could be straightforwardly extended
to handle None simply by not counting them towards the total number of
dimensions.
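
The extension described above (not counting None towards the number of dimensions) might look roughly like this. It is a sketch under stated assumptions, not the xray code: the function name is hypothetical, and only integers, slices, None and a single Ellipsis are considered.

```python
def expand_index_with_newaxis(key, ndim):
    """Expand `key` to cover `ndim` existing dimensions, keeping None entries.

    Hypothetical helper: None (np.newaxis) inserts a new axis in the result
    and therefore does not consume one of the `ndim` existing dimensions.
    """
    if not isinstance(key, tuple):
        key = (key,)
    if sum(k is Ellipsis for k in key) > 1:
        raise IndexError("only one Ellipsis is supported")
    # Count only the entries that address an existing dimension.
    n_consuming = sum(k is not Ellipsis and k is not None for k in key)
    if n_consuming > ndim:
        raise IndexError("too many indices for %d dimensions" % ndim)
    full = slice(None)
    out = []
    for k in key:
        if k is Ellipsis:
            out.extend([full] * (ndim - n_consuming))
        else:
            out.append(k)
    # Pad trailing dimensions, again ignoring the None entries.
    n_missing = ndim - sum(k is not None for k in out)
    out.extend([full] * n_missing)
    return tuple(out)
```

The returned tuple has ndim entries plus one entry per None, so a caller that wants to modify "dimension 2" has to skip over the None entries when locating it.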
Post by Fabien
I think that Stephan's function for xray is very useful. A possible
improvement (probably at a certain performance cost) would be to be able
to provide a shape instead of a number of dimensions. The output would
then be slices with valid start and ends.
Current behavior:
In[9]: expanded_indexer(slice(None), 2)
Out[9]: (slice(None, None, None), slice(None, None, None))
With shape (proposed):
In[10]: expanded_indexer(slice(None), (3, 4))
Out[10]: (slice(0, 3, 1), slice(0, 4, 1))
But if nobody needed something like this before me, I think that I might
have a design problem in my code (still quite new to python).
Glad you found it helpful!

Python's slice object has the indices method which implements this logic,
e.g.,

In [15]: s = slice(None, 10)

In [16]: s.indices(100)
Out[16]: (0, 10, 1)
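
Building on slice.indices, the shape-aware variant asked about earlier in the thread could be sketched as follows. The function name concrete_slices is hypothetical, and the sketch assumes the key has already been expanded to one slice per dimension. Negative steps are rejected on purpose: slice.indices can return a stop of -1 for them, which is meant for range() and would be read as "last element" if put back into a slice.

```python
def concrete_slices(key, shape):
    """Turn a tuple of slices into slices with explicit start, stop and step.

    Hypothetical helper: `key` must hold exactly one slice per entry of `shape`.
    """
    if len(key) != len(shape):
        raise ValueError("key and shape must have the same length")
    out = []
    for k, n in zip(key, shape):
        if not isinstance(k, slice):
            raise TypeError("only slices are supported in this sketch")
        start, stop, step = k.indices(n)
        if step < 0:
            # See the note above: the result would not round-trip as a slice.
            raise NotImplementedError("negative steps are not handled")
        out.append(slice(start, stop, step))
    return tuple(out)
```

For example, concrete_slices((slice(None), slice(None)), (3, 4)) gives (slice(0, 3, 1), slice(0, 4, 1)).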

Cheers,
Stephan
Phil Elson
2015-08-29 07:55:41 UTC
Biggus also has such a function:
https://github.com/SciTools/biggus/blob/master/biggus/__init__.py#L2878
It handles newaxis outside of that function in:
https://github.com/SciTools/biggus/blob/master/biggus/__init__.py#L537.

Again, it only aims to deal with orthogonal array indexing, not numpy fancy
indexing.

I'd be surprised if Dask.array didn't have a similar function too.


HTH
Chris Barker
2015-09-02 16:45:48 UTC
Post by Phil Elson
https://github.com/SciTools/biggus/blob/master/biggus/__init__.py#L2878
https://github.com/SciTools/biggus/blob/master/biggus/__init__.py#L537.
Again, it only aims to deal with orthogonal array indexing, not numpy
fancy indexing.
I'd be surprised if Dask.array didn't have a similar function too.
This all indicates to me that this would be a great thing to have as a
standalone project, or a utility shipped with numpy.

It's been said that you really don't want to subclass ndarray, and should
rather wrap and delegate (or duck-type) -- maybe this is a good time to
provide utilities to make it easier to do so.
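
As a rough illustration of the wrap-and-delegate approach applied to the original question, here is a minimal sketch. Everything in it is an assumption made for the example: the class name, its constructor arguments, and the restriction to integers, slices and one Ellipsis. The wrapped object can be anything that accepts a tuple index, such as a numpy array. Both __getitem__ and __setitem__ are defined so that reads and writes go through the same key handling.

```python
class FixedAxisWrapper:
    """Hypothetical wrapper that forces one axis to a fixed selection."""

    def __init__(self, data, ndim, axis, selection):
        self._data = data            # wrapped object, e.g. a numpy array
        self._ndim = ndim            # number of dimensions of the data
        self._axis = axis            # the preselected dimension
        self._selection = selection  # what to select on that dimension

    def _full_key(self, key):
        if not isinstance(key, tuple):
            key = (key,)
        n_given = sum(k is not Ellipsis for k in key)
        if len(key) - n_given > 1 or n_given > self._ndim:
            raise IndexError("unsupported index: %r" % (key,))
        full = slice(None)
        out = []
        for k in key:
            if k is Ellipsis:
                out.extend([full] * (self._ndim - n_given))
            else:
                out.append(k)
        out.extend([full] * (self._ndim - len(out)))
        # Whatever the user asked for on this axis is replaced.
        out[self._axis] = self._selection
        return tuple(out)

    def __getitem__(self, key):
        return self._data[self._full_key(key)]

    def __setitem__(self, key, value):
        self._data[self._full_key(key)] = value
```

Note that the user's own selection on the fixed axis is silently discarded here; a real class would probably want to validate or combine it instead.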

-Chris

-----

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

***@noaa.gov
