Discussion:
[Numpy-discussion] in1d, but preserve shape of ar1
Stephan Hoyer
2016-12-20 01:43:41 UTC
I think this is a great idea!

I agree that we need a new function. Because the new API is almost strictly
superior, we should try to pick a more general name that we can encourage
users to switch to from in1d.

Pandas calls this method "isin", which I think is a perfectly good name for
the multi-dimensional NumPy version, too:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.isin.html

It's a subjective call, but I would probably keep the new function in
arraysetops.py. (This is the sort of question well suited to GitHub rather
than the mailing list, though.)
I started an enhancement request in the GitHub bug tracker at
https://github.com/numpy/numpy/issues/8331 , but Jaime Frio recommended I
bring it to the mailing list.
`in1d` takes two arrays, `ar1` and `ar2`, and returns a 1d array with the
same number of elements as `ar1`. The logical extension would be a function
that does the same thing but returns a (possibly multi-dimensional) array
of the same shape as `ar1`. The code already has a comment suggesting this
could be done (see
https://github.com/numpy/numpy/blob/master/numpy/lib/arraysetops.py#L444 ).
I agree that changing the behavior of the existing function isn't an
option, since it would break backwards compatibility. I'm not sure adding
an option keep_shape is good, since the name of the function ("1d")
wouldn't match what it does (returns an array that might not be 1d). I
return np.in1d(ar1, ar2, **kwargs).reshape(ar1.shape)
the function returns whether each item in `ar1` is in `ar2`. Is "item" or
"element" the right term here?
* Are there any other changes that need to happen in arraysetops.py? Or
other files? I ask this because although the file says "Set operations for
`unique` recently changed to operate on multidimensional arrays, and I'm
proposing a multidimensional version of `in1d`. `ediff1d` could probably be
tweaked into a version that operates along an axis the same way unique does
now, fwiw. Mostly I want to know if I should put my code changes in this
file or somewhere else.
Thanks,
-brsr
_______________________________________________
NumPy-Discussion mailing list
https://mail.scipy.org/mailman/listinfo/numpy-discussion
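The shape-preserving behaviour proposed above can be sketched as follows. This is only an illustration under stated assumptions: the names `flat_membership` and `isin_sketch` are hypothetical, and the flat membership test is hand-written (a brute-force comparison) so that the sketch does not depend on `np.in1d` being available in the installed NumPy. The proposal in the thread simply reshapes the result of `np.in1d`.

```python
import numpy as np


def flat_membership(ar1, ar2):
    """Hypothetical stand-in for in1d: 1-D boolean array, one entry per element of ar1."""
    ar1 = np.asarray(ar1).ravel()
    ar2 = np.asarray(ar2).ravel()
    # Brute force: compare every element of ar1 with every element of ar2.
    # Uses O(len(ar1) * len(ar2)) memory, so this is for illustration only.
    return (ar1[:, None] == ar2[None, :]).any(axis=1)


def isin_sketch(ar1, ar2):
    """Same test as flat_membership, but the result keeps the shape of ar1."""
    ar1 = np.asarray(ar1)
    return flat_membership(ar1, ar2).reshape(ar1.shape)


a = np.array([[1, 2, 3], [4, 5, 6]])
mask = isin_sketch(a, [2, 5, 7])
print(mask.shape)
print(mask)
```

The only difference from the 1-D version is the final `reshape(ar1.shape)`, which is why a thin wrapper was suggested rather than a change to the existing function.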
Joseph Fox-Rabinovitz
2016-12-20 13:08:51 UTC
Perhaps you could move the code from in1d to your new function and redefine
in1d in terms of it? That may help encourage migration and also make
deprecation easier down the line.

-Joe
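One way to read this suggestion is sketched below: the membership logic lives in the general, shape-preserving function, and the 1-D function becomes a thin wrapper that flattens the result. The names `isin_sketch` and `in1d_sketch` are hypothetical, and the sort-and-search implementation is an assumption made for illustration, not the code in arraysetops.py. Only the `invert` keyword of `in1d` is mirrored here.

```python
import numpy as np


def isin_sketch(ar1, ar2, invert=False):
    """Element-wise membership test; the result has the shape of ar1."""
    ar1 = np.asarray(ar1)
    # np.unique flattens, sorts and de-duplicates the candidate values.
    values = np.unique(np.asarray(ar2))
    if values.size == 0:
        found = np.zeros(ar1.shape, dtype=bool)
    else:
        # Locate where each element of ar1 would sit in the sorted values.
        idx = np.searchsorted(values, ar1)
        # searchsorted can return values.size for elements past the end.
        idx = np.clip(idx, 0, values.size - 1)
        found = values[idx] == ar1
    return ~found if invert else found


def in1d_sketch(ar1, ar2, invert=False):
    """1-D variant defined in terms of the general function."""
    return isin_sketch(ar1, ar2, invert=invert).ravel()


a = np.array([[1, 2], [3, 4]])
print(isin_sketch(a, [2, 4, 9]))
print(in1d_sketch(a, [2, 4, 9]))
```

With this layout the old name carries no logic of its own, so it could later be deprecated without touching the implementation.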