Discussion:
[Numpy-discussion] How to find indices of values in an array (indirect in1d) ?
Nicolas P. Rougier
2015-12-30 14:45:40 UTC
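I’m scratching my head over a small problem, but I can’t find a vectorized solution.
I have 2 arrays A and B and I would like to get the indices (relative to B) of the elements of A that are in B: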
A = np.array([2,0,1,4])
B = np.array([1,2,0])
print (some_function(A,B))
[1,2,0]

# A[0] == 2 is in B and 2 == B[1] -> 1
# A[1] == 0 is in B and 0 == B[2] -> 2
# A[2] == 1 is in B and 1 == B[0] -> 0

Any ideas? I tried numpy.in1d with no luck.


Nicolas
Andy Ray Terrel
2015-12-30 15:02:00 UTC
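import numpy as np
import pandas as pd
A = np.array([2,0,1,4])
B = np.array([1,2,0])
s = pd.Series(range(len(B)), index=B)
s.reindex(A).values  # s[A] raises KeyError for labels missing from the index in pandas >= 1.0
array([ 1., 2., 0., nan])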



_______________________________________________
NumPy-Discussion mailing list
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Benjamin Root
2015-12-30 15:31:00 UTC
Maybe use searchsorted()? I will note that I have needed to do something
like this once before, and I found that the list comprehension form of
calling .index() for each item was faster than jumping through hoops to
vectorize it using searchsorted (needing to sort and then map the sorted
indices to the original indices), and was certainly clearer, but that might
depend upon the problem size.

Cheers!
Ben Root
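
For reference, the list comprehension form mentioned above could look roughly like the sketch below. It uses the arrays from the original question; the names `B_list`, `B_set` and `indices` are made up for this illustration, and converting B to a list and a set once is an assumption about what keeps the loop cheap, not something measured in this thread.

```python
import numpy as np

A = np.array([2, 0, 1, 4])
B = np.array([1, 2, 0])

B_list = B.tolist()   # plain list, so that .index() is available
B_set = set(B_list)   # set for a cheap membership test

# One index into B per element of A that occurs in B, in A's order.
indices = [B_list.index(item) for item in A.tolist() if item in B_set]
print(indices)  # [1, 2, 0]
```

Each `.index()` call scans B from the start, so the cost grows with len(A) * len(B); that is presumably why the problem size matters for the comparison with searchsorted.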
Nicolas P. Rougier
2015-12-30 16:12:40 UTC
Thanks for the quick answers. I think I will go with the .index and list comprehension.
But if someone comes up with a vectorised solution for the numpy 100 exercises...


Nicolas
Sebastian Berg
2015-12-30 16:47:44 UTC
Yeah, I doubt you can get very pretty, though maybe there is some great
trick. This is one way:

In [67]: A = np.array([2,0,1,4])
In [68]: B = np.array([1,2,0])
In [69]: B_sorter = np.argsort(B)
In [70]: B_index = np.searchsorted(B, A, sorter=B_sorter)
In [71]: invalid = B[B_sorter].take(B_index, mode='clip') != A
In [72]: B_index[invalid] = -1 # mark invalids with -1
In [73]: B_index
Out[73]: array([ 2, 0, 1, -1])

Anyway, I guess the arrays would likely have to be quite large for this
to beat list comprehension. And maybe doing the searchsorted the other
way around could be faster, no idea.

- Sebastian
Nicolas P. Rougier
2015-12-30 18:14:46 UTC
Thanks, I will run some benchmarks and post the results.
Mark Miller
2015-12-30 17:42:37 UTC
I'm not 100% sure that I get the question, but does this help at all?
a = numpy.array([3,2,8,7])
b = numpy.array([1,3,2,4,5,7,6,8,9])
c = set(a) & set(b)
c #contains elements of a that are in b (and vice versa)
set([8, 2, 3, 7])
indices = numpy.where([x in c for x in b])[0]
indices #indices of b where the elements of a in b occur
array([1, 2, 5, 7], dtype=int64)

-Mark


Nicolas P. Rougier
2015-12-30 18:17:16 UTC
Yes, it is the expected result. Thanks.
Maybe the set(a) & set(b) can be replaced by np.where(np.in1d(a,b)), no?
Mark Miller
2015-12-30 18:40:15 UTC
I was not familiar with the .in1d function. That's pretty handy.

Yes...it looks like numpy.where(numpy.in1d(b, a)) does what you need.
numpy.where(numpy.in1d(b, a))
(array([1, 2, 5, 7], dtype=int64),)
It would be interesting to see the benchmarks.


Nicolas P. Rougier
2015-12-30 18:51:02 UTC
Unfortunately, this does not handle repeated entries in a.
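
A small illustration of the problem, as a sketch: the arrays below are made up for this example, and `np.isin` is used as the newer spelling of `np.in1d` (same membership test). The mask approach returns one position per matching element of B, in B's order, so a repeated value in A is counted once and A's order is lost.

```python
import numpy as np

A = np.array([0, 0, 1, 3])   # note the repeated 0
B = np.array([3, 1, 0, 2])

# Positions in B whose value occurs anywhere in A (B's order, no repeats).
mask_positions = np.where(np.isin(B, A))[0]
print(mask_positions.tolist())  # [0, 1, 2]

# What the question asks for: one index into B per element of A.
B_list = B.tolist()
wanted = [B_list.index(item) for item in A.tolist() if item in B_list]
print(wanted)  # [2, 2, 1, 0]
```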
Nicolas P. Rougier
2015-12-30 19:21:48 UTC
In the end, I’ve only got the list comprehension to work as expected:

A = [0,0,1,3]
B = np.arange(8)
np.random.shuffle(B)
I = [list(B).index(item) for item in A if item in B]


But Mark's and Sebastian's methods do not seem to work...
Sebastian Berg
2015-12-30 20:13:39 UTC
Yeah, sorry, I had a mind slip with the sorter: searchsorted returns
positions in the sorted version of B. I think this should do the correct
thing (it throws away invalid ones by default, though I think that is a
bad idea in general).

def index(A, B, fill_invalid=None):
    B_sorter = np.argsort(B)
    B_sorted = B[B_sorter]
    B_sorted_index = np.searchsorted(B_sorted, A)
    # Go back into the original index (clip so that values larger than
    # max(B), which searchsorted maps to len(B), stay in range):
    B_index = B_sorter.take(B_sorted_index, mode='clip')

    if fill_invalid is None:
        valid = B.take(B_index, mode='clip') == A
        return B_index[valid]
    else:
        invalid = B.take(B_index, mode='clip') != A

        B_index[invalid] = fill_invalid
        return B_index
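
One way to gain confidence in a searchsorted-based lookup is to compare it with the list comprehension on an input that has a repeated entry and a value missing from B. The sketch below is self-contained and only illustrative: `lookup` and its `fill` parameter are names made up here, B is assumed to hold unique values, and the test data is arbitrary.

```python
import numpy as np

def lookup(A, B, fill=-1):
    """Index in B of each element of A, or `fill` where the value is absent."""
    A, B = np.asarray(A), np.asarray(B)
    sorter = np.argsort(B)
    # Position of each A value in sorted B; clip so that a value larger
    # than max(B) (position len(B)) stays inside the array.
    pos = np.clip(np.searchsorted(B, A, sorter=sorter), 0, len(B) - 1)
    idx = sorter[pos]  # map back to positions in the original B
    return np.where(B[idx] == A, idx, fill)

rng = np.random.default_rng(0)
B = rng.permutation(8)
A = np.array([0, 0, 1, 3, 9])  # repeated entry, plus 9 which is not in B

B_list = B.tolist()
expected = [B_list.index(a) if a in B_list else -1 for a in A.tolist()]
result = lookup(A, B)
print(result.tolist() == expected)  # True
```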
Peter Creasey
2015-12-30 20:08:51 UTC
The function you want is also in the open source astronomy package
iccpy ( https://github.com/Lowingbn/iccpy ), which essentially does a
variant of Sebastian’s code (which I also couldn’t quite get working),
and handles a few things like old numpy versions (pre 1.4) and allows
you to specify if B is already sorted.
from iccpy.utils import match
print(match(A, B))
[ 1 2 0 -1]

Peter
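
For small arrays, the same -1 convention can also be sketched with plain NumPy broadcasting, without sorting or extra packages. This is not how iccpy implements `match`; it is an alternative illustration, and the names `eq` and `idx` are made up here. It compares every element of A with every element of B, so memory use grows with len(A) * len(B).

```python
import numpy as np

A = np.array([2, 0, 1, 4])
B = np.array([1, 2, 0])

# eq[i, j] is True where A[i] == B[j].
eq = A[:, None] == B[None, :]
# argmax picks the first matching column; rows with no match get -1.
idx = np.where(eq.any(axis=1), eq.argmax(axis=1), -1)
print(idx.tolist())  # [1, 2, 0, -1]
```

If B contained repeated values, this would return the first occurrence, which matches what list.index() does.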