Discussion:
[Numpy-discussion] np.in1d() & sets, bug?
Benjamin Root
2015-08-10 16:09:13 UTC
>>> np.in1d([1], set([0, 1, 2]), assume_unique=True)
array([ False], dtype=bool)
>>> np.in1d([1], [0, 1, 2], assume_unique=True)
array([ True], dtype=bool)

I am assuming this has something to do with the fact that order is not
guaranteed with set() objects? I was kind of hoping that setting
"assume_unique=True" would be sufficient to overcome that problem. Should
sets be rejected as an error?

This was using v1.9.0

Cheers!
Ben Root
Sebastian Berg
2015-08-10 17:10:04 UTC
Post by Benjamin Root
> np.in1d([1], set([0, 1, 2]), assume_unique=True)
> array([ False], dtype=bool)
> np.in1d([1], [0, 1, 2], assume_unique=True)
> array([ True], dtype=bool)
> I am assuming this has something to do with the fact that order is not
> guaranteed with set() objects? I was kind of hoping that setting
> "assume_unique=True" would be sufficient to overcome that problem.
> Should sets be rejected as an error?
Not really, it is "simply" because ``np.asarray(set([1, 2, 3]))``
returns an object array and 1 is not the same as ``set([1, 2, 3])``.

I think earlier numpy versions may have had "short cuts" for short lists
or something so this may have worked in some cases....

- Sebastian
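
The object-array behaviour described above can be reproduced with a short sketch. It uses ``np.isin``, the newer counterpart of ``np.in1d``, which assumes NumPy 1.13 or later; the variable names are illustrative only.

```python
import numpy as np

candidates = {0, 1, 2}

# A set is not a sequence, so NumPy wraps the whole set in a
# zero-dimensional object array instead of unpacking its elements.
wrapped = np.asarray(candidates)
print(wrapped.dtype, wrapped.shape, wrapped.size)

# The membership test then compares 1 against the set object itself.
from_set = np.isin([1], candidates)
print(from_set)

# Converting the set to a list first yields an ordinary integer array.
from_list = np.isin([1], list(candidates))
print(from_list)
```

Converting the set with ``list()`` (or ``sorted()``) before the call is usually enough to get the intended element-wise test.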
_______________________________________________
NumPy-Discussion mailing list
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Nathaniel Smith
2015-08-10 17:38:18 UTC
Another case where refusing to implicitly create object arrays would have
avoided a lot of confusion...
Benjamin Root
2015-08-10 17:40:38 UTC
Post by Sebastian Berg
> Not really, it is "simply" because ``np.asarray(set([1, 2, 3]))``
> returns an object array
Holy crap! To be pedantic, it looks like it turns it into a numpy scalar,
but still! I wouldn't have expected np.asarray() on a set (or dictionary,
for that matter) to work because order is not guaranteed. Is this expected
behavior?

Digging into the implementation of in1d(), I can see now how passing a
set() wouldn't be useful at all (as an aside, pretty clever algorithm). I
know sets aren't array-like, but the code that used this seemed to work at
first, and this problem wasn't revealed until I created some unit tests to
exercise some possible corner cases. Silently producing possibly erroneous
results is dangerous. Don't know if better documentation or some better
sanity checking would be called for here, though.

Ben Root
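
For readers who have not looked at the implementation, here is a minimal sketch of the sort-based idea behind the ``assume_unique=True`` path. It is a simplified reconstruction, not the library source, and the function name ``in1d_sorted_sketch`` is hypothetical.

```python
import numpy as np

def in1d_sorted_sketch(ar1, ar2):
    """Membership test for 1-D inputs whose elements are unique within each input."""
    ar1 = np.asarray(ar1).ravel()
    ar2 = np.asarray(ar2).ravel()
    combined = np.concatenate((ar1, ar2))
    # A stable sort keeps an element of ar1 ahead of an equal element of ar2.
    order = combined.argsort(kind="mergesort")
    sorted_vals = combined[order]
    # Equal neighbours mean the value occurs in both inputs.
    equal_next = sorted_vals[1:] == sorted_vals[:-1]
    flags = np.concatenate((equal_next, [False]))
    # Scatter the flags back to the original positions.
    result = np.empty(combined.shape, dtype=bool)
    result[order] = flags
    return result[:len(ar1)]

print(in1d_sorted_sketch([1, 5], [0, 1, 2]))
```

The approach relies on ``ar2`` being unpacked into individual, sortable elements. A set arrives as a single opaque object, so the sort never sees its members.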
j***@gmail.com
2015-08-10 18:08:07 UTC
Post by Sebastian Berg
> Not really, it is "simply" because ``np.asarray(set([1, 2, 3]))``
> returns an object array and 1 is not the same as ``set([1, 2, 3])``.
> I think earlier numpy versions may have had "short cuts" for short lists
> or something so this may have worked in some cases....
Is it possible to get at least a UserWarning when an object array is
created and dtype object hasn't been explicitly requested and the
underlying data isn't already in an object dtype?


Josef
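
As a rough illustration of the kind of check being asked for, the user-level wrapper below warns when ``np.asarray`` falls back to an object array that was not requested. This is a sketch only: ``asarray_warn_on_object`` is a hypothetical helper, not a NumPy function, and NumPy itself issues no such warning for sets.

```python
import warnings
import numpy as np

def asarray_warn_on_object(values, dtype=None):
    """Hypothetical helper: like np.asarray, but warn on an implicit object array."""
    arr = np.asarray(values, dtype=dtype)
    # Input that is already an object array is left alone.
    already_object = isinstance(values, np.ndarray) and values.dtype == object
    if arr.dtype == object and dtype is None and not already_object:
        warnings.warn(
            "input of type %s was converted to an object array"
            % type(values).__name__,
            UserWarning,
            stacklevel=2,
        )
    return arr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    asarray_warn_on_object({0, 1, 2})                # implicit object array: warns
    asarray_warn_on_object([0, 1, 2])                # plain integer array: silent
    asarray_warn_on_object({0, 1, 2}, dtype=object)  # explicit request: silent

print(len(caught))
```

A stricter variant could raise ``TypeError`` instead of warning, which matches the "reject sets as an error" option raised at the start of the thread.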