Discussion:
[Numpy-discussion] floats for indexing, reshape - too strict ?
j***@gmail.com
2015-07-01 14:05:23 UTC
About the deprecation warning for using types other than integers in
ones, reshape, indexing, and so on:

Wouldn't it be nicer to accept floats that are equal to an integer?

for example
>>> 5.0 == 5
True
>>> np.ones(10 / 2)
array([ 1., 1., 1., 1., 1.])
>>> 10 / 2 == 5
True

or the python 2 version
>>> np.ones(10. / 2)
array([ 1., 1., 1., 1., 1.])
>>> 10. / 2 == 5
True

I'm now using 10 // 2, or int(10./2 + 1), but this is unconditional and
doesn't raise if the numbers are not close or equal to an integer (which
would be a bug)
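
A minimal sketch of the kind of strict conversion described above, assuming a hypothetical helper named `as_exact_int` (it is not a NumPy function). It accepts a float only when the value is exactly integral and raises otherwise, so 4.5 is reported rather than silently truncated:

```python
import numbers


def as_exact_int(value):
    """Return value as an int, raising if it is not exactly integral.

    Hypothetical helper for illustration; not part of NumPy.
    """
    if isinstance(value, numbers.Integral):
        return int(value)
    if isinstance(value, float) and value.is_integer():
        return int(value)
    raise ValueError("expected an integral value, got %r" % (value,))


size = as_exact_int(10 / 2)   # 5.0 is exactly integral, so it is accepted
print(size)

try:
    as_exact_int(9 / 2)       # 4.5 is rejected instead of being truncated
except ValueError as exc:
    print("rejected:", exc)
```

Unlike `//` or a bare `int(...)`, this fails loudly when the input is not what the caller assumed.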


Josef
Sebastian Berg
2015-07-01 14:32:10 UTC
Post by j***@gmail.com
About the deprecation warning for using another type than integers, in
Wouldn't it be nicer to accept floats that are equal to an integer?
Hmmm, the biggest point was that the old solution was basically
(besides strings) to use `int(...)`, which means it does not raise any
errors, as you also mention.
I am open to thinking about allowing exact floats for most of this
(frankly, not advanced indexing, at least for the moment, but we never
did there). I think scipy may be doing that for some functions?

The disadvantage I see is that some weirder calculations would possibly
work most of the time, but not always. What I mean is a case like this.
A -- possibly silly -- example:

In [8]: for i in range(10):
   ...:     print i, i == i * 0.1 * 10
   ...:
0 True
1 True
2 True
3 False
4 True
5 True
6 False
7 False
8 True
9 True

I am somewhat opposed to rounding a lot (i.e. not noticing if you got
3.3333 somewhere), so I am not sure you can define a reasonable
"tolerance" here unless it is exact. Though I guess you are right that
`//` will also just round silently already.
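
For readers on Python 3, the loop above can be rerun as follows. This is only a restatement of the same experiment, with `math.isclose` added to show what a tolerance-based check would accept. The default relative tolerance used here is an arbitrary choice, which is the difficulty being discussed:

```python
import math

# Indices for which the round trip i * 0.1 * 10 is not exactly equal to i.
inexact = [i for i in range(10) if i != i * 0.1 * 10]
print(inexact)  # [3, 6, 7], matching the False rows above

# With a tolerance, every value passes, including the inexact ones.
close = [i for i in range(10) if math.isclose(i, i * 0.1 * 10)]
print(close == list(range(10)))
```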

- Sebastian
_______________________________________________
NumPy-Discussion mailing list
http://mail.scipy.org/mailman/listinfo/numpy-discussion
j***@gmail.com
2015-07-01 15:07:43 UTC
Post by Sebastian Berg
I am somewhat opposed to rounding a lot (i.e. not noticing if you got
3.3333 somewhere), so I am not sure you can define a reasonable
"tolerance" here unless it is exact. Though I guess you are right that
`//` will also just round silently already.
Yes, I thought about this; something like `int_if_close`, in analogy to
`real_if_close`, would be useful.

However, given that we need to decide on a threshold in this case, I
thought it would be overkill to put that into reshape, ones, indexing, and
similar.
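
As a rough sketch of what such a function could look like: `int_if_close` below is hypothetical (no function of that name exists in NumPy), and its default relative tolerance is an arbitrary assumption. It shows why a threshold has to be chosen by someone:

```python
import math


def int_if_close(value, rel_tol=1e-9):
    """Round value to an int if it is within rel_tol of one, else raise.

    Hypothetical helper; the name follows the suggestion above and the
    default tolerance is an assumption, not a NumPy setting.
    """
    nearest = round(value)
    if math.isclose(value, nearest, rel_tol=rel_tol):
        return int(nearest)
    raise ValueError("%r is not close to an integer" % (value,))


print(int_if_close(3 * 0.1 * 10))  # 3.0000000000000004 is accepted as 3

try:
    int_if_close(3.3333)
except ValueError as exc:
    print("rejected:", exc)
```

Because the check is purely relative, tiny nonzero values near 0 are rejected; an absolute tolerance would be a further parameter to decide on.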

Simpler cases would work, for example the
number of triangular elements:
>>> for i in range(10): print(i, i * (i - 1) / 2. == int(i * (i - 1) / 2.))
0 True
1 True
2 True
3 True
4 True
5 True
6 True
7 True
8 True
9 True

Also, np.ceil and np.trunc return floats, not integers.
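
A short check of this point, assuming NumPy is installed; the input values are arbitrary:

```python
import numpy as np

ceil_result = np.ceil(4.2)
trunc_result = np.trunc(4.7)

# Both results are floating-point values, even though they are whole numbers.
print(type(ceil_result), ceil_result)
print(type(trunc_result), trunc_result)

# An explicit conversion is still needed before using them as a size or index.
size = int(ceil_result)
print(np.ones(size).shape)
```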

One disadvantage of raising or warning after the equality check is that
developers have a tendency to write "nice" unit tests. Then the casting
doesn't break in the unit tests but might raise an exception on some random
data.


For reference: here are my changes in cleaning up
https://github.com/statsmodels/statsmodels/pull/2490/files


Josef
Neal Becker
2015-07-02 12:40:13 UTC
On Wed, Jul 1, 2015 at 10:32 AM, Sebastian Berg
Post by j***@gmail.com
About the deprecation warning for using another type than integers, in
Wouldn't it be nicer to accept floats that are equal to an integer?
I'd be concerned that checking each index for exactness would be costly.
I'm also concerned that using floats for an index is frequently a mistake
and that a warning is what I want.
Antoine Pitrou
2015-07-02 13:37:19 UTC
On Thu, 02 Jul 2015 08:40:13 -0400
Post by Neal Becker
I'd be concerned that checking each index for exactness would be costly.
I'm also concerned that using floats for an index is frequently a mistake
and that a warning is what I want.
Or just follow Python:

Python 3.4.3 |Continuum Analytics, Inc.| (default, Jun 4 2015,
15:29:08) [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> [1][0.0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not float


I don't think relaxing type checking here does any good.
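
The behaviour in the traceback comes from Python's index protocol: sequence indexing goes through `__index__`, which `int` implements and `float` does not. A short sketch using the standard-library `operator.index` shows this without involving NumPy:

```python
import operator

items = ["a", "b", "c"]

print(operator.index(1))          # ints implement __index__
print(items[operator.index(1)])

try:
    operator.index(1.0)           # floats do not, even when the value is whole
except TypeError as exc:
    print("TypeError:", exc)

try:
    items[1.0]
except TypeError as exc:
    print("TypeError:", exc)
```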

Regards

Antoine.
Sturla Molden
2015-07-02 17:38:17 UTC
Post by Antoine Pitrou
I don't think relaxing type checking here does any good.
I agree. NumPy should do the same as Python in this case.


Sturla
Sebastian
2015-07-04 07:26:20 UTC
Hi,
Post by Antoine Pitrou
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not float
I don't think relaxing type checking here does any good.
Python is also strongly typed, which means that types are never converted
silently. I think a library should follow the behavior of the language.

https://wiki.python.org/moin/Why%20is%20Python%20a%20dynamic%20language%20and%20also%20a%20strongly%20typed%20language

Sebastian

--
python programming - mail server - photo - video - https://sebix.at
To verify my cryptographic signature or send me encrypted mails, get my
key at https://sebix.at/DC9B463B.asc and on public keyservers.

Chris Barker - NOAA Federal
2015-07-03 00:51:01 UTC
Post by Sebastian Berg
The disadvantage I see is that some weirder calculations would possibly
work most of the time, but not always,
not sure if you can define a reasonable "tolerance"
here unless it is exact.
You could use a relative tolerance, but you'd still have to set that.
Better to put that decision squarely in the user's hands.
Post by Sebastian Berg
Though I guess you are right that
`//` will also just round silently already.
Yes, but if it's in the user's code, it should be obvious -- and then
the user can choose to round, or floor, or ceiling....

-CHB
j***@gmail.com
2015-07-03 01:18:57 UTC
Post by Chris Barker - NOAA Federal
Yes, but if it's in the user's code, it should be obvious -- and then
the user can choose to round, or floor, or ceiling....
round, floor, ceil don't produce integers.

I'm writing library code, and I don't have control over what everyone does.

round, floor, ceil, and // might hide bugs or user mistakes, if we are
supposed to get something that is "like an int" but it's 42.6 instead.
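
To illustrate the concern, here is a small sketch in plain Python 3 with an arbitrarily chosen value. Each conversion accepts 42.6 without complaint, and they do not even agree on the result. Note that in Python 3 the built-in `round` and `math.floor`/`math.ceil` return `int`, unlike their NumPy counterparts:

```python
import math

value = 42.6  # stands in for a quantity that was supposed to be integral

results = {
    "int": int(value),            # truncates toward zero
    "floor_div": value // 1,      # floors, and stays a float
    "round": round(value),        # rounds to nearest
    "floor": math.floor(value),
    "ceil": math.ceil(value),
}

for name, result in results.items():
    print(name, repr(result))
```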

Josef
https://en.wikipedia.org/wiki/Phrases_from_The_Hitchhiker%27s_Guide_to_the_Galaxy#Answer_to_the_Ultimate_Question_of_Life.2C_the_Universe.2C_and_Everything_.2842.29
Jeff Reback
2015-07-03 01:28:11 UTC
FYI pandas followed the same pattern to deprecate float indexers (except for indexing in a Float64Index) about a year ago

see here: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0140-deprecations
Chris Barker
2015-07-04 05:07:36 UTC
Post by j***@gmail.com
round, floor, ceil don't produce integers.
True -- in a dynamic language, they probably should, but that's legacy that
won't change.

It's annoying, but you do need to do:

int(round(want_it_to_be_an_index))

but as they say, explicit is better than implicit.
Post by j***@gmail.com
I'm writing library code, and I don't have control over what everyone does.
I'm confused -- what is the problem here -- if your library code required
an integer for an index, then that's what your users need to pass in -- how
they get that integer is under their control -- why would you want it
otherwise?

Or your code does the round|ceil|floor and int conversion -- but then you
know what you're doing.

Post by j***@gmail.com
round, floor, ceil, and // might hide bugs or user mistakes, if we are
supposed to get something that is "like an int" but it's 42.6 instead.
then it will raise an exception -- what's the problem?

but what should 42.000000000001 do? It seems to me, there is no choice but
an exception -- or you are really going to hide bugs.

-Chris
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

***@noaa.gov