[Numpy-discussion] Changing the behavior of (builtins.)round (via the __round__ dunder) to return an integer

Antony Lee

2016-04-13 07:42:06 UTC

https://github.com/numpy/numpy/issues/3511 proposed (nearly three years
ago) to return an integer when `builtins.round` (which calls the `__round__`
dunder method, and is hereafter called `round` (... not to be confused with
`np.round`)) is called with a single argument. Currently, `round` returns
a floating scalar for numpy scalars, matching the Python2 behavior.

Python3 changed the behavior of `round` to return an int when it is called
with a single argument (otherwise, the return type matches the type of the
first argument). I believe this is more intuitive, and is arguably
becoming more important now that numpy is deprecating (via a
VisibleDeprecationWarning) indexing with a float: having to write

array[int(round(some_float))]

is rather awkward. (Note that I am suggesting to switch to the new
behavior regardless of the version of Python.)
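
For reference, a minimal sketch of the Python 3 builtin behavior described above, using plain Python floats rather than numpy scalars. The names `some_float` and `items` are illustrative only.

```python
# Python 3 only: plain floats, no numpy involved.
some_float = 2.7
items = ["a", "b", "c", "d"]

one_arg = round(some_float)      # single argument: returns an int
two_arg = round(some_float, 1)   # with ndigits: returns a float

# Because the single-argument form is already an int, it can be used
# as an index directly, without the int(...) wrapper.
picked = items[round(some_float)]

print(type(one_arg).__name__, one_arg)
print(type(two_arg).__name__, two_arg)
print(picked)
```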

Note that currently the `__round__` dunder is not implemented for arrays
(... see https://github.com/numpy/numpy/issues/6248) so it would be
feasible to always return a signed integer of the same size with an
OverflowError on overflow (at least, any floating point that is round-able
without loss of precision will be covered). If `__round__` ends up being
implemented for ndarrays too, I guess the correct behavior will be whatever
we come up with for signaling failure in integer operations (see current
behavior of `np.array([0, 1]) // np.array([0, 1])`).
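
As a rough illustration of the "signed integer of the same size, with an OverflowError on overflow" idea, here is a pure-Python sketch for the 64-bit case. The helper `round_to_int64` and the two constants are hypothetical names, not numpy API, and a Python int stands in for a fixed-width numpy integer.

```python
# Hypothetical bounds for a signed 64-bit result.
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def round_to_int64(value):
    """Round a float to an int, refusing results that do not fit in int64."""
    rounded = round(value)  # Python 3: an arbitrary-precision int
    if not INT64_MIN <= rounded <= INT64_MAX:
        raise OverflowError("rounded value does not fit in a signed 64-bit integer")
    return rounded


print(round_to_int64(-3.7))
try:
    round_to_int64(1e300)
except OverflowError as exc:
    print("OverflowError:", exc)
```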

Also note the comment posted by @njsmith on the github issue thread:

I'd be fine with matching python here, but we need to run it by the mailing
list.

Not clear what the right kind of deprecation is... Normally FutureWarning
since there's no error involved, but that would both be very annoying
(basically makes round unusable -- you get this noisy warning even if what
you're doing is round(a).astype(int)), and the change is relatively low
risk compared to most FutureWarning changes, since the actual values
returned are identical before and after the change.

Thoughts?

Antony

Stephan Hoyer

2016-04-13 08:31:06 UTC

Post by Antony Lee
(Note that I am suggesting to switch to the new behavior regardless of the
version of Python.)

I would lean towards making this change only for Python 3. This is arguably
more consistent with Python than changing the behavior on Python 2.7, too.

The most obvious way in which a float being surprisingly switched to an
integer could cause silent bugs (rather than noisy TypeErrors) is if the
number is used in division. True division in Python 3 eliminates this risk.
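
A small sketch of the division point, runnable on Python 3. The `//` operator is used here only to mimic what `/` did for two ints on Python 2, and the variable names are illustrative.

```python
as_float = 7.0   # a rounded value kept as a float (current behavior)
as_int = 7       # the same value as an int (proposed behavior)

# True division (Python 3 `/`): same answer whichever type round() returned.
true_div_float = as_float / 2
true_div_int = as_int / 2

# Floor division stands in for Python 2's int / int, where the change of
# type would silently change the result.
legacy_div_int = as_int // 2

print(true_div_float, true_div_int, legacy_div_int)
```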

Generally, I agree with your reasoning. It would be unfortunate to be stuck
with this legacy behavior forever.

j***@gmail.com

2016-04-13 15:06:43 UTC

Post by Stephan Hoyer

(Note that I am suggesting to switch to the new behavior regardless of
the version of Python.)

I would lean towards making this change only for Python 3. This is
arguably more consistent with Python than changing the behavior on Python
2.7, too.
The most obvious way in which a float being surprisingly switched to an
integer could cause silent bugs (rather than noisy TypeErrors) is if the
number is used in division. True division in Python 3 eliminates this risk.
Generally, I agree with your reasoning. It would be unfortunate to be
stuck with this legacy behavior forever.

The difference is that Python 3 has looooong ints, (and doesn't have to
overflow, AFAICS)

what happens with nan?
I guess inf would overflow?

(nan and inf are preserved with np.round)

Josef

Stephan Hoyer

2016-04-13 16:35:19 UTC

Post by j***@gmail.com
The difference is that Python 3 has looooong ints, (and doesn't have to
overflow, AFAICS)

This is a good point. But if your float is so big that rounding it to an
integer would overflow int64, rounding is already a no-op. I'm sure this
has been done before but I would guess it's quite rare. I would be OK
raising in this situation, especially because np.around will still be
around returning floats.
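
The "already a no-op" point can be checked with plain Python floats: an IEEE 754 double has a 53-bit significand, so any value large enough to overflow int64 has no fractional part left to round. A minimal sketch, with illustrative variable names:

```python
INT64_MAX = 2**63 - 1

big = 2.0 ** 63        # smallest positive double that exceeds INT64_MAX
exact = int(big)       # exact conversion to a Python int

is_integral = big.is_integer()          # no fractional part to round away
rounding_is_noop = round(big) == exact  # round() changes nothing
overflows_int64 = exact > INT64_MAX     # but the result would not fit in int64

# Already at 2**53, adjacent doubles are 2 apart, so adding 1.0 is lost.
absorbs_one = (2.0 ** 53) + 1.0 == 2.0 ** 53

print(is_integral, rounding_is_noop, overflows_int64, absorbs_one)
```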

Post by j***@gmail.com
what happens with nan?
I guess inf would overflow?

builtins.round raises for both of these (in Python 3) and I would propose
copying this behavior:

In [52]: round(float('inf'))
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
<ipython-input-52-798e0e9243d6> in <module>()
----> 1 round(float('inf'))

OverflowError: cannot convert float infinity to integer

In [53]: round(float('nan'))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-53-e989485df64c> in <module>()
----> 1 round(float('nan'))

ValueError: cannot convert float NaN to integer
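
If numpy scalars copied this behavior, calling code that may see non-finite values would need to handle these two exceptions. A minimal sketch with plain Python floats follows. `round_or_none` is a hypothetical helper name, not part of Python or numpy.

```python
import math


def round_or_none(value):
    """Round to an int, or return None when the value is nan or infinite."""
    try:
        return round(value)
    except (OverflowError, ValueError):
        # Python 3 builtin behavior: OverflowError for infinity,
        # ValueError for nan.
        return None


results = [round_or_none(v) for v in (1.5, 2.5, math.inf, math.nan)]
print(results)
```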