Discussion:
[Numpy-discussion] Modulus (remainder) function corner cases
Charles R Harris
2016-02-13 16:31:42 UTC
Hi All,

I'm curious as to what folks think about some choices in the computation of
the remainder function. As an example where different choices can be made:

In [2]: -1e-64 % 1.
Out[2]: 1.0

In [3]: float64(-1e-64) % 1.
Out[3]: 0.99999999999999989

The first is Python, the second is in my branch. The first is more accurate
numerically, but the modulus is of the same magnitude as the divisor. The
second maintains the convention that the result must have smaller magnitude
than the divisor. There are other corner cases along the same lines. So the
question is, which is more desirable: maintaining numerical accuracy
or enforcing
mathematical convention? The differences are on the order of an ulp, but
there will be a skew in the distribution of the errors if convention is
maintained.

The Fortran modulo function, which is the same basic function as in my
branch, does not specify any bounds on the result for floating numbers, but
gives only the formula, modulus(a, b) = a - b*floor(a/b), which has the
advantage of being simple and well defined ;)
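For the example above, the difference comes down to whether the result is allowed to round up to the magnitude of the divisor itself. A rough sketch of the two behaviors (nextafter is used here only to illustrate the clipping convention, not the actual branch code):

```python
import numpy as np

a, b = -1e-64, 1.0

# Python's % computes fmod(a, b) and, since the signs differ, adds b;
# the sum -1e-64 + 1.0 rounds to exactly 1.0, i.e. to b itself:
assert a % b == 1.0

# Enforcing |result| < |b| instead means such a result gets clipped
# to the largest double strictly below b:
clipped = np.nextafter(b, 0.0)
assert clipped == 0.99999999999999989
```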

Chuck
Charles R Harris
2016-02-13 16:42:03 UTC
Post by Charles R Harris
Hi All,
I'm curious as to what folks think about some choices in the computation of
the remainder function. As an example where different choices can be made:
In [2]: -1e-64 % 1.
Out[2]: 1.0
In [3]: float64(-1e-64) % 1.
Out[3]: 0.99999999999999989
The first is Python, the second is in my branch. The first is more accurate
numerically, but the modulus is of the same magnitude as the divisor. The
second maintains the convention that the result must have smaller
magnitude than the divisor. There are other corner cases along the same
lines. So the question is, which is more desirable: maintaining numerical
accuracy or enforcing mathematical convention? The differences are on the
order of an ulp, but there will be a skew in the distribution of the
errors if convention is maintained.
The Fortran modulo function, which is the same basic function as in my
branch, does not specify any bounds on the result for floating numbers, but
gives only the formula, modulus(a, b) = a - b*floor(a/b), which has the
advantage of being simple and well defined ;)
Note that the other enforced bound is that the result have the same sign as
the divisor. Python enforces that by adjusting the integer part, I enforce
it by adjusting the remainder.
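The two adjustment strategies can be made concrete with Python's builtin operators and math.fmod (the values below are illustrative, not from the thread):

```python
import math

a, b = 7.5, -2.0

# C's fmod truncates the quotient toward zero, so the remainder
# keeps the sign of the dividend:
assert math.fmod(a, b) == 1.5      # trunc(7.5 / -2.0) == -3

# Python's % instead floors the quotient (adjusting the integer
# part down by one when the signs differ), so the remainder keeps
# the sign of the divisor:
assert a // b == -4.0
assert a % b == -0.5

# either way, the identity a == b*q + r holds for the matching q:
assert b * (a // b) + (a % b) == a
```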

Chuck
Nils Becker
2016-02-14 19:35:27 UTC
Post by Charles R Harris
The Fortran modulo function, which is the same basic function as in my
branch, does not specify any bounds on the result for floating numbers, but
gives only the formula, modulus(a, b) = a - b*floor(a/b), which has the
advantage of being simple and well defined ;)
In the light of the libm-discussion I spent some time looking at floating
point functions and their accuracy. I would vote in favor of keeping an
implementation that uses the fmod-function of the system library and bends
it to adhere to the python convention (sign of divisor). There is probably
a reason why the fmod-implementation is not as simple as "a - b*floor(a/b)"
[1].

One obvious problem with the simple expression arises when a/b = 0.0 in
floating point. E.g.

In [43]: np.__version__
Out[43]: '1.10.4'
In [44]: x = np.float64(1e-320)
In [45]: y = np.float64(-1e10)
In [46]: x % y # this uses libm's fmod on my system
Out[46]: -10000000000.0 # == y, correctly rounded result in round-to-nearest
In [47]: x - y*np.floor(x/y) # this here is the naive expression
Out[47]: 9.9998886718268301e-321 # == x, wrong sign

There are other problems (a/b = inf in floating point). As I do not
understand the implementation of fmod (for example in openlibm) in detail I
cannot give a full list of corner cases.
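The a/b = inf case is easy to sketch as well (illustrative values, not from any bug report): when the quotient overflows, the naive formula loses everything, while fmod computes the remainder exactly with no intermediate overflow.

```python
import numpy as np

x, y = 1e300, 1e-300

# x/y overflows to inf, so the naive expression collapses:
naive = x - y * np.floor(x / y)    # x - inf -> -inf
assert np.isinf(naive)

# fmod has no intermediate overflow and returns a valid remainder:
r = np.fmod(x, y)
assert np.isfinite(r) and abs(r) < abs(y)
```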

Unfortunately, I did not follow the (many different) bug reports on this
issue originally and am confused why there was a need to change the
implementation in the first place. numpy's "%" operator seems to work quite
well on my system. Therefore, this mail may be rather unproductive as I am
missing some information.

Concerning your original question: many elementary functions lose their
mathematical properties when they are calculated correctly rounded in
floating-point numbers [2]. We do not fix this for other functions, so I
would not fix it here.

Cheers
Nils

[1] https://github.com/JuliaLang/openlibm/blob/master/src/e_fmod.c
[2] np.exp(np.nextafter(1.0, 0.0)) < np.e -> False (Monotonicity lost in
exp(x)).
Charles R Harris
2016-02-14 19:54:47 UTC
Post by Nils Becker
Post by Charles R Harris
The Fortran modulo function, which is the same basic function as in my
branch, does not specify any bounds on the result for floating numbers, but
gives only the formula, modulus(a, b) = a - b*floor(a/b), which has
the advantage of being simple and well defined ;)
In the light of the libm-discussion I spent some time looking at floating
point functions and their accuracy. I would vote in favor of keeping an
implementation that uses the fmod-function of the system library and bends
it to adhere to the python convention (sign of divisor). There is probably
a reason why the fmod-implementation is not as simple as "a - b*floor(a/b)"
[1].
One obvious problem with the simple expression arises when a/b = 0.0 in
floating point. E.g.
In [43]: np.__version__
Out[43]: '1.10.4'
In [44]: x = np.float64(1e-320)
In [45]: y = np.float64(-1e10)
In [46]: x % y # this uses libm's fmod on my system
Out[46]: -10000000000.0 # == y, correctly rounded result in
round-to-nearest
In [47]: x - y*np.floor(x/y) # this here is the naive expression
Out[47]: 9.9998886718268301e-321 # == x, wrong sign
But more accurate ;) Currently, this is actually clipped.

In [3]: remainder(x,y)
Out[3]: -0.0

In [4]: x - y*floor(x/y)
Out[4]: 9.9998886718268301e-321

In [5]: fmod(x,y)
Out[5]: 9.9998886718268301e-321
Post by Nils Becker
There are other problems (a/b = inf in floating point). As I do not
understand the implementation of fmod (for example in openlibm) in detail I
cannot give a full list of corner cases.
?

<snip>

Chuck
Charles R Harris
2016-02-14 20:11:39 UTC
On Sun, Feb 14, 2016 at 12:54 PM, Charles R Harris <
Post by Nils Becker
Post by Charles R Harris
The Fortran modulo function, which is the same basic function as in my
branch, does not specify any bounds on the result for floating
numbers, but gives only the formula, modulus(a, b) = a - b*floor(a/b),
which has the advantage of being simple and well defined ;)
In the light of the libm-discussion I spent some time looking at floating
point functions and their accuracy. I would vote in favor of keeping an
implementation that uses the fmod-function of the system library and bends
it to adhere to the python convention (sign of divisor). There is probably
a reason why the fmod-implementation is not as simple as "a - b*floor(a/b)"
[1].
One obvious problem with the simple expression arises when a/b = 0.0 in
floating point. E.g.
In [43]: np.__version__
Out[43]: '1.10.4'
In [44]: x = np.float64(1e-320)
In [45]: y = np.float64(-1e10)
In [46]: x % y # this uses libm's fmod on my system
I'm not too worried about denormals. However, this might be considered a
bug in the floor function:

In [16]: floor(-1e-330)
Out[16]: -0.0

Chuck
Charles R Harris
2016-02-14 20:35:27 UTC
Post by Charles R Harris
On Sun, Feb 14, 2016 at 12:54 PM, Charles R Harris <
Post by Nils Becker
Post by Charles R Harris
The Fortran modulo function, which is the same basic function as in my
branch, does not specify any bounds on the result for floating
numbers, but gives only the formula, modulus(a, b) = a - b*floor(a/b),
which has the advantage of being simple and well defined ;)
In the light of the libm-discussion I spent some time looking at
floating point functions and their accuracy. I would vote in favor of
keeping an implementation that uses the fmod-function of the system library
and bends it to adhere to the python convention (sign of divisor). There is
probably a reason why the fmod-implementation is not as simple as "a -
b*floor(a/b)" [1].
One obvious problem with the simple expression arises when a/b = 0.0 in
floating point. E.g.
In [43]: np.__version__
Out[43]: '1.10.4'
In [44]: x = np.float64(1e-320)
In [45]: y = np.float64(-1e10)
In [46]: x % y # this uses libm's fmod on my system
I'm not too worried about denormals. However, this might be considered a
bug in the floor function
In [16]: floor(-1e-330)
Out[16]: -0.0
However, I do note that some languages offer two versions of modulus, one
floor based and the other trunc based (effectively fmod). What I want is
to keep the remainder consistent with the floor function in the C library.
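The floor/trunc split is visible in Python itself, where % is floor based and math.fmod is trunc based (a small sketch; Fortran's MOD/MODULO pair makes the same distinction):

```python
import math

a, b = -7.0, 3.0

# trunc-based remainder (C fmod, Fortran MOD): sign follows the dividend
assert math.fmod(a, b) == -1.0

# floor-based modulo (Python %, Fortran MODULO): sign follows the divisor
assert a % b == 2.0

# each is consistent with its own quotient definition:
assert math.trunc(a / b) * b + math.fmod(a, b) == a
assert math.floor(a / b) * b + (a % b) == a
```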

Chuck
