[Numpy-discussion] float16/32: wrong number of digits?
Nico Schlömer
2017-03-09 10:26:44 UTC
Hi everyone,

I wondered how to express a numpy float exactly in terms of format, and
found `%r` quite useful: `float(repr(a)) == a` is guaranteed for Python
`float`s. When trying the same thing with lower-precision NumPy floats, I
found that this identity does not quite hold:
```
import numpy
b = numpy.array([1.0 / 3.0], dtype=numpy.float16)
float(repr(b[0])) - b[0]
Out[12]: -1.9531250000093259e-06
```
Indeed,
```
b
Out[6]: array([ 0.33325195], dtype=float16)
```
```
repr(b[0])
Out[7]: '0.33325'
```
When counting the bits, a float16 should hold 4.8 decimal digits, so
`repr()` seems right. Where does the garbage tail -1.9531250000093259e-06
come from though?

Cheers,
Nico
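
A side note on the digit count above: 4.8 is what you get from all 16 bits (16 * log10(2)), but only the significand carries precision. The following is a minimal sketch, assuming IEEE half precision with 10 stored significand bits plus one implicit bit; the variable names are illustrative only.

```python
import math

import numpy as np

info = np.finfo(np.float16)

# 10 stored significand bits plus the implicit leading bit.
significand_bits = info.nmant + 1

# Counting every bit of the 16-bit word, sign and exponent included.
digits_from_all_bits = 16 * math.log10(2)

# Counting only the bits that actually carry precision.
digits_from_significand = significand_bits * math.log10(2)

print(round(digits_from_all_bits, 1))     # all 16 bits
print(round(digits_from_significand, 1))  # significand only
print(info.precision)                     # NumPy's own estimate of decimal digits
```

A string may still need more digits than this to identify a value uniquely, which is why a five-digit `repr` is not by itself a sign of a problem.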
Anne Archibald
2017-03-09 10:57:19 UTC
Post by Nico Schlömer
Hi everyone,
I wondered how to express a numpy float exactly in terms of format, and
found `%r` quite useful: `float(repr(a)) == a` is guaranteed for Python
`float`s. When trying the same thing with lower-precision NumPy floats, I
found that this identity does not quite hold:
```
import numpy
b = numpy.array([1.0 / 3.0], dtype=numpy.float16)
float(repr(b[0])) - b[0]
Out[12]: -1.9531250000093259e-06
```
Indeed,
```
b
Out[6]: array([ 0.33325195], dtype=float16)
```
```
repr(b[0])
Out[7]: '0.33325'
```
When counting the bits, a float16 should hold 4.8 decimal digits, so
`repr()` seems right. Where does the garbage tail -1.9531250000093259e-06
come from though?
Even more troubling, the high-precision numpy types - np.longdouble and its
complex version - lose information when used with repr.

The basic problem is (roughly) that all floating-point numbers are
converted to python floats before printing. I put some effort into cleaning
this up, but the code is messy (actually there are several independent code
paths for converting numbers to strings) and the algorithms python uses to
make repr work out nicely are nontrivial.

Anne
Eric Wieser
2017-03-13 11:57:48 UTC
`float(repr(a)) == a` is guaranteed for Python `float`
And `np.float16(repr(a)) == a` is guaranteed for `np.float16` (and the same
is true up to `float128`, which can be platform-dependent). Your code
doesn't work because you're deserializing to a higher precision format than
you serialized to.
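
To make the point concrete, here is a minimal sketch, assuming a reasonably recent NumPy. It uses `str()` rather than `repr()` because newer NumPy versions wrap the scalar repr in the type name; the variable names are illustrative only.

```python
import numpy as np

a = np.float16(1.0 / 3.0)

# A short decimal string for the float16 value.
s = str(a)

# Parsing back at the SAME precision recovers the original value.
same_precision = np.float16(s)

# Parsing into a Python float (a double) keeps the decimal string's value,
# which is not the exact binary value stored in the float16.
higher_precision = float(s)

print(s)
print(same_precision == a)
print(float(a))                     # exact value held by the float16
print(higher_precision - float(a))  # the "garbage tail"
```

The nonzero difference is just the gap between the short decimal string and the exact binary value, made visible because a double has enough precision to represent both.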





Anne Archibald
2017-03-13 14:18:40 UTC
Post by Eric Wieser
`float(repr(a)) == a` is guaranteed for Python `float`
And `np.float16(repr(a)) == a` is guaranteed for `np.float16` (and the same
is true up to `float128`, which can be platform-dependent). Your code
doesn't work because you're deserializing to a higher precision format than
you serialized to.
I would hesitate to make this guarantee - certainly for old versions of
numpy, np.float128(repr(x)) != x in many cases. I submitted a patch, now
accepted, that probably accomplishes this on most systems (in fact this is
now in the test suite) but if you are using a version of numpy that is a
couple of years old, there is no way to convert long doubles to a
human-readable string and back without losing precision.

To repeat: only in recent versions of numpy can long doubles be converted
to a human-readable string and back without passing through doubles. It is still not
possible to use % or format() on them without discarding all precision
beyond doubles. If you actually need long doubles (and if you don't, why
use them?) make sure your application includes a test for this ability. I
recommend checking repr(1+np.finfo(np.longdouble).eps).
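
A minimal sketch of such a test, assuming a recent NumPy in which `np.longdouble` accepts a string at full precision. The helper name `longdouble_roundtrips` is made up for this example, and `str()` is used instead of `repr()` because newer versions include the type name in the repr.

```python
import numpy as np


def longdouble_roundtrips(x):
    """Return True if x survives a trip through its decimal string."""
    x = np.longdouble(x)
    return bool(np.longdouble(str(x)) == x)


eps = np.finfo(np.longdouble).eps
one_plus_eps = np.longdouble(1) + eps

print(str(one_plus_eps))
print(longdouble_roundtrips(one_plus_eps))

# Where long double is wider than double, going through a Python float
# silently drops the extra bits, so 1 + eps collapses to exactly 1.0.
if eps < np.finfo(np.float64).eps:
    print(float(one_plus_eps) == 1.0)
```

On platforms where long double is the same as double, the round trip passes trivially and the last check is skipped.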

Anne

P.S. You can write (I have) a short piece of Cython code that will reliably
convert long doubles to a string and back, but on old versions of numpy it's
just not possible from within Python. -A
