Thomas Baruchel
2015-12-11 13:25:33 UTC
From time to time it is asked on forums how to extend precision of computation on Numpy array. The most common answer
given to this question is: use the dtype=object with some arbitrary precision module like mpmath or gmpy.
See http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra or http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath or http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values
While this is obviously the most relevant answer for many users because it will allow them to use Numpy arrays exactly
as they would have used them with native types, the wrong thing is that from some point of view "true" vectorization
will be lost.
With years I got very familiar with the extended double-double type which has (for usual architectures) about 32 accurate
digits with faster arithmetic than "arbitrary precision types". I even used it for research purpose in number theory and
I got convinced that it is a very wonderful type as long as such precision is suitable.
I often implemented it partially under Numpy, most of the time by trying to vectorize at a low-level the libqd library.
But I recently thought that a very nice and portable way of implementing it under Numpy would be to use the existing layer
of vectorization on floats for computing the arithmetic operations by "columns containing half of the numbers" rather than
by "full numbers". As a proof of concept I wrote the following file: https://gist.github.com/baruchel/c86ed748939534d8910d
I converted and vectorized the Algol 60 codes from http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf
(Dekker, 1971).
A test is provided at the end; for inverting 100,000 numbers, my type is about 3 or 4 times faster than GMPY and almost
50 times faster than MPmath. It should be even faster for some other operations since I had to create another np.ones
array for testing this type because inversion isn't implemented here (which could of course be done). You can run this file by yourself
(maybe you will have to discard mpmath or gmpy if you don't have it).
I would like to discuss about the way to make available something related to that.
a) Would it be relevant to include that in Numpy ? (I would think to some "contribution"-tool rather than including it in
the core of Numpy because it would be painful to code all ufuncs; on the other hand I am pretty sure that many would be happy
to perform several arithmetic operations by knowing that they can't use cos/sin/etc. on this type; in other words, I am not
sure it would be a good idea to embed it as an every-day type but I think it would be nice to have it quickly available
in some way). If you agree with that, in which way should I code it (the current link only is a "proof of concept"; I would
be very happy to code it in some cleaner way)?
b) Do you think such attempt should remain something external to Numpy itself and be released on my Github account without being
integrated to Numpy?
Best regards,
given to this question is: use the dtype=object with some arbitrary precision module like mpmath or gmpy.
See http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra or http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath or http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values
While this is obviously the most relevant answer for many users because it will allow them to use Numpy arrays exactly
as they would have used them with native types, the wrong thing is that from some point of view "true" vectorization
will be lost.
With years I got very familiar with the extended double-double type which has (for usual architectures) about 32 accurate
digits with faster arithmetic than "arbitrary precision types". I even used it for research purpose in number theory and
I got convinced that it is a very wonderful type as long as such precision is suitable.
I often implemented it partially under Numpy, most of the time by trying to vectorize at a low-level the libqd library.
But I recently thought that a very nice and portable way of implementing it under Numpy would be to use the existing layer
of vectorization on floats for computing the arithmetic operations by "columns containing half of the numbers" rather than
by "full numbers". As a proof of concept I wrote the following file: https://gist.github.com/baruchel/c86ed748939534d8910d
I converted and vectorized the Algol 60 codes from http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf
(Dekker, 1971).
A test is provided at the end; for inverting 100,000 numbers, my type is about 3 or 4 times faster than GMPY and almost
50 times faster than MPmath. It should be even faster for some other operations since I had to create another np.ones
array for testing this type because inversion isn't implemented here (which could of course be done). You can run this file by yourself
(maybe you will have to discard mpmath or gmpy if you don't have it).
I would like to discuss about the way to make available something related to that.
a) Would it be relevant to include that in Numpy ? (I would think to some "contribution"-tool rather than including it in
the core of Numpy because it would be painful to code all ufuncs; on the other hand I am pretty sure that many would be happy
to perform several arithmetic operations by knowing that they can't use cos/sin/etc. on this type; in other words, I am not
sure it would be a good idea to embed it as an every-day type but I think it would be nice to have it quickly available
in some way). If you agree with that, in which way should I code it (the current link only is a "proof of concept"; I would
be very happy to code it in some cleaner way)?
b) Do you think such attempt should remain something external to Numpy itself and be released on my Github account without being
integrated to Numpy?
Best regards,
--
Thomas Baruchel
Thomas Baruchel