Matthew Brett
2016-03-04 04:42:42 UTC
Hi,
Summary:
I propose that we upload Windows wheels to pypi. The wheels are
likely to be stable and relatively easy to maintain, but will have
slower performance than other versions of numpy linked against faster
BLAS / LAPACK libraries.
Background:
There's a long discussion going on at github issue #5479 [1], where
the old problem of Windows wheels for numpy came up.
For those of you not following this issue, the current situation for
community-built numpy Windows binaries is dire:
* We have not so far provided Windows wheels on pypi, so `pip install
numpy` on Windows will bring you a world of pain;
* Until recently we did provide .exe "superpack" installers on
sourceforge, but these became increasingly difficult to build and we
gave up building them as of the latest (1.10.4) release.
Despite this, Windows wheels on pypi are very popular. A few
weeks ago, Donald Stufft ran a query for the binary wheels most often
downloaded from pypi, for any platform [2]. The top five most
downloaded were (n_downloads, name):
6646, numpy-1.10.4-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
5445, cryptography-1.2.1-cp27-none-win_amd64.whl
5243, matplotlib-1.4.0-cp34-none-win32.whl
5241, scikit_learn-0.15.1-cp34-none-win32.whl
4573, pandas-0.17.1-cp27-none-win_amd64.whl
So a) the OSX numpy wheel is very popular and b) despite the fact that
we don't provide a numpy wheel for Windows, matplotlib, scikit_learn
and pandas, which depend on numpy, are the 3rd, 4th and 5th most
downloaded wheels as of a few weeks ago.
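For anyone who wants to check the platform breakdown themselves, here is a rough sketch of how the platform can be read off the wheel filenames above. It assumes the standard wheel naming layout (name-version-python tag-abi tag-platform tag); the helper names `parse_wheel_filename` and `is_windows_wheel` are made up for this illustration and are not part of pip or any other tool.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version and tag components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    # 5 parts normally, 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %r" % filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def is_windows_wheel(filename):
    """True if any of the (dot-separated) platform tags is a Windows tag."""
    platform_tag = parse_wheel_filename(filename)["platform"]
    return any(tag.startswith("win") for tag in platform_tag.split("."))


# The five filenames quoted in this post.
top_five = [
    "numpy-1.10.4-cp27-none-macosx_10_6_intel.macosx_10_9_intel."
    "macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl",
    "cryptography-1.2.1-cp27-none-win_amd64.whl",
    "matplotlib-1.4.0-cp34-none-win32.whl",
    "scikit_learn-0.15.1-cp34-none-win32.whl",
    "pandas-0.17.1-cp27-none-win_amd64.whl",
]

windows_wheels = [name for name in top_five if is_windows_wheel(name)]

if __name__ == "__main__":
    for name in windows_wheels:
        print(parse_wheel_filename(name)["name"])
```

Run on the list above, this picks out every entry except the numpy OSX wheel as a Windows wheel.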
So, there seems to be a large appetite for numpy wheels.
Current proposal:
I have now built numpy wheels, using the ATLAS BLAS / LAPACK library -
the build is automatic and reproducible [3].
I chose ATLAS to build against, rather than, say, OpenBLAS, because
we've had some significant worries in the past about the reliability
of OpenBLAS, and I thought it better to err on the side of
correctness.
However, these builds are relatively slow for matrix multiply and
other linear algebra routines compared to numpy built against OpenBLAS
or MKL (which we cannot use because of its license) [4]. In my very
crude array test of a dot product and matrix inversion, the ATLAS
wheels were 2-3 times slower than MKL. Other benchmarks on Julia
found about the same result for ATLAS vs OpenBLAS on 32-bit, but a
much bigger difference on 64-bit (for an earlier version of ATLAS than
we are currently using) [5].
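For reference, a crude timing test of this kind could look something like the sketch below. This is not the exact script behind the numbers above; the matrix size, the repeat count and the function names `time_call` and `crude_benchmark` are arbitrary choices for illustration. Running it under two numpy builds (say, ATLAS and MKL) on the same machine gives a rough ratio, nothing more.

```python
import time

import numpy as np


def time_call(func, *args, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def crude_benchmark(n=500, seed=0):
    """Time a matrix product and a matrix inversion on n x n arrays."""
    rng = np.random.RandomState(seed)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal((n, n))
    return {
        "dot": time_call(np.dot, a, b),
        "inv": time_call(np.linalg.inv, a),
    }


if __name__ == "__main__":
    # Timings depend heavily on the machine and the BLAS / LAPACK build.
    for name, seconds in sorted(crude_benchmark().items()):
        print("%s: %.4f s" % (name, seconds))
```

`np.show_config()` prints the BLAS / LAPACK build information for the installed numpy, which is a handy check on what is actually being timed.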
So, our numpy wheels are likely to be stable and give correct results, but
will be somewhat slow for linear algebra.
I propose that we upload these ATLAS wheels to pypi. The upside is
that this gives our Windows users a much better experience with pip,
and allows other developers to build Windows wheels that depend on
numpy. The downside is that these will not be optimized for
performance on modern processors. In order to signal that, I propose
adding the following text to the numpy pypi front page:
```
All numpy wheels distributed from pypi are BSD licensed.
Windows wheels are linked against the ATLAS BLAS / LAPACK library,
restricted to SSE2 instructions, so may not give optimal linear
algebra performance for your machine. See
http://docs.scipy.org/doc/numpy/user/install.html for alternatives.
```
In a way this is very similar to our previous situation, in that the
superpack installers also used ATLAS - in fact an older version of
ATLAS.
Once we are up and running with numpy wheels, we can consider whether
we should switch to other BLAS libraries, such as OpenBLAS or BLIS
(see [6]).
I'm posting here hoping for your feedback...
Cheers,
Matthew
[1] https://github.com/numpy/numpy/issues/5479
[2] https://gist.github.com/dstufft/1dda9a9f87ee7121e0ee
[3] https://ci.appveyor.com/project/matthew-brett/np-wheel-builder
[4] http://mingwpy.github.io/blas_lapack.html#intel-math-kernel-library
[5] https://github.com/numpy/numpy/issues/5479#issuecomment-185033668
[6] https://github.com/numpy/numpy/issues/7372