Discussion:
[Numpy-discussion] binary wheels for numpy?
Chris Barker
10 years ago
Hi folks,

I did a little "intro to scipy" session as part of a larger Python class
the other day, and was dismayed to find that "pip install numpy" still
doesn't work on Windows.

Thanks mostly to Matthew Brett's work, the whole scipy stack is
pip-installable on OS-X; it would be really nice if we had that for Windows.

And no, saying "you should go get Python(x,y) or Anaconda, or Canopy,
or..." is really not a good solution. That is indeed the way to go if
someone is primarily focusing on computational programming, but if you have
a web developer, or someone new to Python for general use, they really
should be able to just grab numpy and play around with it a bit without
having to start all over again.


My solution was to point folks to Christoph Gohlke's site -- which is a
fabulous resource --

THANK YOU CHRISTOPH!

But I still think that we should have the basic scipy stack on PyPI as
Windows Wheels...

IIRC, the last run through on this discussion got stuck on the "what
hardware should it support" -- wheels do not allow a selection at install
time, so we'd have to decide what instruction set to support, and just
stick with that. Which would mean that:

some folks would get a numpy/scipy that would run a bit slower than it might
and
some folks would get one that wouldn't run at all on their machine.

But I don't see any reason that we can't find a compromise here -- do a
build that supports most machines, and be done with it. Even now, people
have to go get (one way or another) a MKL-based build to get optimum
performance anyway -- so if we pick an instruction set supported by, say (an
arbitrary, and impossible to determine) 95% of machines out there -- we're
good to go.

I take it there are licensing issues that prevent us from putting Christoph's
binaries up on PyPI?

But are there technical issues I'm forgetting here, or do we just need to
come to a consensus as to hardware version to support and do it?

-Chris
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

***@noaa.gov
Matthew Brett
10 years ago
Hi,
...
Yes, unfortunately we can't put MKL binaries on pypi because of the
MKL license - see
https://github.com/numpy/numpy/wiki/Numerical-software-on-Windows#blas--lapack-libraries.
Also see discussion in the containing thread of
http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069701.html
.
Post by Chris Barker
But are there technical issues I'm forgetting here, or do we just need to
come to a consensus as to hardware version to support and do it?
There has been some progress on this - see

https://github.com/scipy/scipy/issues/4829

I think there's a move afoot to have a Google hangout or similar on
this exact topic :
https://github.com/scipy/scipy/issues/2829#issuecomment-101303078 -
maybe we could hammer out a policy there? Once we have got numpy and
scipy built in a reasonable way, I think we will be most of the way
there...

Cheers,

Matthew
Chris Barker - NOAA Federal
10 years ago
Thanks for the update Matthew, it's great to see so much activity on this issue.

Looks like we are headed in the right direction -- and getting close.

Thanks to all that are putting time into this.

-Chris
...
Ralf Gommers
10 years ago
...
There's the switch to OpenBLAS and building the right selection mechanism
for which arch to use:
http://article.gmane.org/gmane.comp.python.distutils.devel/20350. That
seems now feasible to complete on a reasonable time-scale, and the problems
with OpenBLAS seem to be mostly solved. Binaries which crash for ~1% of
users (which ATLAS-SSE2 would result in) are still not acceptable I think.

Ralf
...
Chris Barker
10 years ago
Binaries which crash for ~1% of users (which ATLAS-SSE2 would result in)
are still not acceptable I think.
What instruction set would an OpenBLAS build support? Wouldn't we still
need to select a lowest-common-denominator instruction set to support?

And SSE2 was introduced with the Pentium 4 in 2001 -- that is a very long
time ago!

I think the 1% number came from a survey of firefox downloads -- that may
well not be representative of the numpy-using population.

and depending on HOW it failed, 1% might be OK if we could give a
reasonable error message (which maybe we can't...)

-Chris
Matthew Brett
10 years ago
Binaries which crash for ~1% of users (which ATLAS-SSE2 would result in)
are still not acceptable I think.
What instruction set would an OpenBLAS build support? Wouldn't we still need
to select a lowest-common-denominator instruction set to support?
I believe OpenBLAS does run-time selection too.
And SSE2 was introduced with the Pentium 4 in 2001 -- that is a very long
time ago!
I think the 1% number came from a survey of firefox downloads -- that may
well not be representative of the numpy-using population.
and depending on HOW it failed, 1% might be OK if we could give a reasonable
error message (which maybe we can't...)
I think we discussed before having a check and error clause in
__init__.py saying something like "You have a really old computer, you
can't use this binary, please go to sourceforge and download the exe
installer...".
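
A minimal sketch of what such a check might look like (this is not code
from the thread or from numpy itself). It assumes Windows, where the Win32
call IsProcessorFeaturePresent reports SSE2 support; the function names
has_sse2 and check_cpu and the wording of the message are made up for
illustration.

```python
import ctypes
import sys

# Win32 constant for IsProcessorFeaturePresent: the SSE2 instruction set.
PF_XMMI64_INSTRUCTIONS_AVAILABLE = 10


def has_sse2():
    """Return True or False on Windows, or None when we cannot tell."""
    if sys.platform != "win32":
        return None  # this sketch only knows how to ask Windows
    kernel32 = ctypes.windll.kernel32
    return bool(
        kernel32.IsProcessorFeaturePresent(PF_XMMI64_INSTRUCTIONS_AVAILABLE)
    )


def check_cpu(probe=has_sse2):
    """Raise a readable ImportError instead of crashing on an old CPU."""
    if probe() is False:
        raise ImportError(
            "This binary build requires a CPU with SSE2 support, which this "
            "machine does not appear to have. Please install a build made "
            "for older hardware instead."
        )


# Hypothetical placement: run once at the top of the package's __init__.py,
# before any compiled extension module is imported.
check_cpu()
```

The probe is passed in as an argument only so the failure path can be
exercised on a machine that does have SSE2.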

Matthew
Chris Barker
10 years ago
Post by Matthew Brett
I believe OpenBLAS does run-time selection too.
very cool! then an excellent option if we can get it to work (make that you
can get it to work, I'm not doing squat in this effort other than
nudging...)

Post by Matthew Brett
I think we discussed before having a check and error clause in
__init__.py saying something like "You have a really old computer, you
can't use this binary, please go to sourceforge and download the exe
installer...".
If we can do that, then there is NO reason not to put up binaries that
_may_ not support some tiny percentage of users.

though maybe with OpenBLAS we don't need to anyway.

Thanks again to y'all for working on this.

-Chris
Sturla Molden
10 years ago
Post by Matthew Brett
I believe OpenBLAS does run-time selection too.
very cool! then an excellent option if we can get it to work (make that
you can get it to work, I'm not doing squat in this effort other than
nudging...)
Carl Kleffner has built binary wheels for NumPy and SciPy with OpenBLAS
configured for run-time hardware detection. I don't remember off the top
of my head where you can download them for testing. IIRC the remaining
test failures were not related to OpenBLAS.

Sturla
Carl Kleffner
10 years ago
numpy and scipy wheels for python2.6-3.4 have been uploaded on binstar last
month and are installable with pip:

https://binstar.org/carlkl/numpy
https://binstar.org/carlkl/scipy

The toolchains can be downloaded from
https://bitbucket.org/carlkl/mingw-w64-for-python/downloads with some
explanations given in
https://bitbucket.org/carlkl/mingw-w64-for-python/downloads/mingwpy-2015-04-readme.pdf

Carl
...
Nathaniel Smith
10 years ago
Post by Ralf Gommers
There's the switch to OpenBLAS and building the right selection mechanism
for which arch to use:
http://article.gmane.org/gmane.comp.python.distutils.devel/20350. That seems
now feasible to complete on a reasonable time-scale, and the problems with
OpenBLAS seem to be mostly solved. Binaries which crash for ~1% of users
(which ATLAS-SSE2 would result in) are still not acceptable I think.
Where are you getting this SSE2 number from btw? The most detailed
public survey source for consumer hardware that I know is the Steam
hardware survey:

http://store.steampowered.com/hwsurvey

It's somewhat biased towards higher-end hardware b/c it targets
gamers, but there is plenty of less-high-end hardware on there as well
-- notice that 20% of the surveyed computers are using Intel graphics.
And they're reporting that 99.92% of surveyed computers have SSE*3*
support, and 100.00% have SSE2. So assuming the significant digits are
accurate, this puts the upper bound on SSE2 failure on these systems
at ~0.05%. Even if gamers are 10x likelier to have new hardware than
the rest of the world, 1% still seems to be at least an order of
magnitude too high?

-n
--
Nathaniel J. Smith -- http://vorpus.org
Ralf Gommers
10 years ago
Post by Nathaniel Smith
Post by Ralf Gommers
There's the switch to OpenBLAS and building the right selection mechanism
for which arch to use:
http://article.gmane.org/gmane.comp.python.distutils.devel/20350. That
seems now feasible to complete on a reasonable time-scale, and the problems
with OpenBLAS seem to be mostly solved. Binaries which crash for ~1% of
users (which ATLAS-SSE2 would result in) are still not acceptable I think.
Where are you getting this SSE2 number from btw?
This is info Matthew just collected from Firefox crash reports:
https://github.com/scipy/scipy/issues/4829#issuecomment-100354752

The most detailed
...
That would make life easier.....

Ralf
Nathaniel Smith
10 years ago
...
Ah, hmm. I guess it's possible that decade-old machines are less
reliable and overrepresented in crash reports, but who knows :-)

It might become reasonable at some point to just go ahead and put up
binaries (ideally with some check so that they fail in a
human-readable way), and see how many people email us. If it's too
many we can always take the wheels down again.

-n
--
Nathaniel J. Smith -- http://vorpus.org
Ralf Gommers
10 years ago
...
We should probably do that for the next release, if and only if we cannot
make the switch to OpenBLAS in time.

Ralf
Sturla Molden
10 years ago
Post by Matthew Brett
Yes, unfortunately we can't put MKL binaries on pypi because of the
MKL license - see
I believe we can, because we asked Intel for permission. From what I heard
the response was positive.

But it doesn't mean we should. :-)

Sturla
Matthew Brett
10 years ago
Post by Sturla Molden
Post by Matthew Brett
Yes, unfortunately we can't put MKL binaries on pypi because of the
MKL license - see
I believe we can, because we asked Intel for permission. From what I heard
the response was positive.
We would need something formal from Intel saying that they do not
require us to hold our users to their standard redistribution terms
and that they waive the requirement that we be responsible for any
damage to Intel that happens as a result of people using our binaries.

I'm guessing we don't have this, but I'm happy to be corrected,

Cheers,

Matthew
Ralf Gommers
10 years ago
...
We only have an email, probably not enough. I'd rather not go to the
trouble of discussing something more formal unless we are really sure that
we actually want to distribute MKL binaries. Which isn't too likely I
suspect; OpenBLAS seems like the way to go (?).

Ralf
Sturla Molden
10 years ago
I suspect; OpenBLAS seems like the way to go (?).
I think OpenBLAS is currently the most promising candidate to replace
ATLAS. But we need to build OpenBLAS with MinGW gcc, due to AT&T syntax
in the assembly code. I am not sure if the old toolchain is good enough,
or if we will need Carl Kleffner's binaries.

Sturla
Nathaniel Smith
10 years ago
Post by Sturla Molden
I suspect; OpenBLAS seems like the way to go (?).
I think OpenBLAS is currently the most promising candidate to replace
ATLAS. But we need to build OpenBLAS with MinGW gcc, due to AT&T syntax
in the assembly code. I am not sure if the old toolchain is good enough,
or if we will need Carl Kleffner's binaries.
The old toolchain is 32-bit only, so it certainly won't be a general solution.
--
Nathaniel J. Smith -- http://vorpus.org
Robert Kern
10 years ago
...
I don't think permission from Intel is the blocking issue for putting these
binaries up on PyPI. Even with Intel's permission, we would be putting up
proprietary binaries on a page that is explicitly claiming that the files
linked therein are BSD-licensed. The binaries could not be redistributed
with any GPLed module, say, pygsl.

We could host them on numpy.org on their own page that clearly explained
the license of those files, but I think PyPI is out.

--
Robert Kern
Chris Barker
10 years ago
Post by Robert Kern
I don't think permission from Intel is the blocking issue for putting
these binaries up on PyPI. Even with Intel's permission, we would be
putting up proprietary binaries on a page that is explicitly claiming that
the files linked therein are BSD-licensed. The binaries could not be
redistributed with any GPLed module, say, pygsl.
We could host them on numpy.org on their own page that clearly explained
the license of those files, but I think PyPI is out.
Can't PyPI redirect -- so they can actually be hosted somewhere else, but
"pip install numpy" would still work?

IIUC, the Intel libs have the great advantage of run-time selection of
hardware specific code -- yes? So they would both work and give high
performance on most machines (all?). Much as I am a fan of open source,
there doesn't appear to be anything as good out there.

-CHB
Nathaniel Smith
10 years ago
...
There are two issues here: (1) we can't actually use the Intel stuff
(MKL, icc) under its regular license without having our release
managers accepting personal liability. Which isn't going to happen.
(2) The problem isn't whether they're hosted on PyPI, it's whether the
people downloading them get warned about what they're downloading. The
whole point is that we *don't* want 'pip install numpy' to work in
this case, because it's too seamless.

-n
--
Nathaniel J. Smith -- http://vorpus.org
Matthew Brett
10 years ago
...
I'd add Robert's point - we will have made the default install
something that is not compatible with GPL libraries.

Matthew
Albert-Jan Roskam
10 years ago
----- Original Message -----
...
But you could use --allow-external or --allow-all-external:


--allow-external <package>

Allow the installation of a package even if it is externally hosted

--allow-all-external

Allow the installation of all packages that are externally hosted

https://pip.pypa.io/en/latest/reference/pip_wheel.html#allow-external
Sturla Molden
10 years ago
Post by Chris Barker
IIUC, The Intel libs have the great advantage of run-time selection of
hardware specific code -- yes? So they would both work and give high
performance on most machines (all?).
OpenBLAS can also be built for dynamic architecture with hardware
auto-detection. IIRC you build with DYNAMIC_ARCH=1 instead of specifying
TARGET.

Apple Accelerate Framework does this as well.
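
To illustrate the idea of run-time selection in general terms, here is a
toy sketch in Python. It is not how OpenBLAS or Accelerate are implemented;
the kernel names and feature sets are hypothetical, and a real library
would detect the CPU features itself and dispatch to hand-tuned routines.

```python
# Toy model of run-time kernel selection. All names here are hypothetical.

def dot_generic(a, b):
    """Portable fallback that runs on any CPU."""
    return sum(x * y for x, y in zip(a, b))


def dot_sse2(a, b):
    """Stand-in for a kernel tuned for SSE2 (same maths, for illustration)."""
    return sum(x * y for x, y in zip(a, b))


def dot_avx(a, b):
    """Stand-in for a kernel tuned for AVX (same maths, for illustration)."""
    return sum(x * y for x, y in zip(a, b))


# Ordered from most to least demanding; the last entry requires nothing,
# so a kernel is always found.
KERNELS = [
    ({"sse2", "avx"}, dot_avx),
    ({"sse2"}, dot_sse2),
    (set(), dot_generic),
]


def select_kernel(cpu_features):
    """Return the first kernel whose required features are all present."""
    for required, kernel in KERNELS:
        if required <= set(cpu_features):
            return kernel


# The choice happens once, at load time, on the user's machine,
# rather than being fixed when the binary was built.
dot = select_kernel({"sse2"})
print(dot.__name__, dot([1, 2, 3], [4, 5, 6]))
```

In this picture, a build with a fixed TARGET corresponds to shipping only
one of these functions, while a dynamic build ships all of them and picks
at load time.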


Sturla
j***@gmail.com
10 years ago
...
Unrelated to the pip/wheel discussion.

In my experience, by far the easiest way to get something running to play
with is WinPython. Download and unzip (and maybe add to the system path) and
most of the data analysis stack is available.

I haven't even bothered yet to properly install a full "system python" on
my Windows machine. I'm just working with three WinPython installations. (One even has Julia
and IJulia included after following the installation instructions for a
short time.)

Josef
...
Jaime Fernández del Río
10 years ago
...
+1 on WinPython. I have half a dozen "installations" of it, none registered
with Windows.

Jaime
--
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.
Chris Barker
10 years ago
Post by j***@gmail.com
Unrelated to the pip/wheel discussion.
In my experience by far the easiest to get something running to play with
is using Winpython. Download and unzip (and maybe add to system path) and
most of the data analysis stack is available.
Sure -- if someone comes to me wanting to use Python for
scientific or computational work, I point them to one of the
distributions -- maybe I'll add WinPython to that list now.

But if someone is already using Python for, say, web development, then they
already have an installation up and running, and I want to give them an
easy way to add numpy (and secondarily scipy) to what they have.

And it looks like we are almost there, thanks to a lot of work by a few key
folks -- thanks!

-Chris