Discussion:
Default builds of OpenBLAS development branch are now fork safe
Nathaniel Smith
2014-02-20 00:25:40 UTC
Hey all,

Just a heads up: thanks to the tireless work of Olivier Grisel, the
OpenBLAS development branch is now fork-safe when built with its default
threading support. (It is still not thread-safe when built using OMP for
threading and gcc, but this is not the default.)

Gory details: https://github.com/xianyi/OpenBLAS/issues/294

Check it out - if it works you might want to consider lobbying your
favorite distro to backport it.
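
For anyone who wants to try this out, the following is a minimal sketch of a fork-safety check, not the test used in the OpenBLAS issue. The helper name `survives_fork` and the timeout are made up for illustration. It runs some work in the parent (so a threaded library gets a chance to start its thread pool), forks, and reports whether the same work completes in the child instead of crashing or hanging. It is Unix-only because it relies on os.fork().

```python
import os
import signal
import time


def survives_fork(work, timeout=10.0):
    """Run work() in the parent, then again in a forked child.

    Returns True if the child finishes cleanly within `timeout` seconds,
    False if it crashes or hangs (the typical symptom of a thread pool
    that is not fork-safe). Hypothetical helper, Unix only.
    """
    work()                        # warm up in the parent
    pid = os.fork()
    if pid == 0:                  # child process
        try:
            work()
            os._exit(0)
        except BaseException:
            os._exit(1)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
        time.sleep(0.05)
    os.kill(pid, signal.SIGKILL)  # child is stuck: kill it and reap it
    os.waitpid(pid, 0)
    return False
```

With numpy installed, one would pass something like `lambda: numpy.dot(a, a)` for a reasonably large array `a`, so that the BLAS actually goes multithreaded before the fork.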

-n
Julian Taylor
2014-02-20 10:32:01 UTC
Post by Nathaniel Smith
Hey all,
Just a heads up: thanks to the tireless work of Olivier Grisel, the OpenBLAS
development branch is now fork-safe when built with its default threading
support. (It is still not thread-safe when built using OMP for threading and
gcc, but this is not the default.)
Gory details: https://github.com/xianyi/OpenBLAS/issues/294
Check it out - if it works you might want to consider lobbying your favorite
distro to backport it.
debian unstable and the upcoming ubuntu 14.04 are already fixed.
Olivier Grisel
2014-02-20 12:09:06 UTC
Post by Julian Taylor
Post by Nathaniel Smith
Hey all,
Just a heads up: thanks to the tireless work of Olivier Grisel, the OpenBLAS
development branch is now fork-safe when built with its default threading
support. (It is still not thread-safe when built using OMP for threading and
gcc, but this is not the default.)
Gory details: https://github.com/xianyi/OpenBLAS/issues/294
Check it out - if it works you might want to consider lobbying your favorite
distro to backport it.
debian unstable and the upcoming ubuntu 14.04 are already fixed.
Nice!
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Sturla Molden
2014-02-20 13:28:52 UTC
Will this mean NumPy, SciPy et al. can start using OpenBLAS in the
"official" binary packages, e.g. on Windows and Mac OS X? ATLAS is slow and
Accelerate conflicts with fork as well.

Will dotblas be built against OpenBLAS? AFAIK, it is only built against
ATLAS or MKL, not any other BLAS, but it should just be a matter of
changing the build/link process.

Sturla
_______________________________________________
NumPy-Discussion mailing list
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Olivier Grisel
2014-02-20 14:40:12 UTC
Post by Sturla Molden
Will this mean NumPy, SciPy et al. can start using OpenBLAS in the
"official" binary packages, e.g. on Windows and Mac OS X? ATLAS is slow and
Accelerate conflicts with fork as well.
This is what I would like to do personally, ideally as a distribution of
wheel packages.

To do so I built the current develop branch of OpenBLAS with:

make USE_OPENMP=0 NUM_THREADS=32 NO_AFFINITY=1
make PREFIX=/opt/OpenBLAS-noomp install

Then I added a site.cfg file in the numpy source folder with the lines:

[openblas]
libraries = openblas
library_dirs = /opt/OpenBLAS-noomp/lib
include_dirs = /opt/OpenBLAS-noomp/include
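
If the install prefix changes between builds, the section above can be generated instead of edited by hand. This is only a sketch: the helper name `render_site_cfg` is hypothetical, and it assumes the usual `lib` and `include` subfolders under the prefix, as in the example above.

```python
import configparser
import io


def render_site_cfg(prefix, section="openblas", libraries="openblas"):
    """Return site.cfg text pointing one section at an install prefix."""
    cfg = configparser.ConfigParser()
    cfg[section] = {
        "libraries": libraries,
        "library_dirs": prefix + "/lib",       # assumed layout: <prefix>/lib
        "include_dirs": prefix + "/include",   # assumed layout: <prefix>/include
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Prints an [openblas] section like the one shown above.
    print(render_site_cfg("/opt/OpenBLAS-noomp"))
```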
Post by Sturla Molden
Will dotblas be built against OpenBLAS?
Yes:

$ ldd numpy/core/_dotblas.so
    linux-vdso.so.1 => (0x00007fff24d04000)
    libopenblas.so.0 => /opt/OpenBLAS-noomp/lib/libopenblas.so.0 (0x00007f432882f000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4328449000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f432814c000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f4327f2f000)
    libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00007f4327c18000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f43298d3000)
    libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007f43279e1000)


However, when testing this I noticed the following strangely slow import
and high memory usage:

>>> import os, psutil
>>> psutil.Process(os.getpid()).get_memory_info().rss / 1e6
20.324352
>>> %time import numpy
CPU times: user 1.95 s, sys: 1.3 s, total: 3.25 s
Wall time: 530 ms
>>> psutil.Process(os.getpid()).get_memory_info().rss / 1e6
349.507584

The libopenblas.so file is just 14MB so I don't understand where those
330MB come from.

It's even worse when using static linking (libopenblas.a instead of
libopenblas.so under Linux).

With Atlas or MKL I get import times under 50 ms and the memory
overhead of the numpy import is just ~15MB.

I would be very interested in any help on this:

- can you reproduce this behavior?
- do you have an idea of a possible cause?
- how to investigate?
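
One way to make the measurement reproducible is to run the import in a fresh interpreter each time, so nothing is already cached in the process. The sketch below is an assumption-laden variant of the session above: it swaps psutil for the standard-library `resource` module (Unix only), so it reports peak RSS (`ru_maxrss`, kilobytes on Linux, bytes on Mac OS X) rather than current RSS, and the helper name `measure_import` is made up.

```python
import json
import subprocess
import sys

# Code run in the child interpreter: time the import, report peak RSS.
_CHILD = """
import importlib, json, resource, sys, time
before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
start = time.perf_counter()
importlib.import_module(sys.argv[1])
elapsed = time.perf_counter() - start
after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(json.dumps({"seconds": elapsed,
                  "maxrss_before": before,
                  "maxrss_after": after}))
"""


def measure_import(module_name, env=None):
    """Import module_name in a fresh interpreter; return timing/memory stats."""
    out = subprocess.run(
        [sys.executable, "-c", _CHILD, module_name],
        check=True, capture_output=True, text=True, env=env,
    )
    # The library being imported may print warnings; the stats are the last line.
    return json.loads(out.stdout.strip().splitlines()[-1])
```

Calling `measure_import("numpy")` against two different builds gives comparable wall-clock times and peak-RSS figures for each.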
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Olivier Grisel
2014-02-20 14:42:36 UTC
FYI: to build scipy against OpenBLAS I used the following site.cfg at
the root of my scipy source folder:

[DEFAULT]
library_dirs = /opt/OpenBLAS-noomp/lib:/usr/local/lib
include_dirs = /opt/OpenBLAS-noomp/include:/usr/local/include

[blas_opt]
libraries = openblas

[lapack_opt]
libraries = openblas

But this is unrelated to the previous numpy memory pattern as it
occurs independently of scipy.
--
Olivier
Carl Kleffner
2014-02-20 14:43:06 UTC
Hi,

Some days ago I put some preliminary mingw-w64 binaries and code based on
Python 2.7 on my Google Drive to discuss them with Matthew Brett. Maybe it's
time for a broader discussion. IMHO it is ready for testing but not for
consumption.

url:
https://drive.google.com/folderview?id=0B4DmELLTwYmldUVpSjdpZlpNM1k&usp=sharing

contains:

(1) patches used

numpy.patch
scipy.patch


(2) 64 bit GCC toolchain
amd64/
mingw-w64-toolchain-static_amd64-gcc-4.8.2_vc90_rev-20140131.7z
libpython27.a

(3) numpy-1.8.0 linked against OpenBLAS
amd64/numpy-1.8.0/
numpy-1.8.0.win-amd64-py2.7.exe
numpy-1.8.0-cp27-none-win_amd64.whl
numpy_amd64_fcompiler.log
numpy_amd64_build.log
numpy_amd64_test.log
_numpyconfig.h
config.h

(4) scipy-0.13.3 linked against OpenBLAS
amd64/scipy-0.13.3/
scipy-0.13.3.win-amd64-py2.7.exe
scipy-0.13.3-cp27-none-win_amd64.whl
scipy_amd64_fcompiler.log
scipy_amd64_build.log
scipy_amd64_build_cont.log
scipy_amd64_test._segfault.log
scipy_amd64_test.log


(5) 32 bit GCC toolchain
win32/
mingw-w64-toolchain-static_win32-gcc-4.8.2_vc90_rev-20140131.7z
libpython27.a

(6) numpy-1.8.0 linked against OpenBLAS
win32/numpy-1.8.0/
numpy-1.8.0.win32-py2.7.exe
numpy-1.8.0-cp27-none-win32.whl
numpy_win32_fcompiler.log
numpy_win32_build.log
numpy_win32_test.log
_numpyconfig.h
config.h

(7) scipy-0.13.3 linked against OpenBLAS
win32/scipy-0.13.3/
scipy-0.13.3.win32-py2.7.exe
scipy-0.13.3-cp27-none-win32.whl
scipy_win32_fcompiler.log
scipy_win32_build.log
scipy_win32_build_cont.log
scipy_win32_test.log

Summary to compile numpy:

(1) <mingw>\bin and python should be in the PATH. Choose 32 bit or 64 bit
    architecture.
(2) copy libpython27.a to <python>\libs
    check that <python>\libs does not contain libmsvcr90.a
(3) apply numpy.patch
(4) copy libopenblas.dll from <mingw>\bin to numpy\core
    of course don't ever mix 32 bit and 64 bit code
(5) create a site.cfg in the numpy folder with the absolute path to the
    mingw import files/header files. I copied the openblas header files
    and import libs into the GCC toolchain.
(6) create a mingw distutils.cfg file
(7) test the configuration
    python setup.py config_fc --verbose
    and
    python setup.py build --help-fcompiler
(8) build
    python setup.py build --fcompiler=gnu95
(9) make a distro
    python setup.py bdist --format=wininst
(10) make a wheel
    wininst2wheel numpy-1.8.0.win32-py2.7.exe (for 32 bit)
(11) install
    wheel install numpy-1.8.0-cp27-none-win32.whl
(12) import numpy; numpy.test()
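
Steps (7) to (11) above can be scripted so that a run stops at the first failure. The sketch below only replays the commands exactly as listed (32 bit file names). The names `BUILD_STEPS` and `run_steps` are hypothetical, and it assumes the commands are run from the patched numpy source folder with the toolchain on the PATH. It defaults to a dry run that just echoes the commands.

```python
import subprocess

# Commands copied from steps (7)-(11) of the summary, 32 bit file names.
BUILD_STEPS = [
    ["python", "setup.py", "config_fc", "--verbose"],
    ["python", "setup.py", "build", "--help-fcompiler"],
    ["python", "setup.py", "build", "--fcompiler=gnu95"],
    ["python", "setup.py", "bdist", "--format=wininst"],
    ["wininst2wheel", "numpy-1.8.0.win32-py2.7.exe"],
    ["wheel", "install", "numpy-1.8.0-cp27-none-win32.whl"],
]


def run_steps(steps, cwd=".", dry_run=True):
    """Echo each command; execute it only when dry_run is False."""
    for argv in steps:
        print("$ " + " ".join(argv))
        if not dry_run:
            # check=True raises CalledProcessError at the first failing step
            subprocess.run(argv, cwd=cwd, check=True)
    return len(steps)
```

Calling `run_steps(BUILD_STEPS, cwd=r"path\to\numpy", dry_run=False)` would actually execute the steps; the path here is a placeholder.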

Summary to compile scipy:

(1) apply scipy.patch
(2) python setup.py build --fcompiler=gnu95
    and a second time
    python setup.py build --fcompiler=gnu95
(3) python setup.py bdist --format=wininst
(4) install
(5) import scipy; scipy.test()

Hints:

(1) libpython import file:

The libpython27.a import file has been generated with gendef and dlltool
according to the recommendations on the mingw-w64 FAQ site. It is
essential not to use import libraries from elsewhere, but to create them
with the tools in the GCC toolchain. The GCC toolchains contain correctly
generated msvcrXX import files by default.

(2) OpenBLAS:

The OpenBLAS DLL must be copied to numpy/core before building numpy. All
BLAS and LAPACK code will be linked dynamically to this DLL. Because of
this the overall distro size gets much smaller compared to numpy-MKL or
scipy-MKL. It is not necessary to add numpy/core to the path (at least on
my machine). To load libopenblas.dll into the process space it is only
necessary to import numpy, nothing else. libopenblas.dll is linked against
msvcr90.dll, just like Python. The DLL itself is a fat binary
containing all optimized kernels for all supported platforms. The DLL,
headers and import files have been included in the toolchain.

(3) mingw-w64 toolchain:

In short it is an extended version of the 'recommended' mingw-builds
toolchain with some minor patches and customizations. I used
https://github.com/niXman/mingw-builds for my build. It is a 'static'
build, thus all GCC-related runtimes are linked statically into the
resulting binaries.

(4) Results:

Some FAILS - see the corresponding log files. I got a segfault with
scipy.test() (64 bit) with multithreaded OpenBLAS (test_ssygv_1) but not in
single-threaded mode. Due to time constraints I haven't run further tests yet.

Regards

Carl
Olivier Grisel
2014-02-20 14:50:14 UTC
Thanks for sharing, this is all very interesting.

Have you tried to have a look at the memory usage and import time of
numpy when linked against libopenblas.dll?
--
Olivier
Julian Taylor
2014-02-20 15:01:12 UTC
On Thu, Feb 20, 2014 at 3:50 PM, Olivier Grisel
Post by Olivier Grisel
Thanks for sharing, this is all very interesting.
Have you tried to have a look at the memory usage and import time of
numpy when linked against libopenblas.dll?
--
This is probably caused by the memory warmup.
It can be disabled by building OpenBLAS with NO_WARMUP=1 (a build-time
option, e.g. set in Makefile.rule).
Carl Kleffner
2014-02-20 15:02:30 UTC
Good point, I didn't use this option.

Carl
Olivier Grisel
2014-02-20 15:30:55 UTC
Post by Julian Taylor
this is probably caused by the memory warmup
it can be disabled with NO_WARMUP=1 in some configuration file.
>>> import os, psutil
>>> psutil.Process(os.getpid()).get_memory_info().rss / 1e6
20.324352
>>> %time import numpy
CPU times: user 84 ms, sys: 464 ms, total: 548 ms
Wall time: 59.3 ms
>>> psutil.Process(os.getpid()).get_memory_info().rss / 1e6
27.906048

Thanks for the tip.
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Carl Kleffner
2014-02-20 15:01:25 UTC
I looked at the Task Manager: there is not much difference to numpy-MKL. I
didn't make any qualified measurements, however.

Carl
Post by Olivier Grisel
Thanks for sharing, this is all very interesting.
Have you tried to have a look at the memory usage and import time of
numpy when linked against libopenblas.dll?
--
Olivier
Olivier Grisel
2014-02-20 22:17:10 UTC
I had a quick look (without running the procedure) but I don't
understand some elements:

- apparently you never tell numpy's site.cfg or the scipy.cfg to use the
openblas lib, nor set the library_dirs: how does numpy.distutils know
that it should dynlink against numpy/core/libopenblas.dll?

- how do you deal with the link to the following libraries:
libgfortran.3.dll
libgcc_s.1.dll
libquadmath.0.dll

If MinGW is installed on the system I assume that the linker will find
them. But would it work when the wheel packages are installed on a
system that does not have MinGW installed?

Best,
--
Olivier
Carl Kleffner
2014-02-20 22:56:17 UTC
Hi,
Post by Olivier Grisel
I had a quick look (without running the procedure) but I don't
- apparently you never tell in the numpy's site.cfg nor the scipy.cfg
to use the openblas lib nor set the
library_dirs: how does numpy.distutils know that it should dynlink
against numpy/core/libopenblas.dll
numpy's site.cfg is something like: (64 bit)

[openblas]
libraries = openblas
library_dirs = D:/devel/mingw64static/x86_64-w64-mingw32/lib
include_dirs = D:/devel/mingw64static/x86_64-w64-mingw32/include

or (32 bit)

[openblas]
libraries = openblas
library_dirs = D:/devel32/mingw32static/i686-w64-mingw32/lib
include_dirs = D:/devel32/mingw32static/i686-w64-mingw32/include

Please adapt the paths of course and apply the patches to numpy.
Post by Olivier Grisel
libgfortran.3.dll
libgcc_s.1.dll
libquadmath.0.dll
You won't need them. I built the toolchain statically, so you don't have
to mess with GCC runtime libs. You can check the dependencies with MS
depends or with ntldd (included in the toolchain).
Post by Olivier Grisel
If MinGW is installed on the system I assume that the linker will find
them. But would it work when the wheel packages are installed on a
system that does not have MinGW installed?
The wheels should be sufficient regardless of whether you have mingw
installed or not.

with best Regards

Carl
Olivier Grisel
2014-02-21 11:41:39 UTC
Post by Carl Kleffner
Hi,
Post by Olivier Grisel
I had a quick look (without running the procedure) but I don't
- apparently you never tell in the numpy's site.cfg nor the scipy.cfg
to use the openblas lib nor set the
library_dirs: how does numpy.distutils know that it should dynlink
against numpy/core/libopenblas.dll
numpy's site.cfg is something like: (64 bit)
[openblas]
libraries = openblas
library_dirs = D:/devel/mingw64static/x86_64-w64-mingw32/lib
include_dirs = D:/devel/mingw64static/x86_64-w64-mingw32/include
or (32 bit)
[openblas]
libraries = openblas
library_dirs = D:/devel32/mingw32static/i686-w64-mingw32/lib
include_dirs = D:/devel32/mingw32static/i686-w64-mingw32/include
Thanks, what I don't understand is how the libopenblas.dll will be
found at runtime. Is it a specific "feature" of windows? For instance
how would the scipy/linalg/_*.so file know that the libopenblas.dll
can be found in $PYTHONPATH/numpy/core?
Post by Carl Kleffner
Please adapt the paths of course and apply the patches to numpy.
Post by Olivier Grisel
libgfortran.3.dll
libgcc_s.1.dll
libquadmath.0.dll
You won't need them. I build the toolchain statically. Thus you don't have
to mess up with GCC runtime libs. You can check the dependencies with MS
depends or with ntldd (included in the toolchain)
Great! I did not know it was possible. I guess that if we want to
replicate that for Linux and Mac we will have to also build custom
static GCC toolchains as well. Is there a good reference doc somewhere
on how to do so? When googling I only find posts by people who cannot
make their toolchain build statically correctly.
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Olivier Grisel
2014-03-26 15:27:56 UTC
Hi Carl,

I installed Python 2.7.6 64 bits on a windows server instance from
rackspace cloud and then ran get-pip.py and then could successfully
install the numpy and scipy wheel packages from your google drive
folder. I tested dot products and scipy.linalg.svd and they work as
expected.

Then I uncompressed your mingw toolchain in c:\mingw, put c:\mingw\bin
in my PATH and tried to build the scikit-learn git master with it,
however it fails with:

building 'sklearn.__check_build._check_build' extension
compiling C sources
C compiler: gcc -DMS_WIN64 -O2 -msse -msse2 -Wall -Wstrict-prototypes

compile options: '-D__MSVCRT_VERSION__=0x0900 -Ic:\Python27\lib\site-packages\numpy\core\include -Ic:\Python27\lib\site-packages\numpy\core\include -Ic:\Python27\include -Ic:\Python27\PC -c'
gcc -DMS_WIN64 -O2 -msse -msse2 -Wall -Wstrict-prototypes -D__MSVCRT_VERSION__=0x0900 -Ic:\Python27\lib\site-packages\numpy\core\include -Ic:\Python27\lib\site-packages\numpy\core\include -Ic:\Python27\include -Ic:\Python27\PC -c sklearn\__check_build\_check_build.c -o build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o
Found executable c:\mingw\bin\gcc.exe
gcc -shared -Wl,-gc-sections -Wl,-s build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o -Lc:\Python27\libs -Lc:\Python27\PCbuild\amd64 -Lbuild\temp.win-amd64-2.7 -lpython27 -lmsvcr90 -o build\lib.win-amd64-2.7\sklearn\__check_build\_check_build.pyd
build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o:_check_build.c:(.text+0x3): undefined reference to `__imp__Py_NoneStruct'
build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o:_check_build.c:(.text+0x1ca): undefined reference to `__imp__PyThreadState_Current'
build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o:_check_build.c:(.text+0x405): undefined reference to `__imp_PyExc_ImportError'
c:/mingw/bin/../lib/gcc/x86_64-w64-mingw32/4.8.2/../../../../x86_64-w64-mingw32/bin/ld.exe: build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o: bad reloc address 0x0 in section `.data'
collect2.exe: error: ld returned 1 exit status
error: Command "gcc -shared -Wl,-gc-sections -Wl,-s build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o -Lc:\Python27\libs -Lc:\Python27\PCbuild\amd64 -Lbuild\temp.win-amd64-2.7 -lpython27 -lmsvcr90 -o build\lib.win-amd64-2.7\sklearn\__check_build\_check_build.pyd" failed with exit status 1

Furthermore, when I try to introspect the blas information on this box I get:

In [1]: import scipy
C:\Python27\lib\site-packages\numpy\core\__init__.py:6: Warning: Numpy
64bit experimental build with Mingw-w64 and OpenBlas. Use with care.
from . import multiarray
OpenBLAS : Your OS does not support AVX instructions. OpenBLAS is
using Barcelona kernels as a fallback, which may give poorer
performance.

In [2]: scipy.show_config()
umfpack_info:
NOT AVAILABLE
lapack_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['D:/devel/mingw64static/x86_64-w64-mingw32/lib']
language = f77
blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['D:/devel/mingw64static/x86_64-w64-mingw32/lib']
language = f77
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['D:/devel/mingw64static/x86_64-w64-mingw32/lib']
language = f77
blas_mkl_info:
NOT AVAILABLE

In [3]: from numpy.distutils.system_info import get_info

In [4]: get_info('blas_opt')
C:\Python27\lib\site-packages\numpy\distutils\system_info.py:576:
UserWarning: Specified path
D:/devel/mingw64static/x86_64-w64-mingw32/lib is invalid.
warnings.warn('Specified path %s is invalid.' % d)
C:\Python27\lib\site-packages\numpy\distutils\system_info.py:1522: UserWarning:
Atlas (http://math-atlas.sourceforge.net/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [atlas]) or by setting
the ATLAS environment variable.
warnings.warn(AtlasNotFoundError.__doc__)
C:\Python27\lib\site-packages\numpy\distutils\system_info.py:1531: UserWarning:
Blas (http://www.netlib.org/blas/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [blas]) or by setting
the BLAS environment variable.
warnings.warn(BlasNotFoundError.__doc__)
C:\Python27\lib\site-packages\numpy\distutils\system_info.py:1534: UserWarning:
Blas (http://www.netlib.org/blas/) sources not found.
Directories to search for the sources can be specified in the
numpy/distutils/site.cfg file (section [blas_src]) or by setting
the BLAS_SRC environment variable.
warnings.warn(BlasSrcNotFoundError.__doc__)
Out[4]: {}

Would it make sense to embed the BLAS and LAPACK header files as part
of this numpy wheel and make numpy.distutils.system_info return the
lib and include folders pointing to the embedded libopenblas.dll and
header files, so as to make third party libraries directly buildable
against those?
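
To make the proposal concrete, here is a rough sketch of what such a lookup could return. It is not existing numpy.distutils code: the function name `embedded_blas_info`, the `include` subfolder, and the file-name matching are all assumptions for illustration. The dict keys mirror the ones printed by scipy.show_config() above.

```python
import os


def embedded_blas_info(package_dir, libname="openblas"):
    """Build a get_info()-style dict for a BLAS shipped inside a package.

    Hypothetical helper: package_dir would be e.g. the installed numpy/core
    folder, with headers assumed to live in an `include` subfolder.
    Returns {} (as get_info does when nothing is found) if no library file
    with a matching name is present.
    """
    names = [f for f in os.listdir(package_dir)
             if f.startswith("lib" + libname) or f.startswith(libname)]
    if not names:
        return {}
    return {
        "libraries": [libname],
        "library_dirs": [package_dir],
        "include_dirs": [os.path.join(package_dir, "include")],
        "language": "f77",
    }
```

A third-party setup.py could then pass these entries to its extension definitions instead of relying on the build machine's D:/devel/... paths, which do not exist on the user's box.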
--
Olivier
Julian Taylor
2014-03-26 18:34:39 UTC
Post by Olivier Grisel
Hi Carl,
I installed Python 2.7.6 64 bits on a windows server instance from
rackspace cloud and then ran get-pip.py and then could successfully
install the numpy and scipy wheel packages from your google drive
folder. I tested dot products and scipy.linalg.svd and they work as
expected.
Would it make sense to embed the blas and lapack header files as part
of this numpy wheel and make numpy.distutils.system_info return the
lib and include folder pointing to the embedded libopenblas.dll and
header files so has to make third party libraries directly buildable
against those?
As for using OpenBLAS by default in binary builds: no.
The pthread OpenBLAS build is now fork-safe, which is great, but it is still
not reliable enough for a default.
E.g. the current latest release 0.2.8 still has one crash bug on
dgemv[1], and wrong results for zherk/zer2[2] and dgemv/cgemv[3].
Git head has the former four fixed but still has wrong results for cgemv.
The not so old 0.2.8 also fixed a whole bunch more crashes and wrong
result issues (crashes on QR, uninitialized data use in dgemm, ...).
None of the fixes received unit tests, so I am somewhat pessimistic that
it will improve, especially as the maintainer is dissertating (is that
the right word?) and most of the code is assembler that only a few people
can write (it is simply not required anymore, we have code generators
and intrinsics for that).

Openblas is great if you do not have the patience to build ATLAS and
only use a restricted set of functionality and platforms you can easily
test.
Currently it is in my opinion not suitable for a general purpose library
like numpy.

I don't have any objections to adding get_info("openblas") if that does
not work yet. Patches welcome.

[1] https://github.com/xianyi/OpenBLAS/issues/304
[2] https://github.com/xianyi/OpenBLAS/issues/333
[3] https://github.com/xianyi/OpenBLAS/issues/340
Nathaniel Smith
2014-03-26 20:41:28 UTC
On Wed, Mar 26, 2014 at 7:34 PM, Julian Taylor
Post by Julian Taylor
as for using openblas by default in binary builds, no.
pthread openblas build is now fork safe which is great but it is still
not reliable enough for a default.
E.g. the current latest release 0.2.8 still has one crash bug on
dgemv[1], and wrong results zherk/zer2[2] and dgemv/cgemv[3].
git head has the former four fixed bug still has wrong results for cgemv.
The not so old 0.2.8 also fixed whole bunch more crashes and wrong
result issues (crashes on QR, uninitialized data use in dgemm, ...).
None of the fixes received unit tests, so I am somewhat pessimistic that
it will improve, especially as the maintainer is dissertating (is that
the right word?) and most of the code is assembler code only few people
can write (it is simply not required anymore, we have code generators
and intrinsics for that).
Openblas is great if you do not have the patience to build ATLAS and
only use a restricted set of functionality and platforms you can easily
test.
Currently it is in my opinion not suitable for a general purpose library
like numpy.
Those problems you list are pretty damning, but neither is it
reasonable to expect everyone to manually build ATLAS on every machine
they use (or their students use, or...) :-(. So what other options do
we have for general purpose builds? Give up and use MKL? How's
eigen-blas doing these days? (I guess from skimming their docs they
use OpenMP?)
--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
Julian Taylor
2014-03-26 21:08:11 UTC
Post by Nathaniel Smith
On Wed, Mar 26, 2014 at 7:34 PM, Julian Taylor
Post by Julian Taylor
as for using openblas by default in binary builds, no.
pthread openblas build is now fork safe which is great but it is still
not reliable enough for a default.
E.g. the current latest release 0.2.8 still has one crash bug on
dgemv[1], and wrong results zherk/zer2[2] and dgemv/cgemv[3].
git head has the former four fixed bug still has wrong results for cgemv.
The not so old 0.2.8 also fixed whole bunch more crashes and wrong
result issues (crashes on QR, uninitialized data use in dgemm, ...).
None of the fixes received unit tests, so I am somewhat pessimistic that
it will improve, especially as the maintainer is dissertating (is that
the right word?) and most of the code is assembler code only few people
can write (it is simply not required anymore, we have code generators
and intrinsics for that).
Openblas is great if you do not have the patience to build ATLAS and
only use a restricted set of functionality and platforms you can easily
test.
Currently it is in my opinion not suitable for a general purpose library
like numpy.
Those problems you list are pretty damning, but neither is it
reasonable to expect everyone to manually build ATLAS on every machine
they use (or their students use, or...) :-(. So what other options do
we have for general purpose builds? Give up and use MKL? How's
eigen-blas doing these days? (I guess from skimming their docs they
use OpenMP?)
I don't think general purpose builds need to have perfect performance.
We should provide something that works and allow users to tune it when
required. The slower general purpose build is also a great test case to
verify that the tuned build works for your problem.

I didn't notice this is a reply to third-party-provided win64 binaries
with OpenBLAS. I thought it was about official numpy binaries with
OpenBLAS again.
Third-party binaries using OpenBLAS are great, especially those that
warn that they are experimental.
It helps iron out the kinks of OpenBLAS. Thanks for providing them.
Olivier Grisel
2014-03-26 21:17:46 UTC
My understanding of Carl's effort is that the long term goal is to
have official windows whl packages for both numpy and scipy published
on PyPI with a builtin BLAS / LAPACK implementation so that users can
do `pip install scipy` under Windows and get something that just works
without having to install any compiler (Fortran or C) or any additional
library manually.

Most windows users are beginners and you cannot really expect them to
understand how to build the whole scipy stack from source.

The current solution (executable setup installers) is not optimal as
it requires Administrator rights to run, does not resolve dependencies
as pip does and cannot be installed in virtualenvs.

If we can build numpy / scipy whl packages for Windows with the ATLAS
DLLs embedded in the numpy package, then good. It does not need to be the
fastest BLAS / LAPACK lib in my opinion, just something that works.

The problem with ATLAS is that you need to select the number of threads
at build time AFAIK. But we could set it to a reasonable default (e.g.
4 threads) for the default Windows package.
--
Olivier
Julian Taylor
2014-03-26 21:31:08 UTC
Post by Olivier Grisel
The problem with ATLAS is that you need to select the number of thread
at build time AFAIK. But we could set it to a reasonable default (e.g.
4 threads) for the default windows package.
You have to set the maximum number of threads at build time with OpenBLAS
too. At runtime it then uses the number of online CPUs, limited to the
build-time maximum.
The maximum defaults to the core count of the machine it was built on. You
need to explicitly override that for generic binaries. (I think the Debian
binaries use 64, which is probably reasonable for non-MIC systems.)
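
As a worked example of that rule (a sketch of the arithmetic only, not OpenBLAS code; the function name is made up): the thread count used at runtime is the smaller of the build-time maximum and the number of online CPUs.

```python
import os


def effective_threads(build_max, online_cpus=None):
    """Thread count as described above: online CPUs, capped by the build-time maximum."""
    if online_cpus is None:
        online_cpus = os.cpu_count() or 1   # cpu_count() may return None
    return min(build_max, online_cpus)


# A generic binary built with a maximum of 64 on an 8-core box uses 8 threads;
# a binary built on a 4-core machine without an override stays at 4.
print(effective_threads(64, online_cpus=8))
print(effective_threads(4, online_cpus=8))
```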
Olivier Grisel
2014-03-26 21:35:47 UTC
Post by Julian Taylor
Post by Olivier Grisel
The problem with ATLAS is that you need to select the number of thread
at build time AFAIK. But we could set it to a reasonable default (e.g.
4 threads) for the default windows package.
You have to set the number of threads at build time with OpenBlas too.
At runtime it then selects the number of online cpus but limited to the
build time maximum.
It defaults to the maximum of the machine it was built on. You need to
explicitly override that for generic binaries. (I think debian binaries
uses 64 which is probably reasonable for non MIC systems)
Yes, the official Windows binary for OpenBLAS is also built with a
maximum number of threads of 64 (NUM_THREADS=64), which I find
reasonable since the runtime number of threads will be capped by the
actual number of cores.

For ATLAS I don't think there is a runtime cap, or is there?
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
j***@gmail.com
2014-03-27 13:55:52 UTC
On Wed, Mar 26, 2014 at 5:17 PM, Olivier Grisel
Post by Olivier Grisel
My understanding of Carl's effort is that the long term goal is to
have official windows whl packages for both numpy and scipy published
on PyPI with a builtin BLAS / LAPACK implementation so that users can
do `pip install scipy` under windows and get something that just works
without have to install any compiler (fortran or C) nor any additional
library manually.
Most windows users are beginners and you cannot really expect them to
understand how to build the whole scipy stack from source.
The current solution (executable setup installers) is not optimal as
it requires Administrator rights to run, does not resolve dependencies
as pip does and cannot be installed in virtualenvs.
A small related point:

The official installers can be used to install into a virtualenv.
The way I do it:
Run the superpack (official installer), wait until it extracts the
correct (SSE) install exe, then cancel.
Then easy_install the install exe file that has been extracted to the
temp folder into the virtualenv.

I don't remember if the extraction already requires admin rights, but
I think not.
easy_install doesn't require any, IIRC.

Josef
Olivier Grisel
2014-03-27 13:59:26 UTC
Permalink
Post by j***@gmail.com
On Wed, Mar 26, 2014 at 5:17 PM, Olivier Grisel
Post by Olivier Grisel
My understanding of Carl's effort is that the long term goal is to
have official windows whl packages for both numpy and scipy published
on PyPI with a builtin BLAS / LAPACK implementation so that users can
do `pip install scipy` under windows and get something that just works
without having to install any compiler (fortran or C) nor any additional
library manually.
Most windows users are beginners and you cannot really expect them to
understand how to build the whole scipy stack from source.
The current solution (executable setup installers) is not optimal as
it requires Administrator rights to run, does not resolve dependencies
as pip does and cannot be installed in virtualenvs.
The official installers can be used to install in virtualenv
Run the superpack, official installer, wait until it extracts the
correct (SSE) install exe, then cancel
Then easy_install the install exe file that has been extracted to the
temp folder into the virtualenv.
I don't remember if the extraction already requires admin rights, but
I think not.
easy_install doesn't require any, IIRC.
Hackish but interesting. Maybe the extraction can be done with generic
tools like winzip?
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
j***@gmail.com
2014-03-27 14:13:29 UTC
Permalink
On Thu, Mar 27, 2014 at 9:59 AM, Olivier Grisel
Post by Olivier Grisel
Post by j***@gmail.com
On Wed, Mar 26, 2014 at 5:17 PM, Olivier Grisel
Post by Olivier Grisel
My understanding of Carl's effort is that the long term goal is to
have official windows whl packages for both numpy and scipy published
on PyPI with a builtin BLAS / LAPACK implementation so that users can
do `pip install scipy` under windows and get something that just works
without having to install any compiler (fortran or C) nor any additional
library manually.
Most windows users are beginners and you cannot really expect them to
understand how to build the whole scipy stack from source.
The current solution (executable setup installers) is not optimal as
it requires Administrator rights to run, does not resolve dependencies
as pip does and cannot be installed in virtualenvs.
The official installers can be used to install in virtualenv
Run the superpack, official installer, wait until it extracts the
correct (SSE) install exe, then cancel
Then easy_install the install exe file that has been extracted to the
temp folder into the virtualenv.
I don't remember if the extraction already requires admin rights, but
I think not.
easy_install doesn't require any, IIRC.
Hackish but interesting. Maybe the extraction can be done with generic
tools like winzip?
I tried to open and unzip with WinRAR but couldn't make sense of the content.

BTW: easy_install for other installers like matplotlib also works
nicely for virtualenv

----

However, the official installers are only for 32-bit python, and I
appreciate all the efforts to "modernize" the numpy and scipy builds.

Josef
Post by Olivier Grisel
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Matthew Brett
2014-03-28 11:51:27 UTC
Permalink
Hi,
Post by Nathaniel Smith
On Wed, Mar 26, 2014 at 7:34 PM, Julian Taylor
Post by Julian Taylor
as for using openblas by default in binary builds, no.
pthread openblas build is now fork safe which is great but it is still
not reliable enough for a default.
E.g. the current latest release 0.2.8 still has one crash bug on
dgemv[1], and wrong results zherk/zer2[2] and dgemv/cgemv[3].
git head has the former four fixed but still has wrong results for cgemv.
The not so old 0.2.8 also fixed a whole bunch more crashes and wrong
result issues (crashes on QR, uninitialized data use in dgemm, ...).
None of the fixes received unit tests, so I am somewhat pessimistic that
it will improve, especially as the maintainer is dissertating (is that
the right word?) and most of the code is assembler code only few people
can write (it is simply not required anymore, we have code generators
and intrinsics for that).
Openblas is great if you do not have the patience to build ATLAS and
only use a restricted set of functionality and platforms you can easily
test.
Currently it is in my opinion not suitable for a general purpose library
like numpy.
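
The point about fixes landing without unit tests can be illustrated with a small regression check. This is only a sketch under stated assumptions: `reference_gemv` is a naive pure-Python stand-in for a trusted implementation, and `check_gemv` is a hypothetical helper that compares any candidate callable against it. Neither is part of OpenBLAS or numpy.

```python
def reference_gemv(alpha, a, x, beta, y):
    """Naive y := alpha * A x + beta * y on lists of floats."""
    return [alpha * sum(aij * xj for aij, xj in zip(row, x)) + beta * yi
            for row, yi in zip(a, y)]


def check_gemv(candidate, alpha, a, x, beta, y, tol=1e-12):
    """Return True if candidate agrees with the naive reference within tol."""
    expected = reference_gemv(alpha, a, x, beta, y)
    got = candidate(alpha, a, x, beta, y)
    if len(got) != len(expected):
        return False
    # Relative tolerance, with a floor of 1.0 so values near zero still compare sanely.
    return all(abs(g - e) <= tol * max(1.0, abs(e))
               for g, e in zip(got, expected))


a = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, 1.0]
y = [0.5, 0.5]
# The reference trivially agrees with itself; a real test would pass the
# library routine under test as the candidate.
print(check_gemv(reference_gemv, 2.0, a, x, 1.0, y))
```

A check of this shape, run over a range of sizes and strides, is the sort of test that would catch a wrong-result regression in a gemv kernel.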
Those problems you list are pretty damning, but neither is it
reasonable to expect everyone to manually build ATLAS on every machine
they use (or their students use, or...) :-(. So what other options do
we have for general purpose builds? Give up and use MKL? How's
eigen-blas doing these days? (I guess from skimming their docs they
use OpenMP?)
I see it should be possible to build a full blas and partial lapack
library with eigen [1] [2].

Does anyone know how their performance compares to MKL or the
reference implementations?

Carl - have you tried building eigen with your custom tool chain?

Cheers,

Matthew

[1] http://eigen.tuxfamily.org/index.php?title=3.0
[2] http://stackoverflow.com/questions/20441851/build-numpy-with-eigen-instead-of-atlas-or-openblas
Sturla Molden
2014-03-28 14:54:45 UTC
Permalink
Post by Matthew Brett
I see it should be possible to build a full blas and partial lapack
library with eigen [1] [2].
Eigen has a licensing issue as well, unfortunately, MPL2.

E.g. it requires recipients to be informed of the MPL requirements (cf.
impossible with pip install numpy).

Sturla
Alan G Isaac
2014-03-28 15:31:50 UTC
Permalink
Post by Sturla Molden
Eigen has a licensing issue as well, unfortunately, MPL2.
E.g. it requires recipients to be informed of the MPL requirements (cf.
impossible with pip install numpy).
Eigen chose MPL2 with the intent that Eigen be usable by
"all projects".
http://eigen.tuxfamily.org/index.php?title=Licensing_FAQ
If you are correct in your interpretation, it may be worth
raising the issue and requesting the needed accommodation.

Alan
Robert Kern
2014-03-28 15:48:58 UTC
Permalink
Post by Alan G Isaac
Post by Sturla Molden
Eigen has a licensing issue as well, unfortunately, MPL2.
E.g. it requires recipients to be informed of the MPL requirements (cf.
impossible with pip install numpy).
Eigen chose MPL2 with the intent that Eigen be usable by
"all projects".
http://eigen.tuxfamily.org/index.php?title=Licensing_FAQ
If you are correct in your interpretation, it may be worth
raising the issue and requesting the needed accommodation.
The authors of Eigen are familiar with our policy on this matter. See
the thread following this email:

http://mail.scipy.org/pipermail/numpy-discussion/2010-January/047958.html

The change from LGPL to MPL2 isn't relevant to our policy. Both have
more restrictions and conditions than the BSD license.
--
Robert Kern
Robert Kern
2014-03-28 15:58:15 UTC
Permalink
Post by Sturla Molden
Post by Matthew Brett
I see it should be possible to build a full blas and partial lapack
library with eigen [1] [2].
Eigen has a licensing issue as well, unfortunately, MPL2.
E.g. it requires recipients to be informed of the MPL requirements (cf.
impossible with pip install numpy).
That's not the relevant condition. That's easily taken care of by
including the MPL2 license text in the binary alongside numpy's BSD
license text. This is no different than numpy's BSD license itself,
which requires that the license text be included. It's not like people
can't distribute any MPL2 project on PyPI just because pip doesn't
print out the license before installing.

The extra-BSD conditions of the MPL2 are sections 3.1 and 3.2.
--
Robert Kern
Nathaniel Smith
2014-03-28 18:43:00 UTC
Permalink
Post by Robert Kern
Post by Sturla Molden
Post by Matthew Brett
I see it should be possible to build a full blas and partial lapack
library with eigen [1] [2].
Eigen has a licensing issue as well, unfortunately, MPL2.
E.g. it requires recipients to be informed of the MPL requirements (cf.
impossible with pip install numpy).
That's not the relevant condition. That's easily taken care of by
including the MPL2 license text in the binary alongside numpy's BSD
license text. This is no different than numpy's BSD license itself,
which requires that the license text be included. It's not like people
can't distribute any MPL2 project on PyPI just because pip doesn't
print out the license before installing.
The extra-BSD conditions of the MPL2 are sections 3.1 and 3.2.
Those requirements just say that in addition to including the MPL2
license text, we also have to include a notice saying where the source
code is available, i.e. the package would have to somewhere include a
link to eigen.org.

https://www.mozilla.org/MPL/2.0/FAQ.html#distribute-my-binaries

I'm not sure why this would be a problem.

-n
--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
Robert Kern
2014-03-28 19:26:02 UTC
Permalink
It's only a problem in that the binary will not be BSD, and we do need to
communicate that appropriately. It will contain a significant component
that is MPL2 licensed. The terms that force us to include the link to the
Eigen source that we used forces downstream redistributors of the binary to
do the same. Now, of all the copyleft licenses, this is certainly the most
friendly, but it is not BSD.
Post by Nathaniel Smith
Post by Robert Kern
Post by Sturla Molden
Post by Matthew Brett
I see it should be possible to build a full blas and partial lapack
library with eigen [1] [2].
Eigen has a licensing issue as well, unfortunately, MPL2.
E.g. it requires recipients to be informed of the MPL requirements (cf.
impossible with pip install numpy).
That's not the relevant condition. That's easily taken care of by
including the MPL2 license text in the binary alongside numpy's BSD
license text. This is no different than numpy's BSD license itself,
which requires that the license text be included. It's not like people
can't distribute any MPL2 project on PyPI just because pip doesn't
print out the license before installing.
The extra-BSD conditions of the MPL2 are sections 3.1 and 3.2.
Those requirements just say that in addition to including the MPL2
license text, we also have to include a notice saying where the source
code is available, i.e. the package would have to somewhere include a
link to eigen.org.
https://www.mozilla.org/MPL/2.0/FAQ.html#distribute-my-binaries
I'm not sure why this would be a problem.
-n
--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
Matthew Brett
2014-03-28 19:32:31 UTC
Permalink
Hi,
Post by Robert Kern
It's only a problem in that the binary will not be BSD, and we do need to
communicate that appropriately. It will contain a significant component that
is MPL2 licensed. The terms that force us to include the link to the Eigen
source that we used forces downstream redistributors of the binary to do the
same. Now, of all the copyleft licenses, this is certainly the most
friendly, but it is not BSD.
I think the binary would be BSD because of section 3.2:

"You may distribute such Executable Form under the terms of this
License, or sublicense it under different terms, provided that the
license for the Executable Form does not attempt to limit or alter the
recipients' rights in the Source Code Form under this License."

I think this is specifically saying - as long as our license (BSD)
does not try and limit access to Eigen source, we can distribute our
binary under our license.

Cheers,

Matthew
Robert Kern
2014-03-28 19:37:34 UTC
Permalink
The BSD license alters the recipient's rights. BSD binaries can be
redistributed without pointing to the sources.
Post by Matthew Brett
Hi,
Post by Robert Kern
It's only a problem in that the binary will not be BSD, and we do need to
communicate that appropriately. It will contain a significant component that
is MPL2 licensed. The terms that force us to include the link to the Eigen
source that we used forces downstream redistributors of the binary to do the
same. Now, of all the copyleft licenses, this is certainly the most
friendly, but it is not BSD.
"You may distribute such Executable Form under the terms of this
License, or sublicense it under different terms, provided that the
license for the Executable Form does not attempt to limit or alter the
recipients' rights in the Source Code Form under this License."
I think this is specifically saying - as long as our license (BSD)
does not try and limit access to Eigen source, we can distribute our
binary under our license.
Cheers,
Matthew
Nathaniel Smith
2014-03-28 19:34:16 UTC
Permalink
Post by Robert Kern
It's only a problem in that the binary will not be BSD, and we do need to
communicate that appropriately. It will contain a significant component
that is MPL2 licensed. The terms that force us to include the link to the
Eigen source that we used forces downstream redistributors of the binary to
do the same. Now, of all the copyleft licenses, this is certainly the most
friendly, but it is not BSD.

AFAICT, the only way redistributers could violate the MPL would be if they
unpacked our binary and deleted the license file. But this would also be a
violation of the BSD. The only difference in terms of requirements on
redistributors between MPL and BSD seems to be exactly *which* text you
include in your license file.

I don't know if Eigen is a good choice on technical grounds (or even a
possible one - has anyone ever actually compiled numpy against it?), but
this license thing just doesn't seem like an important issue to me, if the
alternative is not providing useful binaries.

-n
Robert Kern
2014-03-28 19:40:06 UTC
Permalink
No, the license does not contain a pointer to the Eigen sources, which is
required.

https://bitbucket.org/eigen/eigen/src/fabd880592ac3343713cc07e7287098afd0f18ca/COPYING.MPL2?at=default
Post by Robert Kern
It's only a problem in that the binary will not be BSD, and we do need
to communicate that appropriately. It will contain a significant component
that is MPL2 licensed. The terms that force us to include the link to the
Eigen source that we used forces downstream redistributors of the binary to
do the same. Now, of all the copyleft licenses, this is certainly the most
friendly, but it is not BSD.
AFAICT, the only way redistributers could violate the MPL would be if they
unpacked our binary and deleted the license file. But this would also be a
violation of the BSD. The only difference in terms of requirements on
redistributors between MPL and BSD seems to be exactly *which* text you
include in your license file.
I don't know if Eigen is a good choice on technical grounds (or even a
possible one - has anyone ever actually compiled numpy against it?), but
this license thing just doesn't seem like an important issue to me, if the
alternative is not providing useful binaries.
-n
Nathaniel Smith
2014-03-28 20:11:45 UTC
Permalink
Yes, because they're distributing source. But *our* license file could
contain the text of the BSD, the text of the MPL, and the text "Eigen
source is available at http://eigen.org."

If the only problem with eigen turns out to be that we have to add a line
of text to a file then I think we can probably manage this somehow.
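
A minimal sketch of what such a combined license file could look like when assembled at packaging time. The function name `build_license_text` and the separator are hypothetical, and the placeholder strings stand in for the real BSD and MPL2 texts. Whether this layout satisfies the MPL2 is exactly the question under discussion, so treat it as an illustration only.

```python
def build_license_text(bsd_text, mpl_text, source_url):
    """Concatenate both license texts plus a pointer to the Eigen source."""
    notice = "Eigen source is available at %s." % source_url
    sections = [bsd_text.strip(), mpl_text.strip(), notice]
    return "\n\n----\n\n".join(sections) + "\n"


combined = build_license_text(
    "(numpy BSD license text goes here)",
    "(MPL 2.0 license text goes here)",
    "http://eigen.org",
)
print(combined)
```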

-n
Post by Robert Kern
No, the license does not contain a pointer to the Eigen sources, which is
required.
https://bitbucket.org/eigen/eigen/src/fabd880592ac3343713cc07e7287098afd0f18ca/COPYING.MPL2?at=default
Post by Robert Kern
It's only a problem in that the binary will not be BSD, and we do need
to communicate that appropriately. It will contain a significant component
that is MPL2 licensed. The terms that force us to include the link to the
Eigen source that we used forces downstream redistributors of the binary to
do the same. Now, of all the copyleft licenses, this is certainly the most
friendly, but it is not BSD.
AFAICT, the only way redistributers could violate the MPL would be if
they unpacked our binary and deleted the license file. But this would also
be a violation of the BSD. The only difference in terms of requirements on
redistributors between MPL and BSD seems to be exactly *which* text you
include in your license file.
I don't know if Eigen is a good choice on technical grounds (or even a
possible one - has anyone ever actually compiled numpy against it?), but
this license thing just doesn't seem like an important issue to me, if the
alternative is not providing useful binaries.
-n
Sturla Molden
2014-03-28 20:28:09 UTC
Permalink
Post by Nathaniel Smith
If the only problem with eigen turns out to be that we have to add a line
of text to a file then I think we can probably manage this somehow.
We would also have to compile Eigen-BLAS for various architectures and CPU
counts. It is not "adaptive" like MKL or OpenBLAS.

Sturla
Matthew Brett
2014-03-28 21:16:07 UTC
Permalink
Hi,
Post by Sturla Molden
Post by Nathaniel Smith
If the only problem with eigen turns out to be that we have to add a line
of text to a file then I think we can probably manage this somehow.
We would also have to compile Eigen-BLAS for various architectures and CPU
counts. It is not "adaptive" like MKL or OpenBLAS.
Yes, I guess we currently have no idea how bad a default Eigen would be.

We also have the soft constraint that any choice we make should also
work for building scipy binaries - so adequate lapack coverage.

I believe that means lapack_lite is not an option?

So I guess the options are:

* eigen - could it be slow?
* openblas - could it be buggy?
* reference blas / lapack [1] [2] [3]

In [2] someone seems to be getting very good performance from the
reference implementation.

I guess we need to benchmark these guys on some standard systems, and
decide how bad the performance / stability has to be before it's
better not to provide binaries at all.
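
A rough sketch of the kind of timing harness such a comparison could start from. It uses only the standard library, and `naive_matmul` is a pure-Python stand-in so the example runs anywhere. In a real comparison one would instead time a matrix product in a numpy build linked against each candidate library. The names `naive_matmul` and `best_time` are hypothetical.

```python
import timeit


def naive_matmul(a, b):
    """Pure-Python matrix product, used here only as a stand-in workload."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]


def best_time(fn, repeats=3, number=1):
    """Best wall-clock time over several repeats, in seconds."""
    return min(timeit.repeat(fn, repeat=repeats, number=number))


n = 30
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i - j) for j in range(n)] for i in range(n)]
print("best of 3: %.6f s" % best_time(lambda: naive_matmul(a, b)))
```

Taking the minimum over repeats reduces noise from other processes, which matters when comparing libraries on a shared test machine.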

Cheers,

Matthew

[1] http://icl.cs.utk.edu/lapack-for-windows/lapack/
[2] http://ylzhao.blogspot.com/2013/10/blas-lapack-precompiled-binaries-for.html
[3] http://www.fi.muni.cz/~xsvobod2/misc/lapack/
Nathaniel Smith
2014-03-28 21:18:26 UTC
Permalink
I thought OpenBLAS is usually used with reference lapack?
Post by Matthew Brett
Hi,
Post by Sturla Molden
Post by Nathaniel Smith
If the only problem with eigen turns out to be that we have to add a line
of text to a file then I think we can probably manage this somehow.
We would also have to compile Eigen-BLAS for various architectures and CPU
counts. It is not "adaptive" like MKL or OpenBLAS.
Yes, I guess we currently have no idea how bad a default Eigen would be.
We also have the soft constraint that any choice we make should also
work for building scipy binaries - so adequate lapack coverage.
I believe that means lapack_lite is not an option?
* eigen - could it be slow?
* openblas - could it be buggy?
* reference blas / lapack [1] [2] [3]
In [2] someone seems to be getting very good performance from the
reference implementation.
I guess we need to benchmark these guys on some standard systems, and
decide how bad the performance / stability has to be before it's
better not to provide binaries at all.
Cheers,
Matthew
[1] http://icl.cs.utk.edu/lapack-for-windows/lapack/
[2] http://ylzhao.blogspot.com/2013/10/blas-lapack-precompiled-binaries-for.html
[3] http://www.fi.muni.cz/~xsvobod2/misc/lapack/
Olivier Grisel
2014-03-28 21:38:34 UTC
Permalink
Post by Nathaniel Smith
I thought OpenBLAS is usually used with reference lapack?
I am no longer sure myself. Debian & thus Ubuntu seem to be only
packaging the BLAS part of OpenBLAS for the libblas.so symlink and
uses the reference implementation of lapack for the liblapack.so
symlink.

I observed a sparse leastsqr bug when linking scipy against the full
OpenBLAS under OSX that I could not reproduce with the openblas
+ lapack combo shipped with Ubuntu so there might be a difference. But that
could also be caused by a version / platform discrepancy between my
two setups.
--
Olivier
Julian Taylor
2014-03-28 21:55:44 UTC
Permalink
Post by Olivier Grisel
Post by Nathaniel Smith
I thought OpenBLAS is usually used with reference lapack?
I am no longer sure myself. Debian & thus Ubuntu seem to be only
packaging the BLAS part of OpenBLAS for the libblas.so symlink and
uses the reference implementation of lapack for the liblapack.so
symlink.
You can link the reference lapack with any library providing a BLAS
compatible API/ABI. ATLAS and OpenBlas are ABI compatible with reference
BLAS which allows replacing the library via LD_PRELOAD or debian
alternatives without recompiling.
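
A small sketch of the LD_PRELOAD route mentioned above, assuming a Linux dynamic loader. The helper `env_with_preload` and the library path are hypothetical; the real path depends on the distribution and package.

```python
import os


def env_with_preload(library_path, base_env=None):
    """Return a copy of the environment with library_path preloaded first."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD")
    # LD_PRELOAD takes a list separated by colons or spaces; keep earlier entries.
    env["LD_PRELOAD"] = library_path if not existing else library_path + ":" + existing
    return env


# Hypothetical location of an ABI-compatible BLAS shared library.
env = env_with_preload("/path/to/libopenblas.so", base_env={})
print(env["LD_PRELOAD"])
# The environment could then be handed to a child process, e.g.
# subprocess.call([sys.executable, "script.py"], env=env)
```

Because the replacement library exports the same BLAS symbols, the child process picks it up without relinking, which is what makes swapping implementations for a quick comparison practical.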

Both ATLAS and OpenBlas provide a subset of optimized lapack functions,
but they are optional. On Debian/Ubuntu you can install ATLAS lapack but
then you are also forced to use ATLAS blas.

I am not familiar with how relevant the optimized parts of lapack are for
the general use case.
Post by Olivier Grisel
I observed a sparse leastsqr bug when linking scipy against the full
OpenBLAS under OSX that I could not reproduce with the openblas
+ lapack combo shipped with Ubuntu so there might be a difference. But that
could also be caused by a version / platform discrepancy between my
two setups.
what kind of a bug? wrong result or crash? Which target did openblas use
when compiling on macos? or was it a dynamic build? (see the name of the
built static library)
The adaptive nature of OpenBLAS can make reproducing issues tricky, I
don't think there is a way to force it to use a certain kernel at
runtime besides recompiling for a different target.

If you have a testcase I'd be interested in it, as I'm trying to get
openblas into a decent shape for ubuntu 14.04.
Olivier Grisel
2014-03-28 22:05:42 UTC
Permalink
Post by Julian Taylor
Post by Olivier Grisel
Post by Nathaniel Smith
I thought OpenBLAS is usually used with reference lapack?
I am no longer sure myself. Debian & thus Ubuntu seem to be only
packaging the BLAS part of OpenBLAS for the libblas.so symlink and
uses the reference implementation of lapack for the liblapack.so
symlink.
You can link the reference lapack with any library providing a BLAS
compatible API/ABI. ATLAS and OpenBlas are ABI compatible with reference
BLAS which allows replacing the library via LD_PRELOAD or debian
alternatives without recompiling.
Both ATLAS and OpenBlas provide a subset of optimized lapack functions,
but they are optional. On Debian/Ubuntu you can install ATLAS lapack but
then you are also forced to use ATLAS blas.
I am not familiar with how relevant the optimized parts of lapack are for
the general use case.
Post by Olivier Grisel
I observed a sparse leastsqr bug when linking scipy against the full
OpenBLAS under OSX that I could not reproduce with the openblas
+ lapack combo shipped with Ubuntu so there might be a difference. But that
could also be caused by a version / platform discrepancy between my
two setups.
what kind of a bug? wrong result or crash?
Here it is: https://github.com/scikit-learn/scikit-learn/issues/2986

I have not found the time to investigate yet.
Post by Julian Taylor
Which target did openblas use
when compiling on macos? or was it a dynamic build? (see the name of the
built static library)
I think I used target=NEHALEM that time (but not 100% sure).
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Sturla Molden
2014-03-28 22:32:23 UTC
Permalink
Post by Nathaniel Smith
I thought OpenBLAS is usually used with reference lapack?
It is.
Robert Kern
2014-03-28 19:57:30 UTC
Permalink
Of course, that's beside the point. Yes, pretty much everyone that likes
the BSD license of numpy will be okay with the minimal burdens the MPL2
lays on them. The problem is that we need to properly communicate that
license. The PyPI page is not adequate to that task, in my opinion. I have
no problem with the project distributing such binaries anywhere else. But
then, I have no problem with the project distributing MKL binaries
elsewhere either.
Post by Robert Kern
It's only a problem in that the binary will not be BSD, and we do need
to communicate that appropriately. It will contain a significant component
that is MPL2 licensed. The terms that force us to include the link to the
Eigen source that we used forces downstream redistributors of the binary to
do the same. Now, of all the copyleft licenses, this is certainly the most
friendly, but it is not BSD.
AFAICT, the only way redistributers could violate the MPL would be if they
unpacked our binary and deleted the license file. But this would also be a
violation of the BSD. The only difference in terms of requirements on
redistributors between MPL and BSD seems to be exactly *which* text you
include in your license file.
I don't know if Eigen is a good choice on technical grounds (or even a
possible one - has anyone ever actually compiled numpy against it?), but
this license thing just doesn't seem like an important issue to me, if the
alternative is not providing useful binaries.
-n
Matthew Brett
2014-03-28 20:59:55 UTC
Permalink
Hi,
Post by Robert Kern
Of course, that's beside the point. Yes, pretty much everyone that likes
the BSD license of numpy will be okay with the minimal burdens the MPL2 lays
on them. The problem is that we need to properly communicate that license.
The PyPI page is not adequate to that task, in my opinion. I have no problem
with the project distributing such binaries anywhere else. But then, I have
no problem with the project distributing MKL binaries elsewhere either.
Post by Nathaniel Smith
Post by Robert Kern
It's only a problem in that the binary will not be BSD, and we do need
to communicate that appropriately. It will contain a significant component
that is MPL2 licensed. The terms that force us to include the link to the
Eigen source that we used forces downstream redistributors of the binary to
do the same. Now, of all the copyleft licenses, this is certainly the most
friendly, but it is not BSD.
AFAICT, the only way redistributers could violate the MPL would be if they
unpacked our binary and deleted the license file. But this would also be a
violation of the BSD. The only difference in terms of requirements on
redistributors between MPL and BSD seems to be exactly *which* text you
include in your license file.
I don't think even that would violate the MPL. The MPL says only that
we - the distributors of binary code from an MPL project - must do
this (3.1) "... inform recipients of the Executable Form how they can
obtain a copy of such Source Code Form". It doesn't say we have to
require the recipient to do the same [1], and it doesn't say that has
to be in our license. I don't think it can mean that, because
otherwise it would not make sense to say (in section 3.2) that we can
"sublicense it under different terms, provided that the license for
the Executable Form does not attempt to limit or alter the recipients'
rights in the Source Code Form under this License.". The unmodified
standard BSD license does not alter the recipients rights to the
source code form of Eigen.
Post by Robert Kern
Post by Nathaniel Smith
I don't know if Eigen is a good choice on technical grounds (or even a
possible one - has anyone ever actually compiled numpy against it?), but
this license thing just doesn't seem like an important issue to me, if the
alternative is not providing useful binaries.
Am I correct in thinking we are all agreeing that it would be OK to
distribute binary wheels for numpy from pypi, with compiled Eigen?

See you,

Matthew

[1] "It's important to understand that the condition to distribute
files under the MPL's terms only applies to the party that first
creates and distributes the Larger Work."
https://www.gnu.org/licenses/license-list.html
Matthew Brett
2014-03-28 18:49:12 UTC
Permalink
Hi,
Post by Robert Kern
Post by Sturla Molden
Post by Matthew Brett
I see it should be possible to build a full blas and partial lapack
library with eigen [1] [2].
Eigen has a licensing issue as well, unfortunately, MPL2.
E.g. it requires recipients to be informed of the MPL requirements (cf.
impossible with pip install numpy).
That's not the relevant condition. That's easily taken care of by
including the MPL2 license text in the binary alongside numpy's BSD
license text. This is no different than numpy's BSD license itself,
which requires that the license text be included. It's not like people
can't distribute any MPL2 project on PyPI just because pip doesn't
print out the license before installing.
The extra-BSD conditions of the MPL2 are sections 3.1 and 3.2.
Thanks for thinking this through.

If I read you right, your opinion is that there would be no problem
including Eigen binaries with Numpy.

License here: http://www.mozilla.org/MPL/2.0/

Section 3.1 - if we distribute Eigen source, it has to be under the
MPL; so that doesn't apply to binaries.

Section 3.2 a) says that if we distribute binaries, we have to point
to the original source (e.g. link on pypi page)
Section 3.2 b) says we can distribute the executable code under any
license as long as it doesn't restrict the user's access to Eigen
source. I think that means there is no problem distributing the
binaries under the BSD license. Nathaniel's link is the relevant one
: http://www.mozilla.org/MPL/2.0/FAQ.html#distribute-my-binaries

So - is Eigen our best option for optimized blas / lapack binaries on
64 bit Windows?

Cheers,

Matthew
Sturla Molden
2014-03-28 19:01:03 UTC
Permalink
Post by Matthew Brett
So - is Eigen our best option for optimized blas / lapack binaries on
64 bit Windows?
Maybe not:

http://gcdart.blogspot.de/2013/06/fast-matrix-multiply-and-ml.html

With AVX the difference is possibly even larger.


Sturla
Nathaniel Smith
2014-03-28 19:23:34 UTC
Permalink
Post by Sturla Molden
Post by Matthew Brett
So - is Eigen our best option for optimized blas / lapack binaries on
64 bit Windows?
http://gcdart.blogspot.de/2013/06/fast-matrix-multiply-and-ml.html
With AVX the difference is possibly even larger.
But if we rule out closed-source BLAS, and we rule out OpenBLAS
because of our distrusting its accuracy, and we aren't going to
recompile ATLAS on every machine, then Eigen is the only library they
tested that is even an option for us.

It would be nice to see some comparison between our actual options --
Eigen, generically compiled ATLAS, anything else?
--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
Sturla Molden
2014-03-28 18:56:09 UTC
Permalink
Post by Matthew Brett
Does anyone know how their performance compares to MKL or the
reference implementations?
http://eigen.tuxfamily.org/index.php?title=Benchmark

http://gcdart.blogspot.de/2013/06/fast-matrix-multiply-and-ml.html


Sturla
Matthew Brett
2014-03-28 19:28:04 UTC
Permalink
Hi,
Post by Sturla Molden
Post by Matthew Brett
Does anyone know how their performance compares to MKL or the
reference implementations?
http://eigen.tuxfamily.org/index.php?title=Benchmark
I don't know how relevant these are to our case. If I understand
correctly, the usual use of Eigen, as in these benchmarks, is to use
the Eigen headers to get fast code via C++ templating.

Because they know some of us need this, Eigen can also build a more
standard blas / lapack library to link against, but I presume this
will stop Eigen templating doing lots of clever tricks with the
operations, and therefore slow it down. Happy to be corrected though.
Post by Sturla Molden
http://gcdart.blogspot.de/2013/06/fast-matrix-multiply-and-ml.html
I think this page does not use the Eigen blas libraries either [1]

Also - this is on a massive linux machine ("48 core and 66GB RAM").
He's done a great job of showing what he did though.

The problem for us is:

We can't use MKL, ACML [2]
atlas is very difficult to compile on 64 bit windows, and has some
technical limitations on 64 bit [3]

So I think we're down to openblas and eigen for 64-bit windows. Does
anyone disagree?

Cheers,

Matthew

[1] : https://github.com/gcdart/dense-matrix-mult/blob/master/EIGEN/compile_eigen.sh
[2] : http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/ACML_June_24_2010_v2.pdf
[3] : http://math-atlas.sourceforge.net/atlas_install/node57.html
Andrea Gavana
2014-03-28 19:43:10 UTC
Permalink
Post by Sturla Molden
Post by Matthew Brett
Does anyone know how their performance compares to MKL or the
reference implementations?
http://eigen.tuxfamily.org/index.php?title=Benchmark
Very, very funny and twisted approach to legend ordering in a plot.
Maybe someone more knowledgeable can explain the ordering of the
labels in the plot legends, as after a while they don't make any sense,
neither lexicographically nor performance-wise.


Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://www.infinity77.net

# ------------------------------------------------------------- #
def ask_mailing_list_support(email):

    if mention_platform_and_version() and include_sample_app():
        send_message(email)
    else:
        install_malware()
        erase_hard_drives()
# ------------------------------------------------------------- #
Matthew Brett
2014-04-01 00:59:52 UTC
Permalink
Hi,

On Wed, Mar 26, 2014 at 11:34 AM, Julian Taylor
Post by Julian Taylor
Post by Olivier Grisel
Hi Carl,
I installed Python 2.7.6 64 bits on a windows server instance from
rackspace cloud and then ran get-pip.py and then could successfully
install the numpy and scipy wheel packages from your google drive
folder. I tested dot products and scipy.linalg.svd and they work as
expected.
Would it make sense to embed the blas and lapack header files as part
of this numpy wheel and make numpy.distutils.system_info return the
lib and include folder pointing to the embedded libopenblas.dll and
header files so as to make third party libraries directly buildable
against those?
as for using openblas by default in binary builds, no.
pthread openblas build is now fork safe which is great but it is still
not reliable enough for a default.
E.g. the current latest release 0.2.8 still has one crash bug on
dgemv[1], and wrong results zherk/zer2[2] and dgemv/cgemv[3].
git head has the former four fixed but still has wrong results for cgemv.
I noticed that Carl was only getting three test failures on scipy - are
these related?

======================================================================
FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4))
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\devel\py27\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "D:\devel\py27\lib\site-packages\scipy\linalg\tests\test_decomp.py", line 642, in eigenhproblem_general
    assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype])
  File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 811, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 644, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not almost equal to 4 decimals

(mismatch 100.0%)
 x: array([ 0., 0., 0.], dtype=float32)
 y: array([ 1., 1., 1.])

======================================================================
FAIL: Tests for the minimize wrapper.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\devel\py27\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "D:\devel\py27\lib\site-packages\scipy\optimize\tests\test_optimize.py", line 435, in test_minimize
    self.test_powell(True)
  File "D:\devel\py27\lib\site-packages\scipy\optimize\tests\test_optimize.py", line 209, in test_powell
    atol=1e-14, rtol=1e-7)
  File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 1181, in assert_allclose
    verbose=verbose, header=header)
  File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 644, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-07, atol=1e-14

(mismatch 100.0%)
 x: array([[ 0.75077639, -0.44156936, 0.47100962],
       [ 0.75077639, -0.44156936, 0.48052496],
       [ 1.50155279, -0.88313872, 0.95153458],...
 y: array([[ 0.72949016, -0.44156936, 0.47100962],
       [ 0.72949016, -0.44156936, 0.48052496],
       [ 1.45898031, -0.88313872, 0.95153458],...

======================================================================
FAIL: Powell (direction set) optimization routine
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\devel\py27\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "D:\devel\py27\lib\site-packages\scipy\optimize\tests\test_optimize.py", line 209, in test_powell
    atol=1e-14, rtol=1e-7)
  File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 1181, in assert_allclose
    verbose=verbose, header=header)
  File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 644, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-07, atol=1e-14

(mismatch 100.0%)
 x: array([[ 0.75077639, -0.44156936, 0.47100962],
       [ 0.75077639, -0.44156936, 0.48052496],
       [ 1.50155279, -0.88313872, 0.95153458],...
 y: array([[ 0.72949016, -0.44156936, 0.47100962],
       [ 0.72949016, -0.44156936, 0.48052496],
       [ 1.45898031, -0.88313872, 0.95153458],...

----------------------------------------------------------------------
Ran 8940 tests in 143.892s
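
For readers wondering how far off the two Powell failures are: assert_allclose is documented to accept a value when abs(actual - desired) <= atol + rtol * abs(desired). The sketch below applies that criterion by hand to the first mismatching element from the output above (x is the computed value, y the expected one). The function name within_tolerance is made up for this illustration.

```python
def within_tolerance(actual, desired, rtol=1e-7, atol=1e-14):
    """Element-wise criterion documented for numpy.testing.assert_allclose."""
    return abs(actual - desired) <= atol + rtol * abs(desired)


# First element of x (computed) and y (expected) in the failing test output.
actual = 0.75077639
desired = 0.72949016

difference = abs(actual - desired)
allowed = 1e-14 + 1e-7 * abs(desired)

if __name__ == "__main__":
    print("difference:", difference)
    print("allowed:   ", allowed)
    print("passes:    ", within_tolerance(actual, desired))
```

The difference is about 0.021 against an allowed error of about 7.3e-8, so this is not a borderline rounding issue. The second and third columns of the arrays agree to all printed digits.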
Post by Julian Taylor
Openblas is great if you do not have the patience to build ATLAS and
only use a restricted set of functionality and platforms you can easily
test.
I don't think it's possible to build ATLAS on Windows 64-bit at the
moment, and it would take a lot of work to make it build, and Clint W
has said he does not want to invest much time maintaining the Windows
build, so unless something changes, I think ATLAS is not a viable
option - for 64 bits at least.

Cheers,

Matthew
Julian Taylor
2014-04-03 12:56:39 UTC
Permalink
FYI, binaries linking openblas should add this patch in some way:
https://github.com/numpy/numpy/pull/4580

Cliffs: linking OpenBLAS prevents parallelization via threading or
multiprocessing.

just wasted a bunch of time figuring that out ... (though it's well
documented in numerous stackoverflow questions, too bad none of them
reached us)
Olivier Grisel
2014-04-03 13:59:54 UTC
Permalink
Post by Julian Taylor
https://github.com/numpy/numpy/pull/4580
Cliffs: linking OpenBLAS prevents parallelization via threading or
multiprocessing.
just wasted a bunch of time figuring that out ... (though it's well
documented in numerous stackoverflow questions, too bad none of them
reached us)
You mean because of the default CPU affinity stuff in the default
OpenBLAS? If we ship OpenBLAS with a windows binary of numpy / scipy
we can compile OpenBLAS with the NO_AFFINITY=1 flag to avoid the
issue.
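
A small sketch for checking from Python whether the process has ended up pinned to a subset of CPUs, which is the symptom being described. It only uses os.sched_getaffinity and os.sched_setaffinity, which exist on Linux (Python 3.3+) and not on Windows or OS X, so both helpers return None elsewhere. It is not a description of what the numpy pull request does; the helper names current_affinity and widen_affinity are invented here.

```python
import os


def current_affinity():
    """Return the set of CPUs this process may run on, or None if unsupported."""
    if hasattr(os, "sched_getaffinity"):
        return os.sched_getaffinity(0)
    return None


def widen_affinity(cpus):
    """Try to allow this process to run on `cpus`; return the resulting set."""
    if not hasattr(os, "sched_setaffinity"):
        return None
    try:
        os.sched_setaffinity(0, cpus)
    except OSError:
        pass  # e.g. a container or scheduler policy forbids some of the CPUs
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    before = current_affinity()
    print("allowed CPUs:", before)
    # A mask smaller than the machine's CPU count suggests something pinned us.
    if before is not None and len(before) < (os.cpu_count() or 1):
        print("after widening:", widen_affinity(range(os.cpu_count())))
```

Comparing the output before and after importing numpy would show whether the linked BLAS changed the mask. Building OpenBLAS with NO_AFFINITY=1, as suggested above, avoids the problem at the source rather than patching it up afterwards.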
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Matthew Brett
2014-04-06 07:18:04 UTC
Permalink
Hi,

On Wed, Mar 26, 2014 at 11:34 AM, Julian Taylor
Post by Julian Taylor
Post by Olivier Grisel
Hi Carl,
I installed Python 2.7.6 64 bits on a windows server instance from
rackspace cloud and then ran get-pip.py and then could successfully
install the numpy and scipy wheel packages from your google drive
folder. I tested dot products and scipy.linalg.svd and they work as
expected.
Would it make sense to embed the blas and lapack header files as part
of this numpy wheel and make numpy.distutils.system_info return the
lib and include folder pointing to the embedded libopenblas.dll and
header files so as to make third party libraries directly buildable
against those?
as for using openblas by default in binary builds, no.
pthread openblas build is now fork safe which is great but it is still
not reliable enough for a default.
E.g. the current latest release 0.2.8 still has one crash bug on
dgemv[1], and wrong results zherk/zer2[2] and dgemv/cgemv[3].
git head has the former four fixed but still has wrong results for cgemv.
The not so old 0.2.8 also fixed a whole bunch more crashes and wrong
result issues (crashes on QR, uninitialized data use in dgemm, ...).
None of the fixes received unit tests, so I am somewhat pessimistic that
it will improve, especially as the maintainer is dissertating (is that
the right word?) and most of the code is assembler code only a few people
can write (it is simply not required anymore, we have code generators
and intrinsics for that).
Openblas is great if you do not have the patience to build ATLAS and
only use a restricted set of functionality and platforms you can easily
test.
Currently it is in my opinion not suitable for a general purpose library
like numpy.
I don't have any objections to adding get_info("openblas") if that does
not work yet. Patches welcome.
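
For reference, a minimal sketch of the lookup being talked about, assuming numpy.distutils can be imported at all (it is deprecated and missing on recent Python versions, hence the guard). Whether the "openblas" key is recognised depends on the numpy version, which is exactly the open question here. find_blas is a hypothetical helper name.

```python
def find_blas(name="openblas"):
    """Return numpy.distutils' build info dict for `name`, or None if unavailable.

    An empty dict means numpy.distutils ran but did not locate the library.
    """
    try:
        from numpy.distutils.system_info import get_info
    except ImportError:
        return None  # numpy or numpy.distutils is not installed
    return get_info(name)


if __name__ == "__main__":
    info = find_blas()
    if info is None:
        print("numpy.distutils is not available")
    elif not info:
        print("openblas not found by numpy.distutils")
    else:
        # Typical keys include 'libraries' and 'library_dirs'.
        for key, value in sorted(info.items()):
            print(key, "=", value)
```

A third-party package could feed the returned dictionary into its Extension arguments to build against the same library that numpy found.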
Julian - do you have any opinion on using gotoBLAS instead of OpenBLAS
for the Windows binaries?

Cheers,

Matthew
Sturla Molden
2014-04-06 18:47:21 UTC
Permalink
Post by Matthew Brett
Julian - do you have any opinion on using gotoBLAS instead of OpenBLAS
for the Windows binaries?
That is basically OpenBLAS too, except with more bugs and no AVX support.

Sturla
Matthew Brett
2014-04-06 18:59:35 UTC
Permalink
Hi,
Post by Sturla Molden
Post by Matthew Brett
Julian - do you have any opinion on using gotoBLAS instead of OpenBLAS
for the Windows binaries?
That is basically OpenBLAS too, except with more bugs and no AVX support.
I know that OpenBLAS is a fork of gotoBLAS2, but you said in another
thread that gotoBLAS2 was 'rock-solid'. If the bugs in gotoBLAS2 do
not in practice arise for default Windows builds, then it could be a
good option until OpenBLAS is more mature.

Put another way - does anyone know what bugs in gotoBLAS2 do arise for
Windows / Intel builds?

Cheers,

Matthew
Carl Kleffner
2014-04-06 19:47:43 UTC
Permalink
MKL BLAS LAPACK has issues as well:
http://software.intel.com/en-us/articles/intel-mkl-110-bug-fixes .
In case of OpenBLAS or GOTOBLAS what precisely is the problem you identify
as a showstopper?

Regards

Carl
Sturla Molden
2014-04-06 23:28:09 UTC
Permalink
Post by Carl Kleffner
MKL BLAS LAPACK has issues as well:
http://software.intel.com/en-us/articles/intel-mkl-110-bug-fixes .
In case of OpenBLAS or GOTOBLAS what precisely is the problem you identify
as a showstopper?
For example:

https://github.com/xianyi/OpenBLAS/issues/340

However, the main problem is the quality of the projects:

- GotoBLAS2 is abandonware. After Goto went to Intel, all development has
ceased. Any bugs will not be fixed.

- GotoBLAS2 uses OpenMP on Posix. GOMP is not fork-safe (though not an
issue on Windows).

- OpenBLAS looks a bit like a one-man student project. Right now Zhang
Xianyi is writing his dissertation, so development has stopped.

- The SIMD code is written in inline assembly instead of using compiler
intrinsics for SIMD ops. This makes it hard to contribute as it is
insufficient just to know C.

- AT&T syntax in the inline assembly prevents us from building with MSVC.


Sturla

Sturla Molden
2014-04-06 23:07:42 UTC
Permalink
Post by Matthew Brett
Put another way - does anyone know what bugs in gotoBLAS2 do arise for
Windows / Intel builds?
http://www.openblas.net/Changelog.txt

There are some bug fixes for x86_64 here.

GotoBLAS (and GotoBLAS2) were the de facto BLAS on many HPC systems, and
are well proven. But there is no software without bugs.

I don't think there is a reason to prefer GotoBLAS2 to OpenBLAS. But you
could of course just cherry-pick all bugfixes for x86 and amd64 and leave
out the rest of the changes. Differences between OpenBLAS and GotoBLAS2 for
MIPS do not really matter for Windows...

Sturla
Olivier Grisel
2014-03-27 11:44:53 UTC
Permalink
Post by Olivier Grisel
Hi Carl,
I installed Python 2.7.6 64 bits on a windows server instance from
rackspace cloud and then ran get-pip.py and then could successfully
install the numpy and scipy wheel packages from your google drive
folder. I tested dot products and scipy.linalg.svd and they work as
expected.
Then I uncompressed your mingw toolchain in c:\mingw, put c:\mingw\bin
in my PATH and tried to build the scikit-learn git master with it,
building 'sklearn.__check_build._check_build' extension
compiling C sources
C compiler: gcc -DMS_WIN64 -O2 -msse -msse2 -Wall -Wstrict-prototypes
compile options: '-D__MSVCRT_VERSION__=0x0900
-Ic:\Python27\lib\site-packages\numpy\core\include
-Ic:\Python27\lib\site-packages\numpy\core\include
-Ic:\Python27\include -Ic:\Python27\PC -c'
gcc -DMS_WIN64 -O2 -msse -msse2 -Wall -Wstrict-prototypes
-D__MSVCRT_VERSION__=0x0900
-Ic:\Python27\lib\site-packages\numpy\core\include
-Ic:\Python27\lib\site-packages\numpy\core\include
-Ic:\Python27\include -Ic:\Python27\PC -c
sklearn\__check_build\_check_build.c -o
build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o
Found executable c:\mingw\bin\gcc.exe
gcc -shared -Wl,-gc-sections -Wl,-s
build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o
-Lc:\Python27\libs -Lc:\Python27\PCbuild\amd64
-Lbuild\temp.win-amd64-2.7 -lpython27 -lmsvcr90 -o
build\lib.win-amd64-2.7\sklearn\__check_build\_check_build.pyd
undefined reference to `__imp__Py_NoneStruct'
undefined reference to `__imp__PyThreadState_Current'
undefined reference to `__imp_PyExc_ImportError'
build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o:
bad reloc address 0x0 in section `.data'
collect2.exe: error: ld returned 1 exit status
error: Command "gcc -shared -Wl,-gc-sections -Wl,-s
build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o
-Lc:\Python27\libs -Lc:\Python27\PCbuild\amd64
-Lbuild\temp.win-amd64-2.7 -lpython27 -lmsvcr90 -o
build\lib.win-amd64-2.7\sklearn\__check_build\_check_build.pyd" failed
with exit status 1
Ignore that, I had forgotten to copy the libpython27.a file to
c:\Python27\libs on that instance.

Building scikit-learn works with the static toolchain. I have failing
tests but those are probably not related to the toolchain.
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel