[Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?)
Nathaniel Smith
2015-10-06 18:30:53 UTC
[splitting this off into a new thread]

On Tue, Oct 6, 2015 at 3:00 AM, David Cournapeau <***@gmail.com> wrote:
[...]
I also agree the current situation is not sustainable -- as we discussed
privately before, cythonizing numpy.core is made considerably more
complicated by this. I have myself run into quite a few issues
cythonizing the other parts of umath. I would also like to support
static linking better than we do now (do we know some static-link users
we can contact to validate our approach?)
numpy.core.multiarray -> compilation units in numpy/core/src/multiarray/ +
statically link npymath
numpy.core.umath -> compilation units in numpy/core/src/umath + statically
link npymath/npysort + some shenanigans to use things in
numpy.core.multiarray
There are also shenanigans in the other direction - supposedly umath
is layered "above" multiarray, but in practice there are circular
dependencies (see e.g. np.set_numeric_ops).
I would suggest a more layered approach, to enable both 'normal'
build and static build, without polluting the public namespace too much.
This is an approach followed by most large libraries (e.g. MKL), and is
fairly flexible.
Concretely, we could start by putting more common functionalities (aka the
'core' library) into its own static library. The API would be considered
private to numpy (no stability guaranteed outside numpy), and every exported
symbol from that library would be decorated appropriately to avoid potential
clashes (e.g. '_npy_internal_').
I don't see why we need this multi-layered complexity, though.

npymath is a well-defined utility library that other people use, so
sure, it makes sense to keep that somewhat separate as a static
library (as discussed in the other thread).

Beyond that -- NumPy is really not a large library. multiarray is <50k
lines of code, and umath is only ~6k (!). And there's no particular
reason to keep them split up from the user point of view -- all their
functionality gets combined into the flat numpy namespace anyway. So
we *could* rewrite them as three libraries, with a "common core" that
then gets exported via two different wrapper libraries -- but it's
much simpler just to do

mv umath/* multiarray/
rmdir umath

and then make multiarray work the way we want. (After fixing up the
build system of course :-).)

-n
--
Nathaniel J. Smith -- http://vorpus.org
David Cournapeau
2015-10-06 18:52:11 UTC
Post by Nathaniel Smith
[...]
There are also shenanigans in the other direction - supposedly umath
is layered "above" multiarray, but in practice there are circular
dependencies (see e.g. np.set_numeric_ops).
Indeed, I am not arguing against merging umath and multiarray.
Post by Nathaniel Smith
[...]
I don't see why we need this multi-layered complexity, though.
For several reasons:

- when you want to cythonize either extension, it is much easier to
separate it: Cython for the CPython API, C for the rest.
- if numpy.core.multiarray.so is built as cython-based .o + a 'large' C
static library, it should become much simpler to support static linking.
- maybe that's just personal, but I find the whole multiarray + umath
codebase barely manageable in terms of intertwined complexity. You may
argue it is not that big, and we all have different preferences in terms
of organization, but if I look at the binary size of multiarray + umath,
it is considerably larger than the median size of the .so files I have
in my /usr/lib.

I am also hoping that splitting up numpy.core into separate elements that
communicate through internal APIs would make contributing to numpy
easier.

We could also swap the argument: assuming it does not make the build more
complex, and that it does help static linking, why not do it?
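[Editor's note: the '_npy_internal_' decoration mentioned above could look roughly like the sketch below. The macro and function names are hypothetical, not actual NumPy code.]

```c
#include <assert.h>

/* Sketch of the '_npy_internal_' decoration idea: every symbol exported
   from the internal static library is spelled through a prefix macro,
   so it cannot clash with symbols from other extensions loaded into the
   same process. All names here are hypothetical. */
#define NPY_INTERNAL(name) _npy_internal_##name

/* On GCC-compatible toolchains the symbol can additionally be given
   hidden visibility, so it never escapes the final .so at all. */
#if defined(__GNUC__)
#define NPY_INTERNAL_LINKAGE __attribute__((visibility("hidden")))
#else
#define NPY_INTERNAL_LINKAGE
#endif

/* This definition would live in the internal static library... */
NPY_INTERNAL_LINKAGE int NPY_INTERNAL(clip)(int x, int lo, int hi)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

/* ...and the Cython-facing wrapper would call it through the
   decorated name. */
int npy_public_clip(int x, int lo, int hi)
{
    return NPY_INTERNAL(clip)(x, lo, hi);
}
```

The point of the prefix is purely namespacing: even if another extension in the process exports a `clip`, the decorated `_npy_internal_clip` cannot collide with it.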

David
Nathaniel Smith
2015-10-06 19:04:59 UTC
Post by David Cournapeau
[...]
Indeed, I am not arguing against merging umath and multiarray.
Oh, okay :-).
Post by David Cournapeau
[...]
- when you want to cythonize either extension, it is much easier to
separate it: Cython for the CPython API, C for the rest.
I don't think this will help much, because I think we'll want to have
multiple cython files, and that we'll probably move individual
functions between being implemented in C and Cython (including utility
functions). So that means we need to solve the problem of mixing C and
Cython files inside a single library.

If you look at Stefan's PR:
https://github.com/numpy/numpy/pull/6408
it does solve most of these problems. It would help if Cython added a
few tweaks to officially support compiling multiple modules into one
.so, and I'm not sure whether the current code quite handles
initialization of the submodule correctly, but it's actually
surprisingly easy to make work.

(Obviously we won't want to go overboard here -- but the point of
removing the technical constraints is that then it frees us to pick
whatever arrangement makes the most sense, instead of deciding based
on what makes the build system and linker easiest.)
Post by David Cournapeau
- if numpy.core.multiarray.so is built as cython-based .o + a 'large' C
static library, it should become much simpler to support static linking.
I don't see this at all, so I must be missing something? Either way
you build a bunch of .o files, and then you have to either combine
them into a shared library or combine them into a static library. Why
does pre-combining some of them into a static library make this
easier?
Post by David Cournapeau
- maybe that's just personal, but I find the whole multiarray + umath
codebase barely manageable in terms of intertwined complexity. You may
argue it is not that big, and we all have different preferences in terms
of organization, but if I look at the binary size of multiarray + umath,
it is considerably larger than the median size of the .so files I have
in my /usr/lib.
The binary size isn't a good measure here -- most of that is the
bazillions of copies of slightly tweaked loops that we auto-generate,
which take up a lot of space but don't add much intertwined
complexity. (Though now that I think about it, my LOC estimate was
probably a bit low because cloc is probably ignoring those
autogeneration template files.)

We definitely could do a better job with our internal APIs -- I just
think that'll be easiest if everything is in the same directory so
there are minimal obstacles to rearranging and refactoring things.

Anyway, it sounds like we agree that the next step is to merge
multiarray and umath, so possibly we should worry about doing that and
then see what makes sense from there :-).

-n
--
Nathaniel J. Smith -- http://vorpus.org
Charles R Harris
2015-10-06 19:19:41 UTC
Post by Nathaniel Smith
[...]
Anyway, it sounds like we agree that the next step is to merge
multiarray and umath, so possibly we should worry about doing that and
then see what makes sense from there :-).
What about removing the single file build? That seems somewhat orthogonal
to this discussion. Would someone explain to me the advantages of the
single file build for static linking, apart from possibly doing a better
job of hiding symbols? If symbols are the problem, is there not a solution
we could implement?

Chuck
Nathaniel Smith
2015-10-08 07:10:17 UTC
On Tue, Oct 6, 2015 at 12:19 PM, Charles R Harris
[...]
Post by Nathaniel Smith
Anyway, it sounds like we agree that the next step is to merge
multiarray and umath, so possibly we should worry about doing that and
then see what makes sense from there :-).
What about removing the single file build? That seems somewhat orthogonal to
this discussion.
We seem to also have consensus about removing the single file build,
but yeah, it's orthogonal -- notice the changed subject line in this
subthread :-).
Would someone explain to me the advantages of the single
file build for static linking, apart from possibly doing a better job of
hiding symbols? If symbols are the problem, is there not a solution we
could implement?
Hiding symbols is the only advantage that I'm aware of, and as noted
in the other thread there do exist other solutions. The only thing is
that we can't be absolutely certain these tools will work until
someone who needs static builds actually tries it -- the tools
definitely exist on regular linux, but IIUC the people who need static
builds are generally on really weird architectures that we can't test
ourselves. Or for all I know the weird architectures have finally
added shared linking and no-one uses static builds anymore. I think we
need to just try dropping it and see.
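[Editor's note: the symbol-hiding tools referred to above are typically compiler visibility controls. A minimal sketch of the idea, with hypothetical function names, is below.]

```c
#include <assert.h>

/* Sketch of hidden-by-default visibility as a replacement for the
   single-file build's symbol hiding: helpers stay ordinary extern
   functions shared across compilation units, but are marked hidden so
   they never appear in the shared object's dynamic symbol table; only
   the deliberately exported entry point (PyInit_* in a real extension)
   is left visible. Names here are hypothetical. */
#if defined(__GNUC__)
#define NPY_HIDDEN  __attribute__((visibility("hidden")))
#define NPY_VISIBLE __attribute__((visibility("default")))
#else
#define NPY_HIDDEN
#define NPY_VISIBLE
#endif

/* internal helper: callable from any .c file in the extension, but
   invisible to other shared objects in the process */
NPY_HIDDEN int helper_square(int x)
{
    return x * x;
}

/* the one symbol we actually want exported */
NPY_VISIBLE int module_entry(int x)
{
    return helper_square(x) + 1;
}
```

Building with `-fvisibility=hidden` makes hidden the default, so only explicitly marked symbols are exported; whether this works on the unusual platforms mentioned above is exactly the open question.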

-n
--
Nathaniel J. Smith -- http://vorpus.org
Daniele Nicolodi
2015-10-08 10:44:24 UTC
Hello,

sorry for replying in the wrong thread, but I don't find an appropriate
message to reply to in the original one.
Post by Nathaniel Smith
Hiding symbols is the only advantage that I'm aware of, and as noted
in the other thread there do exist other solutions.
Indeed, and those are way easier than maintaining the single file build.
Post by Nathaniel Smith
The only thing is
that we can't be absolutely certain these tools will work until
someone who needs static builds actually tries it -- the tools
definitely exist on regular linux, but IIUC the people who need static
builds are generally on really weird architectures that we can't test
ourselves. Or for all I know the weird architectures have finally
added shared linking and no-one uses static builds anymore. I think we
need to just try dropping it and see.
I don't really see how building from a single source file or multiple
source files affects the linking of a static library. Can you be more
precise about what the problems are? The only thing I can think of
is instructing distutils to do the right thing, but that should not be a
stopper.

Cheers,
Daniele
David Cournapeau
2015-10-08 13:30:03 UTC
Post by Nathaniel Smith
[...]
I don't think this will help much, because I think we'll want to have
multiple cython files, and that we'll probably move individual
functions between being implemented in C and Cython (including utility
functions). So that means we need to solve the problem of mixing C and
Cython files inside a single library.
Separating the pure C code into a static lib is the simple way of achieving
the same goal. Essentially, you write:

# implemented in npyinternal.a
_npy_internal_foo(...)

# implemented in merged_multiarray_umath.pyx
cdef PyArray_Foo(...):
    # use _npy_internal_foo()

then our merged_multiarray_umath.so is built by linking the .pyx and the
npyinternal.a together. IOW, the static link is internal.

Going through npyinternal.a instead of just linking .o from pure C and
Cython together gives us the following:

1. the .a can just use normal linking strategies instead of the awkward
capsule thing. Those are easy to get wrong when using cython, as you may
end up with multiple internal copies of the wrapped object inside the
capsule, causing hard-to-track bugs (this is what we wasted most of the
time on w/ Stefan and Kurt during ds4ds)
2. the only public symbols in the .a are the ones needed by the cython
wrapping, and since those are decorated with npy_internal, clashes are
unlikely to happen
3. since most of the code is already in the .a internally, supporting
static linking should be simpler, since the only difference is how you
statically link the cython-generated code. Because of 1, you are also
less likely to cause nasty surprises when putting everything together.
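[Editor's note: the capsule-style API sharing in point 1 can be sketched in plain C, leaving out the actual PyCapsule calls. All names below are made up for illustration.]

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of capsule-style API sharing: one module exposes a table of
   function pointers through an opaque pointer (PyCapsule_New in the
   real mechanism), and every consumer stores its own copy of that
   pointer. When several generated Cython modules each run the import
   step independently, you can end up with multiple internal copies of
   the table pointer -- the hard-to-track-bug scenario described above.
   All names here are hypothetical. */
typedef struct {
    int (*add)(int, int);
} npy_api_table;

static int impl_add(int a, int b)
{
    return a + b;
}

static const npy_api_table exported_table = { impl_add };

/* provider side: what PyCapsule_GetPointer would hand back */
const npy_api_table *npy_get_api(void)
{
    return &exported_table;
}

/* consumer side: each importing module keeps its own static pointer,
   which is the duplication that plain static linking avoids */
static const npy_api_table *api = NULL;

int npy_import_api(void)
{
    api = npy_get_api();
    return api != NULL;
}

int npy_use_add(int a, int b)
{
    return api->add(a, b);
}
```

With everything linked into one object (the npyinternal.a approach), consumers call `_npy_internal_`-prefixed functions directly and the import step, with its copy of the pointer, disappears.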

When you cythonize umath/multiarray, you need to do most of the underlying
work anyway.

I don't really care if the files are in the same directory or not, we can
keep things as they are now.

David
Julian Taylor
2015-10-08 17:06:09 UTC
On Tue, Oct 6, 2015 at 11:52 AM, David Cournapeau
Post by David Cournapeau
[...]
Going through npyinternal.a instead of just linking .o from pure C and
Cython together gives us the following:
1. the .a can just use normal linking strategies instead of the
awkward capsule thing. Those are easy to get wrong when using cython as
you may end up with multiple internal copies of the wrapped object
inside capsule, causing hard to track bugs (this is what we wasted most
of the time on w/ Stefan and Kurt during ds4ds)
2. the only public symbols in .a are the ones needed by the cython
wrapping, and since those are decorated with npy_internal, clashes are
unlikely to happen
3. since most of the code is already in .a internally, supporting the
static linking should be simpler since the only difference is how you
statically link the cython-generated code. Because of 1, you are also
less likely to cause nasty surprises when putting everything together.
I don't see why static libraries for internals are being discussed at all.
There is not much difference between an .a (archive) file and an .o
(object) file: what you call a static library is just a collection of
object files with an index slapped on top for faster lookup.
Whether a symbol is exported or not is defined in the object file, not
the archive file, so in this regard a static library versus a collection
of .o files makes no difference.
So our current system also produces a library; the only thing that's
"missing" is bundling it into an archive via ar cru *.o

I also don't see how pycapsule plays a role in this. You don't need
pycapsule to link a bunch of object files together.

So for me the issue is simply: what is easier with distutils --
getting the list of object files to link against the cython file, or first
creating a static library from the list of object files and linking that
against the cython object?
I don't think either way should be particularly hard. So there is not
really much to discuss. Do whatever is easier or results in nicer code.


As for adding cython to numpy, I'd start with letting a cython file
provide the multiarraymodule init function, with all regular numpy object
files linked into that thing. Then we have a pyx file with minimal bloat
to get started, and it should also be independent of merging umath (which
I'm in favour of).
When that single pyx module file gets too large, probably concatenating
multiple files together could work until cython supports a split
util/user-code build.
Nathaniel Smith
2015-10-08 19:47:56 UTC
On Oct 8, 2015 06:30, "David Cournapeau" <***@gmail.com> wrote:
[...]
Post by David Cournapeau
[...]
1. the .a can just use normal linking strategies instead of the awkward
capsule thing. Those are easy to get wrong when using cython as you may end
up with multiple internal copies of the wrapped object inside capsule,
causing hard to track bugs (this is what we wasted most of the time on w/
Stefan and Kurt during ds4ds)

Check out Stéfan's branch -- it just uses regular linking to mix cython and
C. I know what you mean about the capsule thing, and I think we shouldn't
use it at all. With a few tweaks, you can treat cython-generated .c files
just like regular .c files (except for the main module file, which if we
port it to cython then we just compile like a regular cython file).

-n
David Cournapeau
2015-10-08 20:07:05 UTC
Post by Nathaniel Smith
[...]
Check out Stéfan's branch -- it just uses regular linking to mix cython
and C.
I know, we worked on this together after all ;)

My suggested organisation is certainly not mandatory; I was not trying to
claim otherwise, sorry if that was unclear.

At that point, I guess the consensus is that I have to prove my suggestion
is useful. I will take a few more hours to submit a PR with the umath
conversion (maybe merging w/ the work from Stéfan). I discovered on my
flight back that you can call PyModule_Init multiple times for a given
module, which is useful while we do the transition C->Cython for the module
initialization (it is not documented as possible, so I would not rely on
it for long either).

David
Nathaniel Smith
2015-10-09 06:14:55 UTC
Post by David Cournapeau
[...]
At that point, I guess the consensus is that I have to prove my suggestion
is useful. I will take a few more hours to submit a PR with the umath
conversion (maybe merging w/ the work from Stéfan).
Okay! Still not sure what capsules have to do with anything, but I
guess the PR will either make it clear or else make it clear that it
doesn't matter :-).
Post by David Cournapeau
I discovered on my
flight back that you can call PyModule_Init multiple times for a given
module, which is useful while we do the transition C->Cython for the module
initialization (it is not documented as possible, so I would not rely on
it for long either).
Oh, right, Stefan mentioned something about this... PyModule_Init is a
python-2-only thing, so whatever it does now is what it will do
forever and ever amen. But I can't think of any good reason to call it
twice -- if your goal is just to get a reference to the new module,
then once PyModule_Init has run once, you can just run
PyImport_ImportModule (assuming you know your fully-qualified module
name).

-n
--
Nathaniel J. Smith -- http://vorpus.org