Travis Oliphant
2015-09-13 00:01:53 UTC
Hey all,
To the NumPy list only, I'll at least give the highlights of the surgical
approach I would like to get someone to work on -- I can help mentor and
guide. These are just the highlights, but it should give someone familiar
with the code the general gist. There are some details to work out, of
course, but it could be done.
It may be very similar to what Nathaniel is contemplating, except that I
think breaking the ABI is the only way to really do this. I could be wrong,
but I'm not willing to risk *not* breaking the ABI.
1) Create a new meta-type in C (call it dtype)
2) Create Python Classes (in C) that are instances of this meta-type for
each "kind" of data-type
3) Make PyArray_Descr * be a reference to one of these new objects (which
can be built either in C or Python); these objects should be published
outside NumPy as well.
4) Remove most of the "per-type function calls" in PyArray_ArrFuncs ---
instead replacing those with the Generalized Ufunc equivalents and expand
the capability of Generalized Ufuncs
5) Keep the Array Scalar Types but change them so that they also use the
dtype meta-type as their foundation and mix in an array-methods type.
Also, have these live in a separate project from NumPy itself.
6) The current void* would be replaced with real Python classes instead of
structured arrays being shoved through a single data-type.
7) The documented ways to spell a dtype would be reduced --- but backwards
compatibility would be preserved.
8) Make sure Numba can create these Descriptor objects with Ahead of Time
Compilation and start to move code of NumPy to Numba
9) Ensure the Generalized Ufunc framework can take the data-type as an
argument so that *all* data-types can participate in the general
multi-method approach.
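
To make items 1, 2 and 9 a bit more concrete, here is a rough pure-Python
sketch of the shape of the idea. It is only an illustration: the proposal
is about C-level types, and every name below (DTypeMeta, Float, Int,
implements, call) is hypothetical and not part of any NumPy API. It shows
a meta-type whose instances are the per-kind dtype classes, and a tiny
multi-method table that picks an implementation from the dtype classes of
the operands.

```python
# Hypothetical sketch only: none of these names exist in NumPy.


class DTypeMeta(type):
    """Stand-in for the proposed 'dtype' meta-type (item 1)."""

    registry = {}  # maps a kind character to its dtype class

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        kind = namespace.get("kind")
        if kind is not None:
            DTypeMeta.registry[kind] = cls


class Float(metaclass=DTypeMeta):
    """One class per 'kind' of data-type (item 2); instances are descriptors."""

    kind = "f"

    def __init__(self, itemsize=8):
        self.itemsize = itemsize


class Int(metaclass=DTypeMeta):
    kind = "i"

    def __init__(self, itemsize=8):
        self.itemsize = itemsize


# Multi-method table keyed on (operation name, dtype classes) (item 9).
_impls = {}


def implements(name, *dtype_classes):
    """Register an implementation of `name` for the given dtype classes."""

    def decorator(func):
        _impls[(name, dtype_classes)] = func
        return func

    return decorator


def call(name, *operands):
    """Dispatch on the dtype class of each (descriptor, values) operand."""
    key = (name, tuple(type(desc) for desc, _ in operands))
    try:
        impl = _impls[key]
    except KeyError:
        raise TypeError("no implementation of %r for %r" % key) from None
    return impl(*operands)


@implements("add", Float, Float)
def _add_float(a, b):
    (desc_a, values_a), (desc_b, values_b) = a, b
    # The result descriptor uses the wider of the two item sizes.
    out = Float(max(desc_a.itemsize, desc_b.itemsize))
    return out, [x + y for x, y in zip(values_a, values_b)]


# Usage: the dtype classes are instances of the meta-type, and the
# operation is looked up from the descriptors that are passed in.
result_desc, result = call(
    "add", (Float(4), [1.0, 2.0]), (Float(8), [3.0, 4.0])
)
```

The point of keying the table on dtype classes rather than on a fixed
per-type function slot is that a data-type defined outside NumPy could
register itself the same way; how that would map onto Generalized Ufuncs
in C is one of the details left to work out.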
There is more to it, but that is the basic idea. Please forgive me if I
can't respond to any feedback from the list in a timely way. I will as I
can.
-Travis
--
*Travis Oliphant*
*Co-founder and CEO*
@teoliphant
512-222-5440
http://www.continuum.io