Discussion:
[Numpy-discussion] Multidimension array access in C via Python API
mpc
2016-04-04 17:35:45 UTC
Hello,

Is there a C-API function for numpy that can implement Python's
multidimensional indexing?

For example, if I had a 2d array:

PyArrayObject * M;

and an index

int i;

how do I extract the i-th row M[i,:] or i-th column M[:,i]?

Ideally it would return another PyArrayObject* (not a newly allocated
copy, but a view whose data points to the correct memory locations of M).

I've searched everywhere in the API documentation, Google, and SO, but no
luck.

Any help is greatly appreciated.

Thank you.

-Matthew



Eric Moore
2016-04-04 19:44:51 UTC
/* obj[ind] */
PyObject* DoIndex(PyObject* obj, int ind)
{
    PyObject *oind, *ret;

    oind = PyLong_FromLong(ind);
    if (!oind) {
        return NULL;
    }
    ret = PyObject_GetItem(obj, oind);
    Py_DECREF(oind);
    return ret;
}

/* obj[inds[0], inds[1], ... inds[n_ind-1]] */
PyObject* DoMultiIndex(PyObject* obj, int *inds, int n_ind)
{
    PyObject *ret, *oind, *temp;

    oind = PyTuple_New(n_ind);
    if (!oind) {
        return NULL;
    }
    for (int k = 0; k < n_ind; ++k) {
        temp = PyLong_FromLong(inds[k]);
        if (!temp) {
            /* clean up the partially built tuple */
            Py_DECREF(oind);
            return NULL;
        }
        PyTuple_SET_ITEM(oind, k, temp);  /* steals the reference to temp */
    }
    ret = PyObject_GetItem(obj, oind);
    Py_DECREF(oind);
    return ret;
}

/* obj[b:e:step] */
PyObject* DoSlice(PyObject* obj, int b, int e, int step)
{
    PyObject *oind, *ret, *ob, *oe, *ostep;

    ob = PyLong_FromLong(b);
    if (!ob) {
        return NULL;
    }
    oe = PyLong_FromLong(e);
    if (!oe) {
        Py_DECREF(ob);
        return NULL;
    }
    ostep = PyLong_FromLong(step);
    if (!ostep) {
        Py_DECREF(ob);
        Py_DECREF(oe);
        return NULL;
    }
    oind = PySlice_New(ob, oe, ostep);
    Py_DECREF(ob);
    Py_DECREF(oe);
    Py_DECREF(ostep);

    if (!oind) {
        return NULL;
    }
    ret = PyObject_GetItem(obj, oind);
    Py_DECREF(oind);
    return ret;
}
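
For reference, a minimal Python sketch of what these helpers compute (M and
the index values are just example data):

import numpy as np

M = np.arange(12).reshape(3, 4)   # sample 2-D array

row  = M[1]        # what DoIndex(M, 1) returns: the second row
elem = M[1, 2]     # DoMultiIndex(M, inds, 2) with inds = {1, 2}
part = M[0:3:2]    # DoSlice(M, 0, 3, 2)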

-Eric
mpc
2016-04-04 19:59:17 UTC
Thanks for responding.

It looks like you made/found these yourself, since I can't find anything
like this in the API. I can't believe it isn't there, it's so convenient!

By the way, from what I understand, the ':' is represented as
PySlice_New(NULL, NULL, NULL) in the C API when accessing by index,
correct?


Therefore the final result will be something like:

PyObject* first_column_tuple = PyTuple_New(2);
PyTuple_SET_ITEM(first_column_tuple, 0, PySlice_New(NULL, NULL, NULL));
PyTuple_SET_ITEM(first_column_tuple, 1, PyInt_FromLong(0));
PyObject* first_column_buffer = PyObject_GetItem(src_buffer, first_column_tuple);
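
For comparison, a minimal Python sketch of the same access (src_buffer here
is just a stand-in array):

import numpy as np

src_buffer = np.zeros((5, 3))            # stand-in for the real buffer
first_column_buffer = src_buffer[:, 0]   # a view, same result as the C above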



Eric Moore
2016-04-04 21:14:27 UTC
Yes, PySlice_New(NULL, NULL, NULL) is the same as ':'. Depending on what
exactly you want to do with the column once you've extracted it, this may
not be the best way to do it. Are you absolutely certain that you actually
need a PyArrayObject that points to the column?

Eric
mpc
2016-04-04 20:28:20 UTC
I think that I do, since I intend to do array-specific operations on the
resulting column of data, e.g.:

PyArray_Min
PyArray_Max

which require a PyArrayObject argument.

I also plan to use PyArray_Where to find individual point locations in
data columns x,y,z within a 3D range, but it doesn't look like it needs a
PyArrayObject.



Nathaniel Smith
2016-04-04 22:57:24 UTC
Post by mpc
PyObject* first_column_tuple = PyTuple_New(2);
PyTuple_SET_ITEM(first_column_tuple, 0, PySlice_New(NULL, NULL, NULL));
PyTuple_SET_ITEM(first_column_tuple, 1, PyInt_FromLong(0));
PyObject* first_column_buffer = PyObject_GetItem(src_buffer, first_column_tuple);
If this is what your code looks like, then I strongly suspect you'll be
better off writing it in cython or even just plain python. The above code
won't run any faster than the equivalent code in python, but it's much
harder to read and write...

-n
mpc
2016-04-05 15:39:39 UTC
This is the reason I'm doing this in the first place: I made a pure
python version, but it runs really slow for larger data sets, so I'm
basically rewriting the same function using the Python and Numpy C API.
But if you're saying it won't run any faster, then maybe I'm going about
it the wrong way. (Why use the C function version if it's the same speed
anyway?)

You're suggesting perhaps a cython approach, or perhaps a strictly C/C++
approach given the raw data?



Eric Moore
2016-04-05 17:17:42 UTC
It's difficult to say why your code is slow without seeing it, i.e. are you
generating large temporaries? Or doing loops in python that can be pushed
down to C via vectorizing? It may or may not be necessary to leave python
to get things to run fast enough.

-Eric
mpc
2016-04-05 16:48:14 UTC
The idea is that I want to thin a large 2D buffer of x,y,z points to a given
resolution by dividing the data into equal sized "cubes" (i.e. resolution is
number of cubes along each axis) and averaging the points inside each cube
(if any).


# Fill up buffer data for demonstration purposes with initial buffer of
# size 10,000,000 to reduce to 1,000,000
size = 10000000
buffer = np.ndarray(shape=(size, 3), dtype=np.float)
# fill it up
buffer[:, 0] = np.random.ranf(size)
buffer[:, 1] = np.random.ranf(size)
buffer[:, 2] = np.random.ranf(size)

# Create result buffer to size of cubed resolution (i.e. 100 ** 3 = 1,000,000)
resolution = 100
thinned_buffer = np.ndarray(shape=(resolution ** 3, 3), dtype=np.float)

# Trying to convert the following into C to speed it up
x_buffer = buffer[:, 0]
y_buffer = buffer[:, 1]
z_buffer = buffer[:, 2]
min_x = x_buffer.min()
max_x = x_buffer.max()
min_y = y_buffer.min()
max_y = y_buffer.max()
min_z = z_buffer.min()
max_z = z_buffer.max()
x_block = (max_x - min_x) / resolution
y_block = (max_y - min_y) / resolution
z_block = (max_z - min_z) / resolution

current_idx = 0
x_idx = min_x
while x_idx < max_x:
    y_idx = min_y
    while y_idx < max_y:
        z_idx = min_z
        while z_idx < max_z:
            inside_block_points = np.where((x_buffer >= x_idx) &
                                           (x_buffer <= x_idx + x_block) &
                                           (y_buffer >= y_idx) &
                                           (y_buffer <= y_idx + y_block) &
                                           (z_buffer >= z_idx) &
                                           (z_buffer <= z_idx + z_block))
            if inside_block_points[0].size > 0:
                mean_point = buffer[inside_block_points[0]].mean(axis=0)
                thinned_buffer[current_idx] = mean_point
                current_idx += 1
            z_idx += z_block
        y_idx += y_block
    x_idx += x_block
return thinned_buffer



Benjamin Root
2016-04-05 17:56:28 UTC
You might do better using scipy.spatial. It has very useful data structures
for handling spatial coordinates. I am not exactly sure how to use them for
this specific problem (not a domain expert), but I would imagine that the
QHull wrappers there might give you some useful tools.

Ben Root
Chris Barker
2016-04-05 18:16:14 UTC
Post by mpc
The idea is that I want to thin a large 2D buffer of x,y,z points to a given
resolution by dividing the data into equal sized "cubes" (i.e. resolution is
number of cubes along each axis) and averaging the points inside each cube
(if any).
are the original x,y,z points arranged along a nice even grid? or
arbitrarily spaced?

if the former, I have Cython code that does that :-) I could dig it up,
haven't used it in a while. or scikit.image might have something.

If the latter, then Ben is right -- you NEED a spatial index --
scipy.spatial.kdtree will probably do what you want, though it would be
easier to use a sphere to average over than a cube.

Also, maybe Kernel Density Estimation could help here????

https://jakevdp.github.io/blog/2013/12/01/kernel-density-estimation/

Otherwise, you could use Cython to write a non-vectorized version of your
code below -- it would be order N*M, where N is the number of "cubes" and M
is the number of original points, I think, but it would be a lot faster than
the pure python.

-CHB
Post by mpc
inside_block_points = np.where((x_buffer >= x_idx) &
                               ... &
                               (z_buffer <= z_idx + z_block))
instead of where, you could loop through all your points and find the ones
inside your extents.

though now that I think about it -- you are mapping arbitrary points to a
regular grid, so you only need to go through the points once, assigning
each one to a bin, and then compute the average in each bin.

Is this almost a histogram?
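
Something like the vectorized sketch below, built on np.histogramdd
(illustrative names, and it assumes the (N, 3) buffer from earlier in the
thread):

import numpy as np

def thin_by_histogram(buffer, resolution):
    # per-cube point counts on a resolution**3 grid
    counts, _ = np.histogramdd(buffer, bins=resolution)
    # per-cube coordinate sums: one weighted histogram per axis
    sums = np.stack([np.histogramdd(buffer, bins=resolution,
                                    weights=buffer[:, k])[0]
                     for k in range(3)], axis=-1)
    occupied = counts > 0
    return sums[occupied] / counts[occupied][:, None]

thinned = thin_by_histogram(np.random.rand(100000, 3), 10)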

-CHB
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

***@noaa.gov
mpc
2016-04-05 17:55:22 UTC
The points are indeed arbitrarily spaced, and yes I have heard tale of using
spatial indices for this sort of problem, and it looks like that would be
the best bet for me. Thanks for the other suggestions as well!



mpc
2016-04-05 18:09:40 UTC
This wasn't intended to be a histogram, but you're right that it would be
much better if I could just go through each point once and bin the results;
that makes more sense, thanks!



Eric Moore
2016-04-05 19:31:03 UTC
def reduce_data(buffer, resolution):
    thinned_buffer = np.zeros((resolution**3, 3))

    min_xyz = buffer.min(axis=0)
    max_xyz = buffer.max(axis=0)
    delta_xyz = max_xyz - min_xyz

    inds_xyz = np.floor(resolution * (buffer - min_xyz) / delta_xyz).astype(int)

    # handle values right at the max
    inds_xyz[inds_xyz == resolution] -= 1

    # convert to linear indices so that we can use np.add.at
    # (inds_lin is a view of the first column, updated in place)
    inds_lin = inds_xyz[:, 0]
    inds_lin += inds_xyz[:, 1] * resolution
    inds_lin += inds_xyz[:, 2] * resolution**2

    np.add.at(thinned_buffer, inds_lin, buffer)
    counts = np.bincount(inds_lin, minlength=resolution**3)

    thinned_buffer[counts != 0, :] /= counts[counts != 0, None]
    return thinned_buffer
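
A minimal usage sketch, with a random buffer like the one from the earlier
post:

import numpy as np

size = 10000000
buffer = np.random.ranf((size, 3))
thinned = reduce_data(buffer, 100)   # 1e7 points into at most 100**3 cubes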


The bulk of the time is spent in np.add.at, so just over 5 s here with your
1e7 to 1e6 example.
Eric Moore
2016-04-05 19:42:50 UTC
Eh. The order of the outputs will be different from your code's, if that
makes a difference.
mpc
2016-04-05 18:56:49 UTC
That's a very clever approach. I also found a way using the pandas library
with the groupby function.

points_df = pandas.DataFrame.from_records(buffer)
new_buffer = points_df.groupby(pandas.qcut(points_df.index, resolution**3)).mean()


I did the original approach with all of those loops because I need a way to
measure and report on progress, and although these advanced functions are
great, they are still blocking and provide no means of indicating progress.

Still cool though, thanks!



Sebastian Berg
2016-04-05 18:19:56 UTC
Post by mpc
The idea is that I want to thin a large 2D buffer of x,y,z points to a given
resolution by dividing the data into equal sized "cubes" (i.e. resolution is
number of cubes along each axis) and averaging the points inside each cube
(if any).
Another point is timing your actual code; in this case you could have
noticed that all the time is spent in the while loops and little time in
those min/max calls before.

Algorithms, or what you do, is the other thing. In the end, it seems
your code is just a high dimensional histogram. Though I am not sure if
numpy's histogram is fast, I am sure it vastly outperforms this, and if
you are interested in how it does this, you could even check its code;
it is just in python (though numpy internally always has quite a lot of
fun boilerplate to take care of corner cases).

And if you search for what you want to do first, you may find faster
solutions easily; batteries included and all, there are a lot of tools
out there. The other point is: don't optimize much if you don't know
exactly what you need to optimize.

- Sebastian
Sebastian Berg
2016-04-05 18:29:22 UTC
Post by Sebastian Berg
In the end, it seems your code is just a high dimensional histogram.
Though I am not sure if numpy's histogram is fast, I am sure it vastly
outperforms this ...
Hmm, well maybe not quite, but it seems similar to a weighted histogram.
Antoine Pitrou
2016-04-05 17:24:30 UTC
Post by mpc
(Why use the C function version if it's the same speed anyway?)
The Python and Numpy C API are generally not very user-friendly
compared to Python code, even hand-optimized.

Cython will let you write code that looks quite close to normal Python
code, but with additional annotations for better performance. Or you
can try Numba, a just-in-time compiler for scientific code that
understands Numpy arrays:
http://numba.pydata.org/
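
As a rough illustration (the function and argument names are hypothetical),
the cube-binning loop from earlier in the thread might look like this under
Numba:

import numpy as np
from numba import njit

@njit
def bin_points(buffer, resolution, min_xyz, inv_block):
    # one pass over the points: accumulate per-cube sums and counts
    sums = np.zeros((resolution ** 3, 3))
    counts = np.zeros(resolution ** 3, dtype=np.int64)
    for n in range(buffer.shape[0]):
        idx = 0
        for k in range(3):
            i = int((buffer[n, k] - min_xyz[k]) * inv_block[k])
            if i == resolution:       # point sitting exactly on the max edge
                i = resolution - 1
            idx = idx * resolution + i
        counts[idx] += 1
        for k in range(3):
            sums[idx, k] += buffer[n, k]
    return sums, counts

buffer = np.random.rand(1000000, 3)
resolution = 100
min_xyz = buffer.min(axis=0)
inv_block = resolution / (buffer.max(axis=0) - min_xyz)
sums, counts = bin_points(buffer, resolution, min_xyz, inv_block)
means = sums[counts > 0] / counts[counts > 0, None]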

Regards

Antoine.
Stefan Seefeld
2016-04-05 17:42:24 UTC
Post by Antoine Pitrou
Cython will let you write code that looks quite close to normal Python
code, but with additional annotations for better performance. Or you
can try Numba, a just-in-time compiler for scientific code that
understands Numpy arrays.
And just for the record: there is also Boost.Python, if you already have
a C++ library you want to bind to Python. There is even a Boost.NumPy
project, which, however, isn't formally part of Boost yet. Still, it may
be an inspiration:

http://stefanseefeld.github.io/boost.python/doc/html/index.html
https://github.com/ndarray/Boost.NumPy

Regards,
Stefan
--
...ich hab' noch einen Koffer in Berlin... (I still have a suitcase in Berlin)
Nathaniel Smith
2016-04-05 18:03:10 UTC
Post by mpc
But if you're saying it won't run any faster, then maybe I'm going about
it the wrong way. (Why use the C function version if it's the same speed
anyway?)
I haven't had a chance to look carefully at the code you posted, but the
useful general rule is that code does not magically become faster by
writing it in C -- it only becomes faster if by writing it in C you are
able to write it in a way that you couldn't from python. Those high-level
Numpy C functions you started out using from C *are* the code that you're
using from python (e.g. writing arr1 + arr2 essentially just calls
PyArray_Add and so forth); there's a little bit of pure overhead from
python itself but it's surprisingly small. (I've seen estimates of 10-20%
for regular python code, and it should be much less than that for code like
yours that's spending most of its time inside numpy.)

Where you can really win is places where you go outside the Numpy C API --
replacing an element-by-element loop with C, that sort of thing.
Post by mpc
You're suggesting perhaps a cython approach, or perhaps a strictly C/C++
approach given the raw data?
Where cython shines is when you have a small amount of code that really
needs to be in C, and it's surrounded by / mixed in with code where regular
numpy operations are fine. Or just in general it's great for taking care of
all the annoying boilerplate involved in getting between python and C.

First though you need some idea of what, algorithmically, you want to
change about your implementation that will make it go faster, and why C/C++
might or might not help with implementing this strategy :-).

-n