Discussion:
[Numpy-discussion] Python 3 dict support (issue #5718)
Hannah
2016-07-20 02:28:04 UTC
Hi,
I started venturing down the rabbit hole of trying to write a patch to add
support for numpy to convert Python 3 dictionary keys
(collections.abc.MappingView objects), which is open issue #5718, and am
having trouble orienting myself. I'm unclear as to where the Python entry
point into array is (basically, what function np.array drops into and whether
this is in Python or C) and where, and in what language (fine with writing
either), a patch that supports MappingViews would go. Any help getting oriented
would be much appreciated.
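
For context, here is a minimal sketch of why dictionary views are awkward for sequence-style conversion. It uses only the standard library, and the variable names are illustrative. A Python 3 keys view is iterable and sized, but it is not a sequence and cannot be indexed.

```python
from collections.abc import KeysView, MappingView, Sequence

d = {"a": 1, "b": 2, "c": 3}
keys = d.keys()

# A keys view is an instance of the view ABCs in collections.abc ...
is_keys_view = isinstance(keys, KeysView)
is_mapping_view = isinstance(keys, MappingView)

# ... but it is not a Sequence: it has no __getitem__, so keys[0] fails.
is_sequence = isinstance(keys, Sequence)
try:
    keys[0]
    indexable = True
except TypeError:
    indexable = False

# Materializing the view first gives an ordinary list.
as_list = list(keys)

print(is_keys_view, is_mapping_view, is_sequence, indexable, as_list)
```

Code that discovers array shape through the sequence protocol therefore has nothing to index into unless the view is materialized or handled explicitly.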

The reasoning for the patch is that dicts are one of the most common
Python datatypes, and this specifically comes from an upstream issue of
wanting dict support in matplotlib.

Thanks,
Hannah
Jaime Fernández del Río
2016-07-20 12:52:55 UTC
Hi Hannah,

np.array is written in C, and is part of the multiarray module that is
defined here:

https://github.com/numpy/numpy/blob/maintenance/1.11.x/numpy/core/src/multiarray/multiarraymodule.c

The "array" name is mapped here:

https://github.com/numpy/numpy/blob/maintenance/1.11.x/numpy/core/src/multiarray/multiarraymodule.c#L4093

to the function _array_fromobject defined here:

https://github.com/numpy/numpy/blob/maintenance/1.11.x/numpy/core/src/multiarray/multiarraymodule.c#L1557
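
As a quick check from the Python side, the sketch below asks whether np.array is a C-level builtin rather than a Python function. It assumes only that numpy is installed; `array_is_builtin` is an illustrative name.

```python
import inspect

import numpy as np

# Functions implemented in C are exposed as builtins and carry no
# Python source, which is why they cannot be stepped into from pdb.
array_is_builtin = inspect.isbuiltin(np.array)

print(type(np.array).__name__, array_is_builtin)
```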

That function does some checking and has a couple of fast paths for the
case where the input is already an array or a subclass, but for the general
case it relies on PyArray_CheckFromAny:

https://github.com/numpy/numpy/blob/maintenance/1.11.x/numpy/core/src/multiarray/ctors.c#L1848

which in turn calls PyArray_FromAny:

https://github.com/numpy/numpy/blob/maintenance/1.11.x/numpy/core/src/multiarray/ctors.c#L1674

You will also have to take a look at what goes on in
PyArray_GetArrayParamsFromObject, which gets called by PyArray_FromAny:

https://github.com/numpy/numpy/blob/maintenance/1.11.x/numpy/core/src/multiarray/ctors.c#L1428

as well as several other places, but I think they are all (or most of them)
in ctors.c.

You may also want to take a look at PyArray_FromIter, which is the function
that ultimately takes care of calls to np.fromiter:

https://github.com/numpy/numpy/blob/maintenance/1.11.x/numpy/core/src/multiarray/ctors.c#L3657
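
To illustrate the np.fromiter route from Python, here is a small sketch that converts dictionary keys without touching the C code. The dictionary contents are made up for the example, and it is only one possible workaround, not the patch discussed in this thread.

```python
import numpy as np

d = {1.5: "x", 2.5: "y", 4.0: "z"}

# np.fromiter consumes any iterable, but it needs an explicit dtype and
# always builds a 1-D array; count lets it allocate the result up front.
from_iter = np.fromiter(d.keys(), dtype=float, count=len(d))

# Materializing the view as a list lets np.array infer the dtype as usual.
from_list = np.array(list(d.keys()))

print(from_iter, from_list)
```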

It's messy, but not that bad once you get used to it: good luck!

Jaime
_______________________________________________
NumPy-Discussion mailing list
https://mail.scipy.org/mailman/listinfo/numpy-discussion
--
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.
Joseph Fox-Rabinovitz
2016-07-20 15:07:15 UTC
Jaime,

This is a great intro for people looking to jump into the C side of
things. I have been trying to figure out which bits are the important
ones from looking at the code and the docs. Your post cut out most of
the confusion. Is there some way you would consider adding something
like this to the docs?

-Joe


Hannah
2016-07-20 19:39:11 UTC
I second (and third & fourth &...) this

Thanks so much for this, Jaime, it's exactly what I was looking for and
couldn't find. Maybe this can be linked to in the contribute docs as an
"orienting yourself in the codebase" page?

Jaime Fernández del Río
2016-07-21 10:07:59 UTC
Glad to know it helped. Realizing how this works was one of the big leaps
in understanding the code base for me too. We (my wife and I) have
sent the kids on vacation with their grandparents, so when we are done
enjoying our newly recovered freedom hunting for Pokémon around Zurich,
I'll try to find time to write this up in a form suitable for the docs,
hopefully over the next couple of weeks.

I have been known to not keep commitments like this in the past, so don't
hold your breath just in case...

Jaime
Marten van Kerkwijk
2016-07-21 13:12:14 UTC
I know it is slightly obnoxious to invoke the "making a suggestion is
volunteering for it" rule, but usually a PR to the docs is best made by someone
who is trying to understand it rather than someone who already knows
everything...
-- Marten
Joseph Fox-Rabinovitz
2016-07-21 14:46:40 UTC
I agree with you, and in that spirit I started working on something. I
do welcome suggestions on where to place the new page though.

Regards,

-Joe


Marten van Kerkwijk
2016-07-21 16:09:15 UTC
Yes, indeed, where should this be!?

The logical place would be in the developer documentation, under
"understanding the code & getting started" [1], but that is right now just
a single paragraph.

Others should definitely correct me if wrong, but I *think* this simply
does not exist, and I would suggest moving that subsection to its own
section and making your contribution the first entry in it ("the path to
creating an array" or so). Hopefully, someone else will at some point add,
e.g., a section on how ufuncs work (I always get quite confused following
the long chain there...). Or even something as simple as copying an array
(where transposing axes can have major impacts on speed).
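
As a small illustration of the copying point, the sketch below shows that transposing only swaps strides, so a later copy into C order has to gather elements that sit far apart in memory. The array shape is arbitrary and no timings are claimed here.

```python
import numpy as np

a = np.zeros((300, 400))
t = a.T  # a view: same memory, strides swapped, no data copied

shares_memory = np.shares_memory(a, t)
t_is_c_contiguous = t.flags["C_CONTIGUOUS"]

# Copying the transposed view into C order cannot walk memory linearly,
# unlike copying `a` itself, which is already C-contiguous.
t_copy = np.ascontiguousarray(t)

print(a.strides, t.strides, t_copy.strides)
```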

All the best,

Marten

[1]
http://docs.scipy.org/doc/numpy-dev/dev/development_environment.html#understanding-the-code-getting-started
Joseph Fox-Rabinovitz
2016-07-21 17:10:22 UTC
I will do my best. I am not that familiar with rst or numpy docs, but
that's what PRs are for after all.

-Joe
