Jaime Fernández del Río
2016-03-26 20:16:13 UTC
Hi all,
I have just submitted a PR (#7464 <https://github.com/numpy/numpy/pull/7464>)
that implements an enhancement request (#6854
<https://github.com/numpy/numpy/issues/6854>), making np.bincount return an
array of the same type as the weights parameter. This is a significant
departure from current behavior, which always casts weights to double and
always returns a double array, so I would like to hear what others think
about whether the change is worthwhile. Main discussion points:
- np.bincount now works with complex weights (yay!). I guess this should
be a pretty uncontroversial enhancement.
- The return value has the same type as weights, which means that small
integer types are very likely to overflow. This is exactly what #6854
requested, but perhaps we should promote the output for integers to a
long, as we do in np.sum?
- Boolean arrays stay boolean, so the weights are ORed rather than summed.
Is this what one would want? If we decide that integer promotion is the way
to go, perhaps booleans should be promoted along with them?
- This new implementation currently supports all of the reasonable
native types, but has no fallback for user-defined types. I guess we
should attempt to cast the array to double, as before, if no native loop
can be found? It would be good to have a way of testing this, though. Any
thoughts on how to go about it?
- Does a behavior change like this require some deprecation period? What
would that look like?
- I have also added broadcasting of weights to the full size of the input
list, so that one can do e.g. np.bincount([1, 2, 3], weights=2j) without
having to tile the single weight to the size of the input list.
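
To make the points above concrete, here is a rough sketch of the semantics
under discussion. It is not the code in the PR: bincount_keep_dtype is a
hypothetical helper built on np.add.at, used only to show what "same type as
weights" means for complex, small-integer and boolean weights, and how a
scalar weight would broadcast.

```python
import numpy as np


def bincount_keep_dtype(x, weights, minlength=0):
    """Hypothetical helper (not the PR's implementation).

    Accumulates weights per bin in the dtype of weights, instead of
    casting everything to double.
    """
    x = np.asarray(x, dtype=np.intp)
    if x.size and x.min() < 0:
        raise ValueError("x must contain only non-negative integers")
    # Broadcast the weights to the shape of x, so a scalar weight is allowed.
    weights = np.broadcast_to(np.asarray(weights), x.shape)
    n_bins = max(minlength, int(x.max()) + 1) if x.size else minlength
    out = np.zeros(n_bins, dtype=weights.dtype)
    # Unbuffered in-place add: repeated indices in x all contribute.
    np.add.at(out, x, weights)
    return out


# Behavior described as current: integer weights come back as double.
current = np.bincount([1, 2, 3], weights=[1, 2, 3])
print(current.dtype)  # float64

# Complex weights with a broadcast scalar, as in the example above.
complex_counts = bincount_keep_dtype([1, 2, 3], 2j)
print(complex_counts.dtype)  # complex128

# Small integer types wrap around silently when a bin overflows.
int8_weights = np.array([100, 100], dtype=np.int8)
small = bincount_keep_dtype([0, 0], int8_weights)
print(small.dtype, small)

# np.sum, by contrast, promotes small integers before accumulating.
promoted = np.sum(int8_weights)
print(promoted)  # 200

# Booleans stay boolean: adding booleans is a logical OR.
flags = bincount_keep_dtype([0, 0, 1], np.array([True, True, False]))
print(flags.dtype, flags)
```

The int8 case is the one that worries me: the result has the requested type,
but the bin total is wrong, whereas np.sum on the same weights gives 200.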
Any other thoughts are very welcome as well!
Jaime
--
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.