Post by j***@gmail.com
On Wed, Oct 26, 2016 at 3:11 PM, Mathew S. Madhavacheril <
On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. Madhavacheril
Post by Stephan Hoyer
I wonder if the goals of this addition could be achieved by simply adding
an optional `cov` argument to np.corrcoef, which would provide a
pre-computed covariance.
Post by Mathew S. Madhavacheril
That's a fair suggestion which I'm happy to switch to. This eliminates the
need for two new functions. I'll add an optional `cov = False` argument to
numpy.corrcoef that returns a tuple (corr, cov) instead.
Post by Stephan Hoyer
Either way, `covcorr` feels like a helper function that could exist in
user code rather than numpy proper.
Post by Mathew S. Madhavacheril
The user would have to re-implement the part that converts the covariance
matrix to a correlation coefficient. I made this PR to avoid that code
duplication.
Post by Stephan Hoyer
With the API I was envisioning (or even your proposed API, for that
matter), this function would only be a few lines, e.g.,
    cov = np.cov(x)
    corr = np.corrcoef(x, cov=cov)
IIUC, if you have a covariance matrix then you can compute the
correlation matrix directly, without looking at 'x', so corrcoef(x,
cov=cov) is a bit odd-looking. I think the API that makes the most sense
is probably just to expose something like the covtocorr function (maybe
with a less telegraphic name?). And then, yeah, users can use that to
build their own covcorr or whatever if they want it.
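For illustration, the conversion being discussed can be sketched in a few lines. This is only a minimal sketch, not the code from the PR: the name `cov_to_corr` is hypothetical, and it assumes a square covariance matrix with strictly positive variances on the diagonal.

```python
import numpy as np

def cov_to_corr(cov):
    """Convert a covariance matrix to a correlation matrix (hypothetical helper)."""
    cov = np.asarray(cov, dtype=float)
    std = np.sqrt(np.diag(cov))        # per-variable standard deviations
    return cov / np.outer(std, std)    # corr[i, j] = cov[i, j] / (std[i] * std[j])

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 200))          # 3 variables, 200 observations each
cov = np.cov(x)
corr = cov_to_corr(cov)                # no second pass over x is needed
```

Note that `x` is not needed once `cov` is available, which is the point made above about `corrcoef(x, cov=cov)` looking odd.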
Right, agreed, this is why I said `x` becomes redundant when `cov` is
specified
1) Have `np.corrcoef` accept a boolean optional argument `covmat = False`
that lets one obtain a tuple containing the covariance and the correlation
matrices in the same call.
2) Modify my original PR so that `np.covtocorr` remains (with possibly a
better name) but remove `np.covcorr`, since this is easy for the user to
add.
My preference is option 2.
cov2corr is a useful function:
http://www.statsmodels.org/dev/generated/statsmodels.stats.moment_helpers.cov2corr.html
I also wrote the inverse function corr2cov, but AFAIR use it only in some
test cases.
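As a rough illustration of the inverse direction, a correlation matrix can be scaled back into a covariance matrix when the standard deviations are known. This is a sketch under stated assumptions, not the statsmodels implementation: the name `corr_to_cov` and its signature are hypothetical.

```python
import numpy as np

def corr_to_cov(corr, std):
    """Rebuild a covariance matrix from correlations and standard deviations
    (hypothetical helper)."""
    corr = np.asarray(corr, dtype=float)
    std = np.asarray(std, dtype=float)
    return corr * np.outer(std, std)   # cov[i, j] = corr[i, j] * std[i] * std[j]

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 100))          # 4 variables, 100 observations each
corr = np.corrcoef(x)
std = x.std(axis=1, ddof=1)            # ddof=1 matches the default normalization of np.cov
cov_rebuilt = corr_to_cov(corr, std)
```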
I don't think adding any of the options to corrcoef or covcorr is useful
since there is no computational advantage to it.
I'm not sure I agree with that statement. If a user wants to calculate both
a covariance and correlation matrix, they currently have two options:
A) Call np.cov and np.corrcoef separately, which takes at least twice as
long as one call to np.cov. For the data sets I am used to, a np.cov call
takes 5-10 seconds.
B) Call np.cov and then separately implement their own correlation matrix
code, which means the user isn't able to fully take advantage of code that
is already in numpy.
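To make option B concrete, here is a minimal sketch of what such user code might look like today, with a single call to np.cov whose result is reused for the correlation matrix. The helper name `cov_and_corr` is hypothetical and is not part of numpy or of the PR.

```python
import numpy as np

def cov_and_corr(x):
    """Return (cov, corr) for x using a single np.cov call (hypothetical helper)."""
    cov = np.atleast_2d(np.cov(x))             # the expensive pass over the data
    std = np.sqrt(np.diag(cov))                # standard deviations from the diagonal
    corr = cov / std[:, None] / std[None, :]   # cheap rescaling of the small matrix
    return cov, corr

rng = np.random.default_rng(2)
x = rng.normal(size=(5, 1000))                 # 5 variables, 1000 observations each
cov, corr = cov_and_corr(x)
```

The rescaling step only touches the (variables x variables) matrix, so its cost does not grow with the number of observations.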
In any case, I've updated the PR:
https://github.com/numpy/numpy/pull/8211
Relative to my original PR, it:
a) removes the numpy.covcorr function, which the user can easily implement
b) exposes numpy.cov2corr in the API (previously called numpy.covtocorr in
the PR), which accepts a pre-calculated covariance matrix
c) has numpy.corrcoef call numpy.cov2corr
_______________________________________________
NumPy-Discussion mailing list
https://mail.scipy.org/mailman/listinfo/numpy-discussion