PartialCorr
- class hyppo.conditional.PartialCorr(**kwargs)
Conditional Pearson's correlation test.
Partial correlation is a measure of the association between two univariate variables given a third univariate variable.
- Parameters
compute_distance (str, callable, or None, default: "euclidean") -- A function that computes the distance among the samples within each data matrix. Valid strings for compute_distance are, as defined in sklearn.metrics.pairwise_distances:
From scikit-learn: ["euclidean", "cityblock", "cosine", "l1", "l2", "manhattan"]
From scipy.spatial.distance: ["braycurtis", "canberra", "chebyshev", "correlation", "dice", "hamming", "jaccard", "kulsinski", "mahalanobis", "minkowski", "rogerstanimoto", "russellrao", "seuclidean", "sokalmichener", "sokalsneath", "sqeuclidean", "yule"]
See the documentation for scipy.spatial.distance for details on these metrics.
Set to None or "precomputed" if x and y are already distance matrices. To call a custom function, either create the distance matrix beforehand or create a function of the form metric(x, **kwargs), where x is the data matrix for which pairwise distances are calculated and **kwargs are extra arguments to send to your custom function.
**kwargs -- Arbitrary keyword arguments for compute_distance.
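As an illustration of the metric(x, **kwargs) form described above, here is a minimal sketch of a custom distance function. The name weighted_manhattan and its weights keyword are hypothetical and not part of hyppo; the sketch assumes NumPy is available and that x arrives as an (n, p) array.

```python
import numpy as np


def weighted_manhattan(x, **kwargs):
    """Hypothetical custom metric: pairwise weighted L1 distances between rows of x."""
    x = np.asarray(x, dtype=float)
    # `weights` is an illustrative keyword argument (an assumption, not a hyppo option).
    weights = np.asarray(kwargs.get("weights", np.ones(x.shape[1])), dtype=float)
    # Broadcast to an (n, n, p) array of absolute differences, then sum over features.
    diffs = np.abs(x[:, None, :] - x[None, :, :])
    return (diffs * weights).sum(axis=-1)


x = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]])
dist = weighted_manhattan(x, weights=[1.0, 0.5])  # (3, 3) symmetric matrix
```

A function of this shape could presumably be supplied as compute_distance, with weights forwarded through **kwargs; treat that wiring as an assumption and check it against the installed version.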
Notes
The statistic is computed as follows:
\[r_{x, y ; z} = \frac{\rho_{xy} - \rho_{xz} \rho_{yz}}{\sqrt{(1 - \rho_{xz}^2)(1 - \rho_{yz}^2)}}\]
where \(\rho_{xy}\) is the Pearson correlation coefficient between \(x\) and \(y\), and \(\rho_{xz}\) and \(\rho_{yz}\) are defined analogously. The partial correlation test is implemented as a t-test [legendre2000].
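The formula can be checked by hand with a short standard-library sketch. The helper names pearson, partial_corr and t_statistic are illustrative and are not hyppo functions. The t-statistic uses the textbook form with n - 3 degrees of freedom for a single conditioning variable, which is an assumption about how the t-test is set up rather than a statement of hyppo's internals.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Partial correlation of x and y given z, following the formula above."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def t_statistic(r, n):
    """Textbook t-statistic for a partial correlation with one conditioning
    variable (n - 3 degrees of freedom); assumed, not taken from hyppo."""
    return r * math.sqrt((n - 3) / (1 - r**2))


# Small illustrative data set (made up for this sketch).
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
y = [0.8, 2.3, 2.9, 4.1, 4.8, 6.3]

r = partial_corr(x, y, z)
t = t_statistic(r, len(x))
```

When z is uncorrelated with both x and y, the correction terms vanish and the partial correlation reduces to the ordinary Pearson correlation between x and y.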
References
Methods Summary
| PartialCorr.statistic | Helper function that calculates the partial correlation test statistic. |
| PartialCorr.test | Calculates the partial correlation test statistic and p-value. |
- PartialCorr.statistic(x, y, z)
Helper function that calculates the partial correlation test statistic.
- Parameters
x,y,z (ndarray of float) -- Input data matrices. x, y and z must have the same number of samples. That is, the shapes must be (n, p), (n, q) and (n, r) where n is the number of samples and p, q, and r are the number of dimensions. Alternatively, x and y can be distance matrices and z can be a similarity matrix where the shapes must be (n, n).
- Returns
stat (float) -- The computed partial correlation test statistic.
- PartialCorr.test(x, y, z, reps=1000, workers=1, auto=True, perm_blocks=None, random_state=None)
Calculates the partial correlation test statistic and p-value.
- Parameters
x,y,z (ndarray of float) -- Input data matrices. x, y and z must have the same number of samples. That is, the shapes must be (n, 1), (n, 1) and (n, 1) where n is the number of samples.
reps (int, default: 1000) -- The number of replications used to estimate the null distribution when the p-value is calculated with a permutation test.
workers (int, default: 1) -- The number of cores to parallelize the p-value computation over. Supply -1 to use all cores available to the process.
auto (bool, default: True) -- If True, the p-value is computed by t-distribution approximation. Parameters reps and workers are irrelevant in this case. Otherwise, hyppo.tools.perm_test will be run.
perm_blocks (None or ndarray, default: None) -- Defines blocks of exchangeable samples during the permutation test. If None, all samples can be permuted with one another. Requires n rows. At each column, samples with matching column value are recursively partitioned into blocks of samples. Within each final block, samples are exchangeable. Blocks of samples from the same partition are also exchangeable between one another. If a column value is negative, that block is fixed and cannot be exchanged.
random_state (int, default: None) -- Fixes the random state used in permutation testing, for reproducibility.
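To make the roles of reps and random_state concrete, the following standard-library sketch estimates a permutation p-value for the partial correlation. It is a deliberately naive scheme that shuffles y only, and it is not the algorithm used by hyppo.tools.perm_test; the helpers pearson, partial_corr and perm_pvalue are hypothetical names, and block-restricted permutation (perm_blocks) is not modelled.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Partial correlation of x and y given z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def perm_pvalue(x, y, z, reps=1000, random_state=None):
    """Naive two-sided permutation p-value (illustrative, not hyppo's method)."""
    rng = random.Random(random_state)  # a fixed seed makes the result reproducible
    observed = abs(partial_corr(x, y, z))
    y_perm = list(y)
    hits = 0
    for _ in range(reps):
        rng.shuffle(y_perm)  # break the pairing between y and (x, z)
        if abs(partial_corr(x, y_perm, z)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (1 + hits) / (1 + reps)


# Small illustrative data set (made up for this sketch).
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
x = [0.5, 1.7, 2.1, 3.9, 4.2, 6.1, 6.8, 8.3, 8.9, 10.4]
y = [1.2, 1.5, 3.3, 3.6, 5.4, 5.5, 7.6, 7.7, 9.8, 9.5]

p_value = perm_pvalue(x, y, z, reps=200, random_state=0)
```

More replications give a finer-grained p-value estimate at higher cost, and the smallest attainable value is 1 / (reps + 1).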
- Returns