# Tutorials¶

## Independence Tests¶

The independence testing problem is generalized as follows: consider random variables $$X$$ and $$Y$$ that have joint density $$F_{XY} = F_{X|Y} F_Y$$. We are testing:

$\begin{split}H_0: F_{XY} &= F_X F_Y \\ H_A: F_{XY} &\neq F_X F_Y\end{split}$

These tutorials overview how to use these tests as well as benchmarks comparing the algorithms included against each other.

## K-sample Tests¶

The k-sample testing problem is generalized as follows: consider random variables $$X_1, X_2, \ldots, X_k$$ that have densities $$F_1, F_2, \ldots, F_k$$. Then, we are testing

$\begin{split}H_0:\ &F_1 = F_2 = \ldots F_k \\ H_A:\ &\exists \ j \neq j' \text{ s.t. } F_j \neq F_{j'}\end{split}$

This tutorial overview how to use k-sample tests in hyppo.

## Time-Series Tests¶

Time-series tests of independence consider the following problem: consider random variables $$X$$ and $$Y$$ with joint density $$F_{XY}$$ and marginal densities $$F_X$$ and $$F_Y$$. Let $$F_{X_t}$$, $$F_{Y_s}$$, and $$F_{X_t Y_s}$$ represent the marginal and joint distributions of time-indexed random varlables $$X_t$$ and $$Y_s$$ at timesteps $$t$$ and $$s$$. Let $$\{ (X_t, Y_t) \}_{t = -\infty}^\infty$$ be a full jointly-sampled strictly stationary time series with the observed sample $$\{ (X_1, Y_1), \ldots (X_n, Y_n) \}$$. Choose some nonnegative integer $$M$$ as the maximium lag hyperparamater. Then we are testing,

$\begin{split}H_0: F_{X_t Y_{t - j}} &= F_{X_t} F_{Y_{t - j}} \text{ for each } j \in \{ 0, 1, \ldots, M \} \\ H_A: F_{X_t Y_{t - j}} &\neq F_{X_t} F_{Y_{t - j}} \text{ for some } j \in \{ 0, 1, \ldots, M \}\end{split}$

This tutorial overview how to use time_series based tests in hyppo.

## Sims¶

To evaluate existing implmentations and benchmark against other packages, we have developed a suite of 20 dependency structures. The simulation settings include polynomial (linear, quadratic, cubic), trigonometric (sinusoidal, circular, ellipsoidal, spiral), geometric (square, diamond, w-shaped), and other functions. We also include 3 sample Gaussian simulations as well, which are sampled from multivariate normal distribusions.