sim_matrix

hyppo.independence.sim_matrix(model, x)

Computes the similarity matrix from a random forest.

The model must follow the scikit-learn API: it must be a random forest object that has already been trained (with fit) before this function is called, and it must provide an apply method, which is used to push the input data down the trained forest. See sklearn.ensemble.RandomForestClassifier and sklearn.ensemble.RandomForestRegressor for similar classes.

Parameters
  • model (class) -- A trained random forest object for which the similarity matrix will be calculated.

  • x (ndarray) -- Input data matrix. x must have shape (n, p), where n is the number of samples and p is the number of dimensions.

Returns

proxMat (ndarray) -- The proximity matrix induced by the random forest kernel.
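
Example

The sketch below illustrates the usual definition of a random forest proximity: the fraction of trees in which two samples land in the same leaf. It is an assumption that sim_matrix follows this definition; the page above does not spell out the computation. The helper name proximity_from_leaves and the leaf indices are hypothetical, and the input mimics the (n, n_trees) array that a fitted scikit-learn forest returns from apply.

```python
import numpy as np


def proximity_from_leaves(leaves):
    """Fraction of trees in which each pair of samples shares a leaf.

    `leaves` is an (n, n_trees) integer array of leaf indices, the shape
    that a scikit-learn forest's `apply` method returns.
    (Hypothetical helper, not part of hyppo.)
    """
    leaves = np.asarray(leaves)
    # Compare every pair of samples tree by tree: shape (n, n, n_trees).
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    # Average over trees to get an (n, n) matrix with entries in [0, 1].
    return same_leaf.mean(axis=2)


# Made-up leaf indices for 3 samples pushed down 2 trees.
leaves = np.array([[1, 4],
                   [1, 5],
                   [2, 5]])
prox = proximity_from_leaves(leaves)
print(prox)
```

With a trained forest, the leaf array would come from model.apply(x) rather than being written by hand. The result is symmetric with ones on the diagonal, since every sample shares a leaf with itself in every tree.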