Hypothesis testing
For examples and interpretation, see my notebook on hypothesis test explanations.
gshap.hypothesis.HypothesisTest
class gshap.hypothesis.HypothesisTest(test, bootstrap_samples=1000) [source]
This class measures how likely a hypothesis is to be true of an output vector. It uses a bootstrap analysis to compute the probability that a hypothesis is true of a population from a sample output vector.
Parameters: | test : callable
Takes an output vector and returns a boolean indicator that the hypothesis is true of the output vector. This will usually involve computing a sample statistic of the output vector, then returning an indicator that the sample statistic fell within a certain range. bootstrap_samples : int, default=1000Number of bootstrap samples for hypothesis testing. |
---|---|
Attributes: | test : callable
Set from the
Set from the |
Examples
import gshap
from gshap.hypothesis import HypothesisTest
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
X, y = load_diabetes(return_X_y=True)
reg = Lasso(alpha=.1).fit(X, y)
test = lambda y_pred: y_pred.mean() > 155
g = HypothesisTest(test, bootstrap_samples=100)
explainer = gshap.KernelExplainer(reg.predict, X, g)
# artifically select a sample which with higher-than-average y
explainer.gshap_values(X[y > 70], nsamples=100)
Out:
array([-0.0069, 0.0253, 0.2572, 0.1112, -0.0108, -0.0105, 0.0317,
0.0009, 0.1415, 0.0071])
Methods
__call__(self, output) [source]
Computes the probablity of the hypothesis being true of the population from which the sample was drawn.
Parameters: | output : numpy.array
(# observations, # targets) vector of model outputs. |
---|---|
Returns: | probability : scalar between 0 and 1
Probability that the hypothesis is true of the population from which the sample was drawn. |