Gradient Shap

Bases: MultitaskExplainerMixin, AttributionExplainer

GradientSHAP is a gradient-based Shapley value estimator that computes attributions by integrating model gradients along a path between a baseline (reference) and the input. It approximates Shapley values by averaging multiple stochastic integrated gradients across randomly sampled paths.

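By combining ideas from Integrated Gradients and Shapley value theory, GradientSHAP provides additive feature attributions (the scores approximately sum to the difference between the model output at the input and at the baseline) with strong consistency guarantees, while capturing non-linear effects.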
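The estimator described above can be sketched on a toy function in plain Python. This is a minimal illustration, not the library's implementation: `gradient_shap`, `grad_fn`, and `toy_grad` are hypothetical names, the noise is added to the baseline (following the `noise_std` parameter description), and each sample uses one uniformly drawn point on the straight path from the noisy baseline to the input.

```python
import random


def gradient_shap(grad_fn, x, baseline=0.0, n_perturbations=10, noise_std=0.1, seed=0):
    """Toy GradientSHAP estimate for a scalar function whose gradient is `grad_fn`."""
    rng = random.Random(seed)
    attributions = [0.0] * len(x)
    for _ in range(n_perturbations):
        # Noisy baseline: the reference value plus Gaussian noise (assumption).
        noisy_baseline = [baseline + rng.gauss(0.0, noise_std) for _ in x]
        # Random position on the straight path from the baseline to the input.
        alpha = rng.random()
        point = [b + alpha * (xi - b) for xi, b in zip(x, noisy_baseline)]
        grads = grad_fn(point)
        # Gradient times (input - baseline), accumulated as a running mean.
        for i, (g, xi, b) in enumerate(zip(grads, x, noisy_baseline)):
            attributions[i] += g * (xi - b) / n_perturbations
    return attributions


# Hypothetical toy model f(x) = 2*x0 + x0*x1 and its analytic gradient.
def toy_grad(p):
    return [2.0 + p[1], p[0]]


x = [1.0, 3.0]
attr = gradient_shap(toy_grad, x, baseline=0.0, n_perturbations=2000, noise_std=0.0)
# Additivity: sum(attr) should be close to f(x) - f(baseline) = 5 - 0.
```

With `noise_std=0.0` the estimate reduces to a Monte Carlo version of Integrated Gradients, so the attributions sum approximately to the output difference. A larger `n_perturbations` reduces the variance of the estimate at the cost of more gradient evaluations.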

Reference: Lundberg and Lee (2017). A Unified Approach to Interpreting Model Predictions.

Examples:

>>> from interpreto import GradientShap
>>> # `model` and `tokenizer` are an already loaded Hugging Face model and its tokenizer
>>> method = GradientShap(model, tokenizer, batch_size=4,
...                       n_perturbations=20,
...                       baseline=0,
...                       noise_std=0.1)
>>> explanations = method(text)

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `PreTrainedModel` | Model to explain. | required |
| `tokenizer` | `PreTrainedTokenizer` | Hugging Face tokenizer associated with the model. | required |
| `batch_size` | `int` | Batch size for the attribution method. | `4` |
| `granularity` | `Granularity` | Level of granularity for the explanation. Options are `ALL_TOKENS`, `TOKEN`, `WORD`, or `SENTENCE`. To obtain a value, use `from interpreto import Granularity`, then `Granularity.WORD`. | `WORD` |
| `granularity_aggregation_strategy` | `GranularityAggregationStrategy` | How to aggregate token-level attributions into granularity scores. Options are `MEAN`, `MAX`, `MIN`, `SUM`, and `SIGNED_MAX`. Ignored when `granularity` is `ALL_TOKENS` or `TOKEN`. | `MEAN` |
| `device` | `device` | Device on which the attribution method will be run. | `None` |
| `inference_mode` | `Callable[[Tensor], Tensor]` | Mode used for inference: one of `LOGITS`, `SOFTMAX`, or `LOG_SOFTMAX`. Use `InferenceModes` to choose the appropriate mode. | `LOGITS` |
| `input_x_gradient` | `bool` | If `True`, multiplies the input embeddings by their gradients before aggregation. | `True` |
| `n_perturbations` | `int` | Number of interpolations to generate. | `10` |
| `baseline` | `Tensor \| float \| None` | Baseline to use for the interpolations. | `None` |
| `noise_std` | `float` | Standard deviation of the noise added to the baseline. | `0.1` |
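To illustrate what `granularity_aggregation_strategy` does, the sketch below groups token-level scores by word and reduces each group. It is an assumption-laden illustration rather than the library's code: `aggregate` and `word_ids` are hypothetical names, and `SIGNED_MAX` is interpreted here as "the score with the largest absolute value, sign preserved".

```python
def aggregate(token_scores, word_ids, strategy="MEAN"):
    """Reduce token-level scores to one score per word (hypothetical helper)."""
    # Group the scores of tokens that belong to the same word.
    groups = {}
    for score, wid in zip(token_scores, word_ids):
        groups.setdefault(wid, []).append(score)

    reducers = {
        "MEAN": lambda v: sum(v) / len(v),
        "MAX": max,
        "MIN": min,
        "SUM": sum,
        # Assumed meaning: keep the value with the largest magnitude, with its sign.
        "SIGNED_MAX": lambda v: max(v, key=abs),
    }
    reduce_fn = reducers[strategy]
    # Return the word scores in word order.
    return [reduce_fn(groups[wid]) for wid in sorted(groups)]


# Three tokens; the first two belong to word 0, the last one to word 1.
token_scores = [0.2, -0.6, 0.1]
word_ids = [0, 0, 1]
signed_max = aggregate(token_scores, word_ids, "SIGNED_MAX")
```

Under these assumptions, `SIGNED_MAX` keeps a strongly negative token from being hidden by a weakly positive one, whereas `MAX` would report only the positive score for that word.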
