Deletion
Bases: InsertionDeletionBase
The deletion metric measures the quality of an attribution method by evaluating how much the prediction score of a model drops when the most important elements of a sequence are gradually removed. The importance of the elements is determined by the attribution method under evaluation.
A curve is built by computing the prediction score while iteratively masking the most important elements, starting from the whole, unperturbed sequence. The scores are the softmax outputs, bounded between 0 and 1. The area under this curve (AUC) is then computed to quantify the quality of the attribution method. A lower AUC is better.
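To make the construction concrete, here is a minimal, library-independent sketch of a deletion curve and its AUC. It is an illustration rather than interpreto's implementation; `predict_proba` (a callable returning softmax scores for a list of elements) and `mask_token` are hypothetical placeholders:

```python
import numpy as np

def deletion_curve(predict_proba, tokens, scores, target, mask_token="[MASK]"):
    """Softmax score of `target` while masking elements by decreasing attribution score."""
    order = np.argsort(scores)[::-1]            # most important elements first
    masked = list(tokens)
    curve = [predict_proba(masked)[target]]     # step 0: full, unperturbed sequence
    for idx in order:
        masked[idx] = mask_token                # delete the next most important element
        curve.append(predict_proba(masked)[target])
    return np.asarray(curve)

def deletion_auc(curve):
    """Trapezoidal area under the curve, with the x-axis normalized to [0, 1]."""
    x = np.linspace(0.0, 1.0, len(curve))
    return float(np.sum((curve[1:] + curve[:-1]) / 2 * np.diff(x)))
```

Because the softmax scores are bounded by 1 and the x-axis is normalized, the AUC lies between 0 and 1; it stays low when the curve drops quickly, i.e. when the attribution ranking is informative.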
The evaluate method returns both:
- the average AUC across all sequences and targets,
- for each sequence-target pair, the softmax scores associated with the successive deletions. The softmax scores are preferred over logits as they are bounded between 0 and 1, which makes the AUC more interpretable.
An attribution method is considered good if the AUC is low, meaning that the model's prediction score decreases significantly as the most important elements are removed from the sequence. Conversely, a high AUC indicates that the attribution method is not effective in identifying the most important elements for the model's prediction.
This metric only evaluates the order of importance of the elements in the sequence, not their actual values.
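Since only the ranking matters, a simple sanity check is to compare the AUC obtained with the attribution scores against the AUC obtained with a random ordering of the same elements. A sketch, reusing the hypothetical helpers from the example above:

```python
rng = np.random.default_rng(0)

auc_attribution = deletion_auc(deletion_curve(predict_proba, tokens, scores, target))
auc_random = deletion_auc(
    deletion_curve(predict_proba, tokens, rng.permutation(len(tokens)), target)
)

# An informative attribution method should give auc_attribution < auc_random
# on average (over several sequences and random orderings).
```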
Examples:
>>> from interpreto.attributions.metrics import Deletion
>>>
>>> # Get explanations from an attribution method
>>> explainer = Method(model, tokenizer, kwargs)
>>> explanations = explainer(inputs, targets)
>>>
>>> # Run the deletion metric
>>> metric = Deletion(model, tokenizer, n_perturbations=100)
>>> auc, metric_scores = metric.evaluate(explanations)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | PreTrainedModel | model used to generate explanations | required |
| tokenizer | PreTrainedTokenizer | Hugging Face tokenizer associated with the model | required |
| batch_size | int | batch size for the inference of the metric | 4 |
| granularity | Granularity | granularity level of the perturbations (token, word, sentence, etc.) | required |
| device | device | device on which the attribution method will be run | None |
| n_perturbations | int | number of perturbations from which the metric will be computed (i.e. the number of steps over which the AUC is computed) | 100 |
|  | float | maximum percentage of elements in the sequence to be perturbed. Defaults to 1.0, meaning that all elements can be perturbed. If set to 0.5, only the 50% most important elements are perturbed, which avoids perturbing too many low-score elements in long sequences. | 1.0 |