
Concepts Sparsity Metrics

Concept sparsity metrics evaluate the sparsity of concept-space activations. Each metric is constructed with a concept_explainer; its compute method takes the latent_activations, maps them to concept_activations, and then measures how sparse those concept activations are.

from interpreto.concepts.metrics import MetricClass

metric = MetricClass(concept_explainer)
score = metric.compute(latent_activations)
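
For a fully self-contained illustration, the sketch below fakes the explainer. DummyExplainer is hypothetical and not part of interpreto; it only mimics the members the sparsity metrics actually touch (_sanitize_activations, encode_activations, and concept_model.nb_concepts), so a real fitted ConceptEncoderExplainer would take its place in practice.

import torch

from interpreto.concepts.metrics import Sparsity

class DummyExplainer:
    # Hypothetical stand-in, not an interpreto class: it only mimics the
    # members the sparsity metrics rely on.
    class _ConceptModel:
        nb_concepts = 8

    concept_model = _ConceptModel()

    def _sanitize_activations(self, latent_activations):
        return latent_activations  # a real explainer validates/splits here

    def encode_activations(self, latent_activations):
        # Fake concept encoding: a fixed random projection followed by a ReLU.
        torch.manual_seed(0)
        weights = torch.randn(latent_activations.shape[-1], self.concept_model.nb_concepts)
        return torch.relu(latent_activations @ weights)

latent_activations = torch.randn(16, 32)  # 16 tokens, 32 latent dimensions
metric = Sparsity(DummyExplainer(), epsilon=0.0)
score = metric.compute(latent_activations)  # fraction of active concept activations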

Sparsity

interpreto.concepts.metrics.Sparsity

Sparsity(concept_explainer, epsilon=0.0)


Evaluates the sparsity of the concept activations. The metric takes a concept_explainer and the latent_activations, computes the concept_activations, and then measures the sparsity of those concept activations.

The sparsity is defined as the fraction of concept activations whose magnitude exceeds \(\epsilon\):

$$ \frac{1}{|X| \cdot cpt} \sum_{x \in X} \sum_{i=1}^{cpt} \mathbb{1}\left( \left| t(h(x))_i \right| > \epsilon \right) $$

where \(h\) maps each input \(x\) to its latent activations, \(t\) encodes latents into concept space, and \(cpt\) is the number of concepts.
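
As a toy check of what the indicator counts (plain torch, illustrative numbers):

import torch

t_hx = torch.tensor([
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 0.2, 0.0],
    [0.5, 0.0, 0.0, 0.0],
])  # concept activations for 3 inputs and cpt = 4 concepts

epsilon = 0.0
active = torch.abs(t_hx) > epsilon       # indicator 1(|t(h(x))_i| > epsilon)
print(active.float().mean().item())      # 3 active entries out of 12 -> 0.25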

Attributes:

    concept_explainer (ConceptEncoderExplainer): The explainer used to compute concepts.
    epsilon (float): The threshold used to compute the sparsity.

Source code in interpreto/concepts/metrics/sparsity_metrics.py
def __init__(self, concept_explainer: ConceptEncoderExplainer, epsilon: float = 0.0):
    self.concept_explainer = concept_explainer
    self.epsilon = epsilon

compute

Compute the sparsity score from the given latent activations.

Parameters:

    latent_activations (LatentActivations | dict[str, LatentActivations], required): The latent activations to evaluate.

Returns:

    float: The sparsity score.

Source code in interpreto/concepts/metrics/sparsity_metrics.py
def compute(self, latent_activations: LatentActivations | dict[str, LatentActivations]) -> float:
    """Compute the metric.

    Args:
        latent_activations (LatentActivations | dict[str, LatentActivations]): The latent activations.

    Returns:
        float: The metric.
    """
    # Validate the input and extract the activations for the explainer's split.
    split_latent_activations: LatentActivations = self.concept_explainer._sanitize_activations(latent_activations)

    # Project the latent activations into the concept space.
    concepts_activations: ConceptsActivations = self.concept_explainer.encode_activations(split_latent_activations)

    # Fraction of concept activations whose magnitude exceeds epsilon.
    return torch.mean(torch.abs(concepts_activations) > self.epsilon, dtype=torch.float32).item()

Sparsity Ratio

interpreto.concepts.metrics.SparsityRatio

SparsityRatio(concept_explainer, epsilon=0.0)

Bases: Sparsity


Evaluates the sparsity ratio of the concept activations. The metric takes a concept_explainer and the latent_activations, computes the concept_activations, and then measures the sparsity ratio of those concept activations.

With \(A\) the latent activations obtained through \(A = h(X)\), the sparsity ratio is the sparsity normalized by the number of concepts:

$$ \mathrm{SparsityRatio}(A) = \frac{\mathrm{Sparsity}(A)}{cpt} $$
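
For example, with illustrative numbers, a Sparsity score of \(0.25\) over \(cpt = 8\) concepts gives a ratio of \(0.25 / 8 = 0.03125\).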

Attributes:

    concept_explainer (ConceptEncoderExplainer): The explainer used to compute concepts.
    epsilon (float): The threshold used to compute the sparsity.

Source code in interpreto/concepts/metrics/sparsity_metrics.py
def __init__(self, concept_explainer: ConceptEncoderExplainer, epsilon: float = 0.0):
    self.concept_explainer = concept_explainer
    self.epsilon = epsilon

compute

Compute the sparsity ratio from the given latent activations.

Parameters:

    latent_activations (LatentActivations | dict[str, LatentActivations], required): The latent activations to evaluate.

Returns:

    float: The sparsity ratio.

Source code in interpreto/concepts/metrics/sparsity_metrics.py
def compute(self, latent_activations: LatentActivations | dict[str, LatentActivations]) -> float:
    """Compute the metric.

    Args:
        latent_activations (LatentActivations | dict[str, LatentActivations]): The latent activations.

    Returns:
        float: The metric.
    """
    # Sparsity of the concept activations (fraction above epsilon)...
    sparsity = super().compute(latent_activations)
    # ...normalized by the number of concepts.
    return sparsity / self.concept_explainer.concept_model.nb_concepts
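
Continuing the hypothetical DummyExplainer sketch from the top of the page, the ratio variant is used identically:

from interpreto.concepts.metrics import SparsityRatio

ratio_metric = SparsityRatio(DummyExplainer(), epsilon=0.0)
ratio = ratio_metric.compute(latent_activations)  # Sparsity score / nb_concepts (8 here)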