
Concepts Sparsity Metrics

Concept sparsity metrics evaluate the sparsity of concept-space activations. Each metric is constructed with a concept_explainer; its compute method takes the latent_activations, maps them to concept_activations, and then measures how sparse those concept activations are.

from interpreto.concepts.metrics import MetricClass

metric = MetricClass(concept_explainer)
score = metric.compute(latent_activations)
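
For a fully self-contained illustration, the sketch below fakes the explainer. DummyExplainer is hypothetical and not part of interpreto; it only mimics the members the sparsity metrics actually touch (_sanitize_activations, encode_activations, and concept_model.nb_concepts), so a real fitted ConceptEncoderExplainer would take its place in practice.

import torch

from interpreto.concepts.metrics import Sparsity

class DummyExplainer:
    # Hypothetical stand-in, not an interpreto class: it only mimics the
    # members the sparsity metrics rely on.
    class _ConceptModel:
        nb_concepts = 8

    concept_model = _ConceptModel()

    def _sanitize_activations(self, latent_activations):
        return latent_activations  # a real explainer validates/splits here

    def encode_activations(self, latent_activations):
        # Fake concept encoding: a fixed random projection followed by a ReLU.
        torch.manual_seed(0)
        weights = torch.randn(latent_activations.shape[-1], self.concept_model.nb_concepts)
        return torch.relu(latent_activations @ weights)

latent_activations = torch.randn(16, 32)  # 16 tokens, 32 latent dimensions
metric = Sparsity(DummyExplainer(), epsilon=0.0)
score = metric.compute(latent_activations)  # fraction of active concept activations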

Sparsity

interpreto.concepts.metrics.Sparsity

Sparsity(concept_explainer, epsilon=0.0)


Evaluates the sparsity of the concept activations. The metric takes a concept_explainer and the latent_activations, computes the concept_activations, and then measures the sparsity of those concept activations.

The sparsity is defined as the fraction of concept activations whose magnitude exceeds \(\epsilon\):

$$ \frac{1}{|X| \cdot cpt} \sum_{x \in X} \sum_{i=1}^{cpt} \mathbb{1}\left( \left| t(h(x))_i \right| > \epsilon \right) $$

where \(h\) maps each input \(x\) to its latent activations, \(t\) encodes latents into concept space, and \(cpt\) is the number of concepts.
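
As a toy check of what the indicator counts (plain torch, illustrative numbers):

import torch

t_hx = torch.tensor([
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 0.2, 0.0],
    [0.5, 0.0, 0.0, 0.0],
])  # concept activations for 3 inputs and cpt = 4 concepts

epsilon = 0.0
active = torch.abs(t_hx) > epsilon       # indicator 1(|t(h(x))_i| > epsilon)
print(active.float().mean().item())      # 3 active entries out of 12 -> 0.25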

Attributes:

    concept_explainer (ConceptEncoderExplainer): The explainer used to compute concepts.
    epsilon (float): The threshold used to compute the sparsity.

Source code in interpreto/concepts/metrics/sparsity_metrics.py
def __init__(self, concept_explainer: ConceptEncoderExplainer, epsilon: float = 0.0):
    self.concept_explainer = concept_explainer
    self.epsilon = epsilon

compute

Compute the sparsity score from the given latent activations.

Parameters:

    latent_activations (LatentActivations | dict[str, LatentActivations], required): The latent activations to evaluate.

Returns:

    float: The sparsity score.

Source code in interpreto/concepts/metrics/sparsity_metrics.py
def compute(self, latent_activations: LatentActivations | dict[str, LatentActivations]) -> float:
    """Compute the metric.

    Args:
        latent_activations (LatentActivations | dict[str, LatentActivations]): The latent activations.

    Returns:
        float: The metric.
    """
    # Validate the input and extract the activations for the explainer's split.
    split_latent_activations: LatentActivations = self.concept_explainer._sanitize_activations(latent_activations)

    # Project the latent activations into the concept space.
    concepts_activations: ConceptsActivations = self.concept_explainer.encode_activations(split_latent_activations)

    # Fraction of concept activations whose magnitude exceeds epsilon.
    return torch.mean(torch.abs(concepts_activations) > self.epsilon, dtype=torch.float32).item()

Sparsity Ratio

interpreto.concepts.metrics.SparsityRatio

SparsityRatio(concept_explainer, epsilon=0.0)

Bases: Sparsity


Evaluates the sparsity ratio of the concept activations. The metric takes a concept_explainer and the latent_activations, computes the concept_activations, and then measures the sparsity ratio of those concept activations.

With \(A\) the latent activations obtained through \(A = h(X)\), the sparsity ratio is the sparsity normalized by the number of concepts:

$$ \mathrm{SparsityRatio}(A) = \frac{\mathrm{Sparsity}(A)}{cpt} $$
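
For example, with illustrative numbers, a Sparsity score of \(0.25\) over \(cpt = 8\) concepts gives a ratio of \(0.25 / 8 = 0.03125\).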

Attributes:

    concept_explainer (ConceptEncoderExplainer): The explainer used to compute concepts.
    epsilon (float): The threshold used to compute the sparsity.

Source code in interpreto/concepts/metrics/sparsity_metrics.py
def __init__(self, concept_explainer: ConceptEncoderExplainer, epsilon: float = 0.0):
    self.concept_explainer = concept_explainer
    self.epsilon = epsilon

compute

Compute the sparsity ratio from the given latent activations.

Parameters:

    latent_activations (LatentActivations | dict[str, LatentActivations], required): The latent activations to evaluate.

Returns:

    float: The sparsity ratio.

Source code in interpreto/concepts/metrics/sparsity_metrics.py
def compute(self, latent_activations: LatentActivations | dict[str, LatentActivations]) -> float:
    """Compute the metric.

    Args:
        latent_activations (LatentActivations | dict[str, LatentActivations]): The latent activations.

    Returns:
        float: The metric.
    """
    # Sparsity of the concept activations (fraction above epsilon)...
    sparsity = super().compute(latent_activations)
    # ...normalized by the number of concepts.
    return sparsity / self.concept_explainer.concept_model.nb_concepts
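
Continuing the hypothetical DummyExplainer sketch from the top of the page, the ratio variant is used identically:

from interpreto.concepts.metrics import SparsityRatio

ratio_metric = SparsityRatio(DummyExplainer(), epsilon=0.0)
ratio = ratio_metric.compute(latent_activations)  # Sparsity score / nb_concepts (8 here)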