Concepts Reconstruction Metrics¶

Concept reconstruction error measures how faithful the concept space is to the latent space of the explained model. To do so, these metrics compute the distance between the original activations and the activations reconstructed from the concepts.

from interpreto.concepts.metrics import MetricClass

metric = MetricClass(concept_explainer)
score = metric.compute(activations)

MSE¶

interpreto.concepts.metrics.MSE ¶

MSE(concept_explainer)

Bases: ReconstructionError

Code concepts/metrics/reconstruction_metrics.py

Evaluates whether the information reconstructed by the concept autoencoder corresponds to the original latent activations. It is a faithfulness metric, also known as the reconstruction error, computed in the latent activation space with the Euclidean distance.

With $A$ the latent activations obtained through $A = h(X)$, and $t$ and $t^{-1}$ the concept encoder and decoder, the MSE is defined as:

$$ \sum_{a \in A} \lVert t^{-1}(t(a)) - a \rVert_2 $$
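The formula above can be traced by hand on a toy example. The following is a minimal sketch, not the library's implementation: `toy_encode` and `toy_decode` are hypothetical stand-ins for $t$ and $t^{-1}$, activations are plain Python lists, and the per-sample norms are summed exactly as the formula is written. How `DistanceFunctions.EUCLIDEAN` aggregates over samples in practice is not stated here and may differ.

```python
import math


def l2_norm(vector):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(v * v for v in vector))


def toy_encode(a):
    # Hypothetical lossy encoder t: keep only the first two coordinates.
    return a[:2]


def toy_decode(z, dim):
    # Hypothetical decoder t^{-1}: pad the dropped coordinates with zeros.
    return list(z) + [0.0] * (dim - len(z))


def summed_l2_reconstruction_error(activations):
    """Sum over samples of ||t^{-1}(t(a)) - a||_2, following the formula as written."""
    total = 0.0
    for a in activations:
        a_hat = toy_decode(toy_encode(a), len(a))
        residual = [x_hat - x for x_hat, x in zip(a_hat, a)]
        total += l2_norm(residual)
    return total


activations = [[1.0, 2.0, 3.0], [0.5, -1.0, 4.0], [2.0, 0.0, 0.0]]
error = summed_l2_reconstruction_error(activations)
print(error)  # 3.0 + 4.0 + 0.0 = 7.0
```

Only the third coordinate is lost by the toy encoder, so each sample contributes the absolute value of its third coordinate to the error.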

Attributes:

Name	Type	Description
`concept_explainer`	`ConceptAutoEncoderExplainer`	The explainer used to compute concepts.

Source code in interpreto/concepts/metrics/reconstruction_metrics.py

def __init__(
    self,
    concept_explainer: ConceptAutoEncoderExplainer,
):
    super().__init__(
        concept_explainer=concept_explainer,
        reconstruction_space=ReconstructionSpaces.LATENT_ACTIVATIONS,
        distance_function=DistanceFunctions.EUCLIDEAN,
    )

compute ¶

compute(latent_activations)

Compute the reconstruction error.

Parameters:

Name	Type	Description	Default
`latent_activations` ¶	`LatentActivations \| dict[str, LatentActivations]`	The latent activations to use for the computation.	required

Returns:

Name	Type	Description
`float`	`float`	The reconstruction error.

Source code in interpreto/concepts/metrics/reconstruction_metrics.py

def compute(self, latent_activations: LatentActivations | dict[str, LatentActivations]) -> float:
    """Compute the reconstruction error.

    Args:
        latent_activations (LatentActivations | dict[str, LatentActivations]): The latent activations to use for the computation.

    Returns:
        float: The reconstruction error.
    """
    split_latent_activations: LatentActivations = self.concept_explainer._sanitize_activations(latent_activations)

    concepts_activations: ConceptsActivations = self.concept_explainer.encode_activations(split_latent_activations)

    reconstructed_latent_activations: LatentActivations = self.concept_explainer.decode_concepts(
        concepts_activations
    )

    if self.reconstruction_space is ReconstructionSpaces.LATENT_ACTIVATIONS:
        return self.distance_function(split_latent_activations, reconstructed_latent_activations).item()

    raise NotImplementedError("Only LATENT_ACTIVATIONS reconstruction space is supported.")

FID¶

interpreto.concepts.metrics.FID ¶

FID(concept_explainer)

Bases: ReconstructionError

Code concepts/metrics/reconstruction_metrics.py

Evaluates whether the information reconstructed by the concept autoencoder corresponds to the original latent activations. It is a faithfulness metric that measures whether the distribution of the reconstructed activations matches the original distribution. It is computed in the latent activation space with the 1D Wasserstein distance.

This metric was introduced by Fel et al. (2023)¹

With $A$ the latent activations obtained through $A = h(X)$, $t$ and $t^{-1}$ the concept encoder and decoder, and $\mathcal{W}_1$ the 1-Wasserstein distance, the FID is defined as:

\[ \mathcal{W}_1(A, t^{-1}(t(A))) \]
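To make the distribution-matching idea concrete, here is a minimal sketch of the 1-Wasserstein distance between two one-dimensional empirical samples of equal size, where it reduces to the mean absolute difference between sorted values. The function name `wasserstein_1d` and the sample values are illustrative assumptions; how the library's `DistanceFunctions.WASSERSTEIN_1D` handles multi-dimensional activations is not covered here.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equally sized 1D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized samples")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    # For equal-size samples, the optimal transport plan matches sorted values.
    return sum(abs(x - y) for x, y in zip(xs_sorted, ys_sorted)) / len(xs_sorted)


original = [0.0, 1.0, 3.0, 2.0]
reconstructed = [2.5, 0.5, 1.5, 3.5]

distance = wasserstein_1d(original, reconstructed)
print(distance)  # 0.5

# A pure reordering of the samples leaves the distribution unchanged.
shuffled_distance = wasserstein_1d(original, list(reversed(original)))
print(shuffled_distance)  # 0.0
```

Unlike a pointwise distance, this quantity compares the two sets of values as distributions, so it is zero whenever the reconstructed values are a permutation of the original ones.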

¹ Fel, T., Boutin, V., Béthune, L., Cadène, R., Moayeri, M., Andéol, L., Chalvidal, M., & Serre, T. A holistic approach to unifying automatic concept extraction and concept importance estimation. Advances in Neural Information Processing Systems, 2023.

Attributes:

Name	Type	Description
`concept_explainer`	`ConceptAutoEncoderExplainer`	The explainer used to compute concepts.

Source code in interpreto/concepts/metrics/reconstruction_metrics.py

def __init__(
    self,
    concept_explainer: ConceptAutoEncoderExplainer,
):
    super().__init__(
        concept_explainer=concept_explainer,
        reconstruction_space=ReconstructionSpaces.LATENT_ACTIVATIONS,
        distance_function=DistanceFunctions.WASSERSTEIN_1D,
    )

compute ¶

compute(latent_activations)

Compute the reconstruction error.

Parameters:

Name	Type	Description	Default
`latent_activations` ¶	`LatentActivations \| dict[str, LatentActivations]`	The latent activations to use for the computation.	required

Returns:

Name	Type	Description
`float`	`float`	The reconstruction error.

Source code in interpreto/concepts/metrics/reconstruction_metrics.py

def compute(self, latent_activations: LatentActivations | dict[str, LatentActivations]) -> float:
    """Compute the reconstruction error.

    Args:
        latent_activations (LatentActivations | dict[str, LatentActivations]): The latent activations to use for the computation.

    Returns:
        float: The reconstruction error.
    """
    split_latent_activations: LatentActivations = self.concept_explainer._sanitize_activations(latent_activations)

    concepts_activations: ConceptsActivations = self.concept_explainer.encode_activations(split_latent_activations)

    reconstructed_latent_activations: LatentActivations = self.concept_explainer.decode_concepts(
        concepts_activations
    )

    if self.reconstruction_space is ReconstructionSpaces.LATENT_ACTIVATIONS:
        return self.distance_function(split_latent_activations, reconstructed_latent_activations).item()

    raise NotImplementedError("Only LATENT_ACTIVATIONS reconstruction space is supported.")

Custom¶

To tailor the reconstruction error to your needs, you can specify a `reconstruction_space` and a `distance_function`. Note that the `distance_function` should follow the `interpreto.commons.DistanceFunctionProtocol` protocol.
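The role of a custom distance function can be sketched without the library. In the example below, `ToyExplainer`, `mean_absolute_distance`, and `reconstruction_error` are all hypothetical stand-ins working on plain Python lists; they only mirror the encode, decode, then compare flow of `compute`. The real protocol is called with the original and reconstructed activations, and `compute` calls `.item()` on the result, so an actual custom function would need to return a tensor-like scalar rather than a float.

```python
def mean_absolute_distance(original, reconstructed):
    """Hypothetical custom distance: mean absolute difference over all entries."""
    diffs = [
        abs(o - r)
        for row_o, row_r in zip(original, reconstructed)
        for o, r in zip(row_o, row_r)
    ]
    return sum(diffs) / len(diffs)


class ToyExplainer:
    """Hypothetical stand-in for a concept explainer: concepts are rounded values."""

    def encode_activations(self, activations):
        return [[round(v) for v in row] for row in activations]

    def decode_concepts(self, concepts):
        return [[float(c) for c in row] for row in concepts]


def reconstruction_error(explainer, distance_function, activations):
    # Mirrors the documented flow: encode, decode, then measure the distance.
    concepts = explainer.encode_activations(activations)
    reconstructed = explainer.decode_concepts(concepts)
    return distance_function(activations, reconstructed)


activations = [[0.25, 1.75], [2.0, 3.25]]
score = reconstruction_error(ToyExplainer(), mean_absolute_distance, activations)
print(score)  # (0.25 + 0.25 + 0.0 + 0.25) / 4 = 0.1875
```

Swapping `mean_absolute_distance` for another two-argument function changes the metric without touching the encode and decode steps, which is the design the `distance_function` argument is meant to allow.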

interpreto.concepts.metrics.ReconstructionError ¶

ReconstructionError(concept_explainer, reconstruction_space, distance_function)

Code concepts/metrics/reconstruction_metrics.py

Evaluates whether the information reconstructed by the concept autoencoder corresponds to the original latent activations. It is a faithfulness metric. Both the space in which the error is computed and the distance function used can be specified.

Attributes:

Name	Type	Description
`concept_explainer`	`ConceptAutoEncoderExplainer`	The explainer used to compute concepts.
`reconstruction_space`	`ReconstructionSpaces`	The space in which the reconstruction error is computed.
`distance_function`	`DistanceFunctionProtocol`	The distance function used to compute the reconstruction error.

Source code in interpreto/concepts/metrics/reconstruction_metrics.py

def __init__(
    self,
    concept_explainer: ConceptAutoEncoderExplainer,
    reconstruction_space: ReconstructionSpaces,
    distance_function: DistanceFunctionProtocol,
):
    self.concept_explainer = concept_explainer
    self.reconstruction_space = reconstruction_space
    self.distance_function = distance_function

compute ¶

compute(latent_activations)

Compute the reconstruction error.

Parameters:

Name	Type	Description	Default
`latent_activations` ¶	`LatentActivations \| dict[str, LatentActivations]`	The latent activations to use for the computation.	required

Returns:

Name	Type	Description
`float`	`float`	The reconstruction error.

Source code in interpreto/concepts/metrics/reconstruction_metrics.py

def compute(self, latent_activations: LatentActivations | dict[str, LatentActivations]) -> float:
    """Compute the reconstruction error.

    Args:
        latent_activations (LatentActivations | dict[str, LatentActivations]): The latent activations to use for the computation.

    Returns:
        float: The reconstruction error.
    """
    split_latent_activations: LatentActivations = self.concept_explainer._sanitize_activations(latent_activations)

    concepts_activations: ConceptsActivations = self.concept_explainer.encode_activations(split_latent_activations)

    reconstructed_latent_activations: LatentActivations = self.concept_explainer.decode_concepts(
        concepts_activations
    )

    if self.reconstruction_space is ReconstructionSpaces.LATENT_ACTIVATIONS:
        return self.distance_function(split_latent_activations, reconstructed_latent_activations).item()

    raise NotImplementedError("Only LATENT_ACTIVATIONS reconstruction space is supported.")

interpreto.concepts.metrics.ReconstructionSpaces ¶

Bases: Enum

Enumeration of possible reconstruction spaces. Latent activations go through the concept autoencoder to obtain reconstructed latent activations. The distance between the original and the reconstructed activations can then be computed either directly in the latent space or in the logits space.

Attributes:

Name	Type	Description
`LATENT_ACTIVATIONS`	`str`	Reconstruction space in the latent space.
`LOGITS`	`str`	Reconstruction space in the logits space.

interpreto.commons.DistanceFunctions ¶

Bases: Enum

Enum of callable functions for computing distances between tensors.

Members

- WASSERSTEIN_1D: Computes the 1D Wasserstein (earth mover's) distance between two tensors.
- EUCLIDEAN: Computes the Euclidean (L2) distance between two tensors.
- AVERAGE_EUCLIDEAN: Computes the average Euclidean distance between two tensors of samples.
- LP: Computes the Lp distance (generalization of Euclidean) between two tensors.
- AVERAGE_LP: Computes the average Lp distance between two tensors of samples.
- KL: Computes the Kullback-Leibler divergence between two tensors.
