Optimization-based Dictionary Learning
List of available methods
interpreto.concepts.NMFConcepts
NMFConcepts(model_with_split_points, *, nb_concepts, split_point=None, device='cpu', force_relu=False, **kwargs)
Bases: DictionaryLearningExplainer[NMF]
Code: concepts/methods/overcomplete.py
ConceptAutoEncoderExplainer with the NMF from Lee and Seung (1999)[1] as concept model.
NMF implementation from the overcomplete.optimization.NMF class.
[1] Lee, D., Seung, H. Learning the parts of objects by non-negative matrix factorization. Nature, 401, 1999, pp. 788–791.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_with_split_points | ModelWithSplitPoints | The model to apply the explanation on. It should have at least one split point on which a concept explainer can be trained. | required |
| nb_concepts | int | Size of the SAE concept space. | required |
| split_point | str \| None | The split point used to train the concept_model. | None |
| device | device \| str | Device to use for the concept_model. | 'cpu' |
| force_relu | bool | Whether to force the activations to be positive. | False |
| **kwargs | dict | Additional keyword arguments to pass to the concept_model. | {} |
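A minimal usage sketch based on the constructor signature above. The `model_with_split_points` wrapper is assumed to have been built beforehand (see the ModelWithSplitPoints documentation), and a random tensor stands in for real latent activations; the hidden dimension of 768 is illustrative.

```python
import torch

from interpreto.concepts import NMFConcepts

# Assumption: `model_with_split_points` is an existing ModelWithSplitPoints
# wrapping your model, with at least one split point defined.
explainer = NMFConcepts(
    model_with_split_points,
    nb_concepts=64,       # size of the concept space
    device="cpu",
    force_relu=True,      # force activations to be positive, as NMF requires
)

# Placeholder for real latent activations collected at the split point
# (shape: n_samples x hidden_dim).
activations = torch.rand(1024, 768)
explainer.fit(activations)
```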
encode_activations
Encode the given activations using the concept_model encoder.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| activations | LatentActivations | The activations to encode. | required |

Returns:

| Type | Description |
|---|---|
| Tensor | The encoded concept activations. |
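Continuing the sketch above, `encode_activations` maps latent activations to concept activations. The shape in the comment is an assumption (one score per concept and per sample).

```python
latent = torch.rand(32, 768)                       # placeholder latent activations
concept_acts = explainer.encode_activations(latent)
print(concept_acts.shape)                          # assumed: (32, nb_concepts)
```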
interpreto.concepts.SemiNMFConcepts
Bases: DictionaryLearningExplainer[SemiNMF]
Code: concepts/methods/overcomplete.py
ConceptAutoEncoderExplainer with the SemiNMF from Ding et al. (2008)[1] as concept model.
SemiNMF implementation from the overcomplete.optimization.SemiNMF class.
[1] C. H. Q. Ding, T. Li and M. I. Jordan, Convex and Semi-Nonnegative Matrix Factorizations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(1), 2010, pp. 45-55.
interpreto.concepts.ConvexNMFConcepts
ConvexNMFConcepts(model_with_split_points, *, nb_concepts, split_point=None, device='cpu', **kwargs)
Bases: DictionaryLearningExplainer[ConvexNMF]
Code: concepts/methods/overcomplete.py
ConceptAutoEncoderExplainer with the ConvexNMF from Ding et al. (2008)[1] as concept model.
ConvexNMF implementation from the overcomplete.optimization.ConvexNMF class.
[1] C. H. Q. Ding, T. Li and M. I. Jordan, Convex and Semi-Nonnegative Matrix Factorizations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(1), 2010, pp. 45-55.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_with_split_points | ModelWithSplitPoints | The model to apply the explanation on. It should have at least one split point on which a concept explainer can be trained. | required |
| nb_concepts | int | Size of the SAE concept space. | required |
| split_point | str \| None | The split point used to train the concept_model. | None |
| device | device \| str | Device to use for the concept_model. | 'cpu' |
| **kwargs | dict | Additional keyword arguments to pass to the concept_model. | {} |
interpreto.concepts.PCAConcepts
Bases: DictionaryLearningExplainer[SkPCA]
Code: concepts/methods/overcomplete.py
ConceptAutoEncoderExplainer with the PCA from Pearson (1901)[1] as concept model.
PCA implementation from the overcomplete.optimization.SkPCA class.
[1] K. Pearson, On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2(11), 1901, pp. 559-572.
interpreto.concepts.ICAConcepts
Bases: DictionaryLearningExplainer[SkICA]
Code: concepts/methods/overcomplete.py
ConceptAutoEncoderExplainer with the ICA from Hyvarinen and Oja (2000)[1] as concept model.
ICA implementation from the overcomplete.optimization.SkICA class.
[1] A. Hyvarinen and E. Oja, Independent Component Analysis: Algorithms and Applications. Neural Networks, 13(4-5), 2000, pp. 411-430.
interpreto.concepts.KMeansConcepts
Bases: DictionaryLearningExplainer[SkKMeans]
Code: concepts/methods/overcomplete.py
ConceptAutoEncoderExplainer with K-Means as concept model.
K-Means implementation from the overcomplete.optimization.SkKMeans class.
interpreto.concepts.SparsePCAConcepts
SparsePCAConcepts(model_with_split_points, *, nb_concepts, split_point=None, device='cpu', **kwargs)
Bases: DictionaryLearningExplainer[SkSparsePCA]
Code: concepts/methods/overcomplete.py
ConceptAutoEncoderExplainer with SparsePCA as concept model.
SparsePCA implementation from the overcomplete.optimization.SkSparsePCA class.
interpreto.concepts.SVDConcepts
Bases: DictionaryLearningExplainer[SkSVD]
Code: concepts/methods/overcomplete.py
ConceptAutoEncoderExplainer with SVD as concept model.
SVD implementation from the overcomplete.optimization.SkSVD class.
interpreto.concepts.DictionaryLearningConcepts
DictionaryLearningConcepts(model_with_split_points, *, nb_concepts, split_point=None, device='cpu', **kwargs)
Bases: DictionaryLearningExplainer[SkDictionaryLearning]
Code: concepts/methods/overcomplete.py
ConceptAutoEncoderExplainer with the Dictionary Learning concepts from Mairal et al. (2009)[1] as concept model.
Dictionary Learning implementation from the overcomplete.optimization.SkDictionaryLearning class.
[1] J. Mairal, F. Bach, J. Ponce, G. Sapiro, Online dictionary learning for sparse coding. Proceedings of the 26th Annual International Conference on Machine Learning, 2009, pp. 689-696.
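The classes above are largely interchangeable: those with explicit signatures take the same arguments, and the others inherit the DictionaryLearningExplainer constructor documented below, so switching the decomposition method is typically a one-line change. A sketch, assuming the same `model_with_split_points` wrapper as before:

```python
from interpreto.concepts import ICAConcepts, KMeansConcepts, PCAConcepts

for explainer_cls in (PCAConcepts, ICAConcepts, KMeansConcepts):
    explainer = explainer_cls(model_with_split_points, nb_concepts=32)
    # fit and analyse each explainer exactly as with NMFConcepts
```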
Base abstract class
interpreto.concepts.methods.DictionaryLearningExplainer
DictionaryLearningExplainer(model_with_split_points, *, nb_concepts, split_point=None, device='cpu', **kwargs)
Bases: ConceptAutoEncoderExplainer[BaseOptimDictionaryLearning], Generic[_BODL_co]
Code: concepts/methods/overcomplete.py
Implementation of a concept explainer using an overcomplete.optimization.BaseOptimDictionaryLearning (NMF and PCA variants) as concept_model.
Attributes:

| Name | Type | Description |
|---|---|---|
| model_with_split_points | ModelWithSplitPoints | The model to apply the explanation on. It should have at least one split point on which a concept explainer can be trained. |
| split_point | str \| None | The split point used to train the concept_model. |
| concept_model | SAE | An Overcomplete BaseOptimDictionaryLearning variant for concept extraction. |
| is_fitted | bool | Whether the concept_model has been fitted on model activations. |
| has_differentiable_concept_encoder | bool | Whether the encode_activations operation is differentiable. |
| has_differentiable_concept_decoder | bool | Whether the decode_concepts operation is differentiable. |
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_with_split_points | ModelWithSplitPoints | The model to apply the explanation on. It should have at least one split point on which a concept explainer can be trained. | required |
| nb_concepts | int | Size of the SAE concept space. | required |
| split_point | str \| None | The split point used to train the concept_model. | None |
| device | device \| str | Device to use for the concept_model. | 'cpu' |
| **kwargs | dict | Additional keyword arguments to pass to the concept_model. | {} |
Source code in interpreto/concepts/methods/overcomplete.py
fit
fit(activations, *, overwrite=False, **kwargs)
Fit an Overcomplete OptimDictionaryLearning model on the given activations.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| activations | Tensor \| dict[str, Tensor] | The activations used for fitting the concept_model. | required |
| overwrite | bool | Whether to overwrite the current model if it has already been fitted. Default: False. | False |
| **kwargs | dict | Additional keyword arguments to pass to the concept_model fit method. | {} |
Source code in interpreto/concepts/methods/overcomplete.py
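A sketch of `fit` using the dictionary form of the `activations` argument (matching the `Tensor | dict[str, Tensor]` annotation above). The split point name and tensor shape are placeholders; `explainer` is an explainer instance from the sketches above.

```python
import torch

# Hypothetical split point name; use the split points defined on your
# ModelWithSplitPoints instance.
activations = {"encoder.layers.5": torch.rand(1024, 768)}
explainer.fit(activations, overwrite=True)  # overwrite a previously fitted model
```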
encode_activations
encode_activations(activations)
Encode the given activations using the concept_model encoder.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| activations | LatentActivations | The activations to encode. | required |

Returns:

| Type | Description |
|---|---|
| Tensor | The encoded concept activations. |
Source code in interpreto/concepts/base.py
decode_concepts
decode_concepts(concepts)
Decode the given concepts using the concept_model decoder.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| concepts | ConceptsActivations | The concepts to decode. | required |

Returns:

| Type | Description |
|---|---|
| Tensor | The decoded model activations. |
Source code in interpreto/concepts/base.py
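`encode_activations` and `decode_concepts` compose into a reconstruction, whose error indicates how well the learned dictionary explains the activations. A sketch with placeholder tensors (shapes assumed):

```python
import torch

latent = torch.rand(16, 768)                           # placeholder latent activations
concepts = explainer.encode_activations(latent)        # concept activations
reconstruction = explainer.decode_concepts(concepts)   # back to the latent space
rel_error = torch.linalg.norm(latent - reconstruction) / torch.linalg.norm(latent)
print(f"relative reconstruction error: {rel_error:.3f}")
```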
get_dictionary
Get the dictionary learned by the fitted concept_model.

Returns:

| Type | Description |
|---|---|
| Tensor | A torch.Tensor containing the dictionary learned by the concept_model. |
Source code in interpreto/concepts/base.py
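The learned dictionary can be inspected directly, for instance to compare concept directions. The shape in the comment is an assumption (one row per concept):

```python
import torch.nn.functional as F

dictionary = explainer.get_dictionary()
print(dictionary.shape)  # assumed: (nb_concepts, hidden_dim)

# cosine similarity between the first two concept directions (illustrative)
similarity = F.cosine_similarity(dictionary[0], dictionary[1], dim=0)
```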
interpret
interpret(interpretation_method, concepts_indices, inputs=None, latent_activations=None, concepts_activations=None, **kwargs)
Interpret the concept dimensions of the latent space in a human-readable format. The interpretation is a mapping between concept indices and an object allowing their interpretation. It can be a label, a description, examples, etc.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| interpretation_method | type[BaseConceptInterpretationMethod] | The interpretation method to use to interpret the concepts. | required |
| concepts_indices | int \| list[int] \| Literal['all'] | The indices of the concepts to interpret. If "all", all concepts are interpreted. | required |
| inputs | list[str] \| None | The inputs to use for the interpretation. Whether they are required depends on the interpretation method's source. | None |
| latent_activations | LatentActivations \| dict[str, LatentActivations] \| None | The latent activations to use for the interpretation. Whether they are required depends on the interpretation method's source. | None |
| concepts_activations | ConceptsActivations \| None | The concepts activations to use for the interpretation. Whether they are required depends on the interpretation method's source. | None |
| **kwargs | dict | Additional keyword arguments to pass to the interpretation method. | {} |

Returns:

| Type | Description |
|---|---|
| Mapping[int, Any] | A mapping between the concepts indices and the interpretation of the concepts. |
Source code in interpreto/concepts/base.py
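A sketch of calling `interpret`. The `TopKInputs` class and its import path are hypothetical placeholders; substitute a concrete `BaseConceptInterpretationMethod` subclass provided by interpreto. `texts` stands for the inputs the activations were computed on.

```python
# Hypothetical interpretation method and import path; replace with a real
# BaseConceptInterpretationMethod subclass from interpreto.
from interpreto.concepts.interpretations import TopKInputs

interpretations = explainer.interpret(
    interpretation_method=TopKInputs,
    concepts_indices="all",   # or a single int, or a list of ints
    inputs=texts,             # list[str] used to relate concepts to inputs
)
for concept_index, interpretation in interpretations.items():
    print(concept_index, interpretation)
```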
input_concept_attribution
input_concept_attribution(inputs, concept, attribution_method, **attribution_kwargs)
Attributes model inputs for a selected concept.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| inputs | ModelInputs | The input data, which can be a string, a list of tokens/words/clauses/sentences, or a dataset. | required |
| concept | int | Index identifying the position of the concept of interest (i.e. its score in the concept activations). | required |
| attribution_method | type[AttributionExplainer] | The attribution method to obtain importance scores for input elements. | required |

Returns:

| Type | Description |
|---|---|
| list[float] | A list of attribution scores for each input. |
Source code in interpreto/concepts/base.py
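A sketch of attributing input elements to a single concept. The `Saliency` class and its import path are hypothetical placeholders for an `AttributionExplainer` subclass.

```python
# Hypothetical attribution explainer and import path; replace with an
# AttributionExplainer subclass available in interpreto.
from interpreto.attributions import Saliency

scores = explainer.input_concept_attribution(
    inputs="The movie was surprisingly good.",
    concept=3,                   # index of the concept of interest
    attribution_method=Saliency,
)
print(scores)                    # one attribution score per input element
```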
concept_output_attribution
concept_output_attribution(inputs, concepts, target, attribution_method, **attribution_kwargs)
Computes the attribution of each concept for the logit of a target output element.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| inputs | ModelInputs | An input data-point for the model. | required |
| concepts | Tensor | Concept activation tensor. | required |
| target | int | The target class for which the concept output attribution should be computed. | required |
| attribution_method | type[AttributionExplainer] | The attribution method to obtain importance scores for input elements. | required |

Returns:

| Type | Description |
|---|---|
| list[float] | A list of attribution scores for each concept. |
Source code in interpreto/concepts/base.py
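A matching sketch for attributing concepts to a target output logit; the attribution class is again a hypothetical placeholder, and the concept activations come from `encode_activations`.

```python
from interpreto.attributions import Saliency  # hypothetical import path

text = "The movie was surprisingly good."
concept_acts = explainer.encode_activations(latent)  # concept activation tensor (see above)
scores = explainer.concept_output_attribution(
    inputs=text,
    concepts=concept_acts,
    target=1,                    # target class / output index
    attribution_method=Saliency,
)
print(scores)                    # one attribution score per concept
```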