retrivex package

Submodules

retrivex.stransformers module

Explainability for vector similarity models available via SentenceTransformer.

class retrivex.stransformers.IntJacobianExplainableTransformer(model_name_or_path: str | None = None, modules: Iterable[Module] | None = None, device: str | None = None, prompts: dict[str, str] | None = None, default_prompt_name: str | None = None, similarity_fn_name: str | SimilarityFunction | None = None, cache_folder: str | None = None, trust_remote_code: bool = False, revision: str | None = None, local_files_only: bool = False, token: bool | str | None = None, use_auth_token: bool | str | None = None, truncate_dim: int | None = None, model_kwargs: dict[str, Any] | None = None, tokenizer_kwargs: dict[str, Any] | None = None, config_kwargs: dict[str, Any] | None = None, model_card_data: SentenceTransformerModelCardData | None = None, backend: Literal['torch', 'onnx', 'openvino'] = 'torch')

Bases: SentenceTransformer

Extended SentenceTransformer that provides vector similarity explainability through integrated gradients, following the approach described in the paper “Approximate Attributions for Off-the-Shelf Siamese Transformers” (Moeller et al., EACL 2024), https://aclanthology.org/2024.eacl-long.125/. The approach approximates the Integrated Jacobians (IJ) method (a generalization of integrated gradients) by using padding sequences as approximate references, which are assumed to have similarity close to zero for most inputs.

This class enables attribution analysis by computing how different tokens contribute to the similarity between two sentences. It extends the base SentenceTransformer to include reference embeddings and attribution computation capabilities.
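To give an intuition for the mechanism, the sketch below shows the straight-line interpolation between a padding-based reference and the real token embeddings that integrated-gradients-style attribution integrates over. This is plain PyTorch, not the retrivex internals; the function name, tensor names, and shapes are illustrative only.

    import torch

    def interpolate_embeddings(token_embeddings, reference_embeddings, num_steps=250):
        # token_embeddings:     (seq_len, hidden) embeddings of the real input
        # reference_embeddings: (seq_len, hidden) embeddings of the padding reference
        # Returns: (num_steps, seq_len, hidden) points on the straight-line path
        # from the reference (alpha = 0) to the input (alpha = 1).
        alphas = torch.linspace(0.0, 1.0, num_steps).view(-1, 1, 1)
        return reference_embeddings + alphas * (token_embeddings - reference_embeddings)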

explain(query: str, candidate: str, similarity_metric: str = 'cosine', return_details: bool = False, move_to_cpu: bool = False, show_progress: bool = True, compress_embedding_dimension: bool = True, layer_index=0, num_interpolation_steps=250, encoder_layers_path: str | None = None, embeddings_layer_path: str | None = None) Tuple[Tensor, List[str], List[str], Dict | None]

Explain similarity between two texts using integrated gradients attribution.

This method computes token-level attribution scores that explain how each token pair in query and candidate texts contributes to the similarity between query and candidate.

Parameters:
  • query – Query text to compare

  • candidate – Candidate text to compare

  • similarity_metric – ‘cosine’ or ‘dot’ product similarity

  • return_details – Whether to return individual similarity components

  • move_to_cpu – Whether to move computations to CPU for memory efficiency

  • show_progress – Whether to show progress bars during computation

  • compress_embedding_dimension – Whether to sum over embedding dimensions

  • layer_index – Index of the encoder layer to use in analysis

  • num_interpolation_steps – Number of interpolation steps for integrated gradients

  • encoder_layers_path – Optional custom path to the intermediate encoder layers, used if the encoder layers could not be found automatically.

  • embeddings_layer_path – Optional custom path to the embeddings layer (for example, "encoder.emb"), used if the embeddings layer could not be found automatically.

Returns:

  • attribution_matrix: Token-to-token attribution scores

  • tokens_query: Tokenized version of query text

  • tokens_candidate: Tokenized version of candidate text

  • (optional) similarity_score, reference_similarities if return_details=True

Return type:

Tuple containing the elements listed above
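A minimal usage sketch (the checkpoint name and example texts are illustrative; any SentenceTransformer-compatible model should work):

    from retrivex.stransformers import IntJacobianExplainableTransformer

    # Load any SentenceTransformer-compatible checkpoint (name is illustrative).
    model = IntJacobianExplainableTransformer("sentence-transformers/all-MiniLM-L6-v2")

    attributions, query_tokens, candidate_tokens, details = model.explain(
        query="How do I reset my password?",
        candidate="Steps to recover account access",
        similarity_metric="cosine",
        return_details=True,
    )

    # attributions[i, j] scores how much the i-th query token paired with the
    # j-th candidate token contributes to the overall similarity.
    print(attributions.shape, len(query_tokens), len(candidate_tokens))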

forward(input_features: Dict[str, Tensor]) Dict[str, Tensor]

Forward pass that separates reference embeddings from sentence embeddings.

The input is expected to contain both sentence embeddings and a reference embedding concatenated together. This method splits them and adds the reference as a separate feature for later use in attribution analysis.

Parameters:

input_features – Dictionary containing tokenized input with embeddings and attention masks.

Returns:

Dictionary with sentence embeddings, attention masks, and separated reference embeddings
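The splitting step can be pictured roughly as follows (illustrative shapes only, not the library's actual code):

    import torch

    # Illustrative only: a stacked batch whose last row holds the
    # padding-based reference embedding.
    stacked = torch.randn(3, 16, 384)            # [query, candidate, reference]
    sentence_embeddings, reference = stacked[:-1], stacked[-1:]
    # Keeping the reference as a separate feature lets the attribution code
    # integrate gradients from the reference towards each sentence embedding.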

retrivex.visualization module

Visualization Utils Module.

This module provides visualization tools to understand the interactions between tokens in explanations.

retrivex.visualization.plot_connections(scores: array | tensor, query_tokens: List[str], candidate_tokens: List[str], details: Dict = None, norm_scores: bool = True, figsize: Tuple[int, int] = (5, 5), plot_title: str | None = None, query_label: str = 'query', candidate_label: str = 'candidate', token_canvas_width: int = 50, token_fontsize: int = 10, title_fontsize: int = 24, connection_line_thick: float = 3.0, title_pad: float = 10, plot_pad: float = 0.05) None

Create a parallel coordinate plot showing interactions between token pairs.

This visualization shows how features from one sequence relate to features in another sequence through curved connections colored by relevance scores.

Parameters:
  • scores – Token to token attribution scores matrix

  • query_tokens – Query tokens list

  • candidate_tokens – Candidate tokens list

  • details – Additional similarity terms

  • figsize – Size of the plot

  • norm_scores – Whether to normalize scores. Default value is True

  • plot_title – Optional title for the plot

  • query_label – Label for the query sequence in the plot. Default value is query

  • candidate_label – Label for the candidate sequence in the plot. Default value is candidate

  • token_canvas_width – Base width for a token in the plot. Default value is 50

  • token_fontsize – Font size of tokens

  • title_fontsize – Font size of plot title

  • connection_line_thick – Thickness of the connection lines between tokens. Default value is 3.0

  • title_pad – Padding for title. Default value is 10.0

  • plot_pad – Padding for a plot. Default value is 0.05
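Putting both modules together, a minimal end-to-end sketch (the checkpoint name and example texts are illustrative):

    from retrivex import visualization
    from retrivex.stransformers import IntJacobianExplainableTransformer

    # Checkpoint name is illustrative; use any SentenceTransformer-compatible model.
    model = IntJacobianExplainableTransformer("sentence-transformers/all-MiniLM-L6-v2")
    scores, query_tokens, candidate_tokens, details = model.explain(
        query="Where is the nearest pharmacy?",
        candidate="Drugstore locations near me",
        return_details=True,
    )

    # Parallel-coordinate plot of the token-to-token attribution scores.
    visualization.plot_connections(
        scores=scores,
        query_tokens=query_tokens,
        candidate_tokens=candidate_tokens,
        details=details,
        plot_title="Query vs. candidate token attributions",
    )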

Module contents

Retrivex – Retrieval Models Explainability Toolkit