kaira.metrics.CompositeMetric

class kaira.metrics.CompositeMetric(metrics: Dict[str, BaseMetric], weights: Dict[str, float] | None = None, *args: Any, **kwargs: Any)

Bases: BaseMetric

A metric that combines multiple metrics with optional weighting.

This class builds a custom evaluation metric by combining several individual metrics with specified weights. It’s useful when a single metric doesn’t capture all the desired qualities of a comparison, for example when perceptual and statistical image similarity measures both matter.

The composite approach can balance the trade-offs between different metrics. For example, PSNR tends to favor smoothness, while perceptual metrics may favor visual sharpness. By combining them, you can create more balanced evaluation criteria.

Note

When combining metrics where some are “higher is better” and others are “lower is better”, you may need to invert certain metrics (e.g., by using negative weights or transforming the metric beforehand).
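The two inversion options in this note can be compared with a small sketch. It uses plain floats instead of tensors and made-up scores; `weighted_score` and `flip_lpips` are hypothetical helpers for illustration, not part of kaira.

```python
def weighted_score(values, weights):
    """Weighted sum of metric values; both arguments are dicts keyed by metric name."""
    return sum(weights[name] * values[name] for name in weights)


def flip_lpips(values):
    """Return a copy in which LPIPS is turned into a 'higher is better' value."""
    flipped = dict(values)
    flipped["lpips"] = 1.0 - flipped["lpips"]
    return flipped


# Made-up scores for two candidate outputs (illustrative only).
sharp = {"ssim": 0.90, "lpips": 0.10}   # better on both metrics
blurry = {"ssim": 0.80, "lpips": 0.30}  # worse on both metrics

# Naive combination: positive weights on metrics with opposite directions.
naive = {"ssim": 0.5, "lpips": 0.5}
naive_sharp = weighted_score(sharp, naive)
naive_blurry = weighted_score(blurry, naive)

# Option 1: negative weight on the "lower is better" metric.
negative = {"ssim": 0.5, "lpips": -0.5}
neg_sharp = weighted_score(sharp, negative)
neg_blurry = weighted_score(blurry, negative)

# Option 2: transform the metric beforehand, then use positive weights.
pos_sharp = weighted_score(flip_lpips(sharp), naive)
pos_blurry = weighted_score(flip_lpips(blurry), naive)
```

With these numbers the naive combination ranks the blurry candidate higher, while both inversion options rank the sharp candidate higher.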

Example

>>> from kaira.metrics import PSNR, SSIM, LPIPS
>>> from kaira.metrics.composite import CompositeMetric
>>>
>>> # Create individual metrics
>>> psnr = PSNR()
>>> ssim = SSIM()
>>> lpips = LPIPS()
>>>
>>> # Create a composite metric with custom weights
>>> # Note: LPIPS is "lower is better" while PSNR and SSIM are "higher is better"
>>> metrics = {"psnr": psnr, "ssim": ssim, "lpips": lpips}
>>> weights = {"psnr": 0.3, "ssim": 0.3, "lpips": -0.4}  # Negative weight for LPIPS
>>> composite = CompositeMetric(metrics=metrics, weights=weights)
>>>
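>>> # Evaluate images (random placeholder tensors; shape and value range are assumptions)
>>> import torch
>>> prediction = torch.rand(1, 3, 64, 64)
>>> target = torch.rand(1, 3, 64, 64)
>>> score = composite(prediction, target)
>>> individual_scores = composite.compute_individual(prediction, target)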

Methods

`__init__`	Initialize composite metric with component metrics and their weights.
`add_metric`	Add a new metric to the composite metric.
`compute_individual`	Compute all individual metrics separately without combining them.
`compute_with_stats`	Compute metric with mean and standard deviation.
`forward`	Compute the weighted combination of all component metrics.

__init__(metrics: Dict[str, BaseMetric], weights: Dict[str, float] | None = None, *args: Any, **kwargs: Any)

Initialize composite metric with component metrics and their weights.

Parameters:

metrics (Dict[str, BaseMetric]) – Dictionary mapping metric names to metric objects. Each metric should be a subclass of BaseMetric.
weights (Optional[Dict[str, float]]) –
Dictionary mapping metric names to their relative importance. If None, equal weights are assigned to all metrics. Weights are automatically normalized to sum to 1.0.

Use negative weights for metrics where lower values indicate better quality (e.g., LPIPS, MSE) when combining with metrics where higher values indicate better quality (e.g., PSNR, SSIM).
*args – Variable length argument list passed to the base class.
**kwargs – Arbitrary keyword arguments passed to the base class.

Raises:

ValueError – If the weights dictionary contains keys that don’t exist in metrics.
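The weight handling described above might look roughly like the following. This is a minimal sketch, not the library's code: `normalize_weights` is a hypothetical helper, and it assumes that normalization means dividing each weight by the sum of all weights.

```python
import math


def normalize_weights(metric_names, weights=None):
    """Validate and normalize weights (assumed rule: divide by the sum)."""
    if weights is None:
        # Equal weights when none are given.
        return {name: 1.0 / len(metric_names) for name in metric_names}
    unknown = set(weights) - set(metric_names)
    if unknown:
        raise ValueError(f"Weights given for unknown metrics: {sorted(unknown)}")
    total = sum(weights.values())
    return {name: value / total for name, value in weights.items()}


names = ["psnr", "ssim", "lpips"]
equal = normalize_weights(names)
normalized = normalize_weights(names, {"psnr": 0.3, "ssim": 0.3, "lpips": -0.4})
# Under the divide-by-sum assumption the raw weights sum to 0.2, so the
# normalized weights are roughly 1.5, 1.5 and -2.0 (they still sum to 1.0).
```

If normalization does work this way, mixed-sign weights are scaled up considerably, and weights that sum to zero cannot be normalized at all. Checking the normalized values before relying on the absolute score is a reasonable precaution.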

forward(x: Tensor, y: Tensor, *args: Any, **kwargs: Any) → Tensor

Compute the weighted combination of all component metrics.

Evaluates each metric on the input tensors and combines them according to the normalized weights specified during initialization.

Note

If a metric returns a tuple (e.g., containing mean and std), only the first element (typically the mean) is used in the weighted combination. For more control, access individual metrics through compute_individual().

Parameters:

x (torch.Tensor) – First input tensor, typically the prediction or generated output
y (torch.Tensor) – Second input tensor, typically the target or ground truth
*args – Variable length argument list passed to individual metrics.
**kwargs – Arbitrary keyword arguments passed to individual metrics.

Returns:

Weighted sum of all metric values as a single scalar tensor. The interpretation of this value depends on the constituent metrics and weights. With appropriate weighting, higher values typically indicate better results.

Return type:

torch.Tensor
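The combination rule, including the tuple handling from the note above, can be sketched with plain floats standing in for tensors. `combine` is a hypothetical helper and the metric values are made up.

```python
import math


def combine(values, weights):
    """Weighted sum of metric values; for tuple values only the first element is used."""
    total = 0.0
    for name, weight in weights.items():
        value = values[name]
        if isinstance(value, tuple):
            value = value[0]  # e.g. keep the mean from (mean, std)
        total += weight * value
    return total


# Made-up values: PSNR as a scalar, SSIM as a (mean, std) tuple.
values = {"psnr": 30.0, "ssim": (0.9, 0.05)}
weights = {"psnr": 0.5, "ssim": 0.5}
score = combine(values, weights)  # 0.5 * 30.0 + 0.5 * 0.9
```

Here the PSNR term (in dB) contributes far more to the score than the SSIM term, even with equal weights. Metrics on very different scales may need rescaling or adjusted weights before the combined value is meaningful.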

compute_individual(x: Tensor, y: Tensor, *args: Any, **kwargs: Any) → Dict[str, Tensor]

Compute all individual metrics separately without combining them.

Unlike the forward method, which returns a weighted combination, this method returns the raw value for each individual metric. This is useful for:

- Debugging the contribution of individual metrics
- Creating custom visualizations or reports
- Applying post-processing to individual metrics before combining them
- Evaluating metrics with different criteria that cannot be combined directly

Parameters:

x (torch.Tensor) – First input tensor, typically the prediction or generated output
y (torch.Tensor) – Second input tensor, typically the target or ground truth
*args – Variable length argument list passed to individual metrics.
**kwargs – Arbitrary keyword arguments passed to individual metrics.

Returns:

Dictionary mapping metric names to their computed values. May contain tuple values (e.g., mean and std) for metrics that return multiple values. The interpretation of values (higher/lower is better) depends on the specific metric.

Return type:

Dict[str, torch.Tensor]
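As an illustration of the reporting use case, the sketch below formats a dictionary shaped like the return value described above, including a tuple entry. `format_report` is a hypothetical helper, and the values are made-up floats rather than tensors.

```python
def format_report(individual):
    """Render one line per metric; tuple values are shown as mean +/- std."""
    lines = []
    for name in sorted(individual):
        value = individual[name]
        if isinstance(value, tuple):
            mean, std = value
            lines.append(f"{name}: {mean:.3f} +/- {std:.3f}")
        else:
            lines.append(f"{name}: {value:.3f}")
    return "\n".join(lines)


# Made-up scores in the shape compute_individual is documented to return.
individual = {"psnr": 31.25, "ssim": (0.91, 0.04), "lpips": 0.12}
report = format_report(individual)
```

With real tensors, each value would likely need converting to a Python float first (for scalar tensors, `Tensor.item()` does this).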

compute_with_stats(x: Tensor, y: Tensor, *args: Any, **kwargs: Any) → Tuple[Tensor, Tensor]

Compute metric with mean and standard deviation.

Parameters:

x (torch.Tensor) – The first input tensor (typically predictions)
y (torch.Tensor) – The second input tensor (typically targets)
*args – Variable length argument list.
**kwargs – Arbitrary keyword arguments.

Returns:

Mean and standard deviation of the metric

Return type:

Tuple[torch.Tensor, torch.Tensor]
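The documentation does not say how the statistics are computed. One plausible reading is a mean and standard deviation over per-sample metric values, sketched below with the standard library. The helper name `mean_and_std` and the use of the sample standard deviation (dividing by n - 1) are assumptions for illustration only.

```python
import math
import statistics


def mean_and_std(per_sample_values):
    """Return (mean, sample standard deviation) of per-sample metric values."""
    mean = statistics.mean(per_sample_values)
    std = statistics.stdev(per_sample_values)  # sample std; needs at least 2 values
    return mean, std


# Made-up per-sample scores for a batch of three items.
batch_scores = [0.8, 0.9, 1.0]
mean, std = mean_and_std(batch_scores)
```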

add_metric(name: str, metric: BaseMetric, weight: float | None = None) → None

Add a new metric to the composite metric.

Parameters:

name (str) – Name of the metric to add
metric (BaseMetric) – The metric object to add
weight (Optional[float], optional) – Weight for the new metric. If None, a weight of 1.0 is used and all weights are renormalized. Defaults to None.

Raises:

ValueError – If a metric with the given name already exists
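The effect of adding a metric on the existing weights can be sketched as follows. `add_weight` is a hypothetical helper that only tracks weights, and it assumes renormalization divides every weight, old and new, by the new total.

```python
import math


def add_weight(weights, name, weight=None):
    """Return new normalized weights after adding one entry (assumed rule: divide by the sum)."""
    if name in weights:
        raise ValueError(f"Metric '{name}' already exists")
    updated = dict(weights)
    updated[name] = 1.0 if weight is None else weight
    total = sum(updated.values())
    return {key: value / total for key, value in updated.items()}


# Existing weights already sum to 1.0; the new metric uses the default of 1.0.
weights = {"psnr": 0.5, "ssim": 0.5}
updated = add_weight(weights, "lpips")
```

Under this assumption, a metric added with the default weight ends up with half of the total weight when the existing weights already sum to 1.0. Passing an explicit weight is the safer choice if a smaller share is intended.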