kaira.models.image.Yilmaz2024DeepJSCCWZModel

- class kaira.models.image.Yilmaz2024DeepJSCCWZModel(encoder: BaseModel, channel: BaseChannel, decoder: BaseModel, constraint: BaseConstraint, *args: Any, **kwargs: Any)
Bases: WynerZivModel
A specialized Wyner-Ziv model for neural joint source-channel coding with side information [Wyner and Ziv, 1976; Yilmaz et al., 2024].
This model implements the DeepJSCC-WZ architecture from Yilmaz et al. 2024, which applies deep learning techniques to the Wyner-Ziv coding paradigm (lossy compression with decoder-side information). The system is designed specifically for wireless image transmission scenarios where correlated side information is available at the receiver.
Unlike traditional separate source and channel coding approaches, DeepJSCC-WZ:
1. Jointly optimizes source compression and channel coding in an end-to-end manner
2. Adapts to varying channel conditions through explicit CSI conditioning
3. Exploits correlations between the transmitted signal and side information at the decoder
4. Provides graceful degradation under challenging channel conditions
Three model variants are supported:
- Standard: Separate encoder/decoder with independent parameters (highest parameter count)
- Small: Parameter-efficient design with shared encoder components
- Conditional: Side information available at both encoder and decoder (performance upper bound)
The model automatically detects which variant is being used based on the encoder class.
Technical details:
- Compression ratio: determined by channel dimension M and spatial downsampling (16× by default)
- Channel adaptation: AFModule layers condition the network on current channel SNR
- Side information fusion: Multi-scale fusion at multiple network layers at the decoder
- Power normalization: Required constraint to ensure proper signal power scaling
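The relationship between M, the downsampling factor, and the resulting rate can be worked out with simple arithmetic. The sketch below is illustrative only: the function name `bandwidth_ratio` is hypothetical, and it assumes that the latent has shape [M, H/16, W/16] and that two real latent values are paired into one complex channel symbol, which is a common DeepJSCC convention but is not stated on this page.

```python
def bandwidth_ratio(height, width, m, downsample=16, in_channels=3):
    """Complex channel uses per source dimension (hypothetical helper)."""
    if height % downsample or width % downsample:
        raise ValueError("height and width must be divisible by downsample")
    # Real-valued latent entries produced by the encoder: M x H/ds x W/ds.
    latent_reals = m * (height // downsample) * (width // downsample)
    # Assumption: two real values are mapped to one complex channel symbol.
    channel_uses = latent_reals / 2
    # Source dimension: every colour sample of the input image.
    source_dims = in_channels * height * width
    return channel_uses / source_dims


ratio = bandwidth_ratio(128, 256, m=16)
print(f"bandwidth ratio: {ratio:.5f}")
```

Under these assumptions the ratio reduces to M / (2 · 16² · 3), so M = 16 on an RGB image gives 1/96 regardless of resolution.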
- channel
Channel simulation model (e.g., AWGN, Rayleigh fading)
- Type: BaseChannel
- constraint
Signal power normalization constraint
- Type: BaseConstraint
Methods
- __init__(encoder, channel, decoder, constraint, ...): Initialize the Yilmaz2024DeepJSCCWZ model.
- forward(source[, side_info]): Execute the complete Wyner-Ziv coding process on the source image.
- __init__(encoder: BaseModel, channel: BaseChannel, decoder: BaseModel, constraint: BaseConstraint, *args: Any, **kwargs: Any)
Initialize the Yilmaz2024DeepJSCCWZ model.
- Parameters:
encoder – Neural encoder model that compresses the source image. Must be one of the DeepJSCC-WZ encoder variants (standard, small, or conditional).
channel – Channel simulation model that applies noise and/or fading effects to the encoded representation during transmission.
decoder – Neural decoder model that reconstructs the image using received data and side information. Must match the encoder variant.
constraint – Power normalization constraint that ensures transmitted signals maintain appropriate power levels. This is crucial for fair comparisons across different models and transmission scenarios.
*args – Variable positional arguments passed to the base class.
**kwargs – Variable keyword arguments passed to the base class.
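To make the role of the constraint argument concrete, here is a minimal sketch of what an average-power normalization typically computes. It is not the library's BaseConstraint implementation: the function `normalize_average_power` is a hypothetical stand-in that works on a flat list of real symbols rather than on tensors.

```python
import math


def normalize_average_power(symbols, target_power=1.0):
    """Scale real symbols so their mean squared value equals target_power."""
    energy = sum(x * x for x in symbols)
    if energy == 0:
        raise ValueError("cannot normalize an all-zero signal")
    # After scaling, sum(x^2) / len(symbols) == target_power.
    scale = math.sqrt(target_power * len(symbols) / energy)
    return [x * scale for x in symbols]


z = normalize_average_power([3.0, 4.0])
print(sum(x * x for x in z) / len(z))  # average power after normalization
```

Fixing the transmit power this way is what makes a given channel SNR meaningful: without it, the encoder could improve its effective SNR simply by emitting larger values.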
- forward(source: Tensor, side_info: Tensor | None = None, *args: Any, **kwargs: Any) -> Tensor
Execute the complete Wyner-Ziv coding process on the source image.
This method implements the full DeepJSCC-WZ model:
1. Encodes the source image into a compact representation
   - For conditional models: utilizes side information during encoding
   - For non-conditional models: encodes without access to side information
2. Applies power normalization to the encoded representation
3. Simulates transmission through a noisy channel
4. Reconstructs the image using the received data and side information
All steps are differentiable, allowing for end-to-end training that jointly optimizes the entire transmission system for a given distortion metric and channel model.
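The control flow of the four steps can be sketched with plain callables. This is an illustration of the ordering only, not the library's code: `forward_sketch`, its `conditional` flag, and the argument order passed to the encoder and decoder are all assumptions, and the toy lambdas stand in for real neural components.

```python
def forward_sketch(source, side_info, csi, encoder, constraint, channel,
                   decoder, conditional=False):
    """Hypothetical outline of the encode -> normalize -> channel -> decode flow."""
    if side_info is None or csi is None:
        raise ValueError("side_info and csi are required")
    # 1. Encode; only the conditional variant sees side information here.
    if conditional:
        latent = encoder(source, side_info, csi)
    else:
        latent = encoder(source, csi)
    # 2. Apply the power constraint to the encoded representation.
    latent = constraint(latent)
    # 3. Pass the normalized signal through the channel.
    received = channel(latent)
    # 4. Decode using the received signal together with side information.
    return decoder(received, side_info, csi)


# Toy stand-ins operating on floats, with a noiseless channel.
recon = forward_sketch(
    source=2.0, side_info=1.0, csi=10.0,
    encoder=lambda x, csi: x * 0.5,
    constraint=lambda z: z,
    channel=lambda z: z,
    decoder=lambda y, s, csi: y * 2.0,
)
print(recon)
```

In a real model each callable would be a differentiable module, so gradients from the reconstruction loss flow back through the channel and the constraint into the encoder.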
- Parameters:
source – Source image tensor to encode and transmit, shape [B, C, H, W]. Typically RGB images with values normalized to [0,1].
side_info – Correlated side information available at the decoder, shape [B, C, H, W]. This could be a previous frame in a video, a low-resolution version, or other correlated information that helps in reconstruction.
*args – Additional positional arguments passed to internal components.
**kwargs – Additional keyword arguments passed to internal components. Must include 'csi' (torch.Tensor), the channel state information tensor of shape [B, 1, 1, 1]. It contains the signal-to-noise ratio (SNR) or other channel quality indicators that allow the model to adapt to current channel conditions.
- Returns:
The final reconstructed image tensor of shape [B, C, H, W].
- Return type: Tensor
- Raises:
ValueError – If side_info or csi is None, as these are required parameters.
Note
CSI values are typically provided in dB and should be normalized to an appropriate range as expected by the model’s training configuration.
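As a rough illustration of the note above, the following sketch converts an SNR in dB to a linear ratio and rescales it to [0, 1]. Both helper names are hypothetical, and the default range of -5 dB to 5 dB is an arbitrary placeholder; the range that applies in practice is whatever the model was trained with.

```python
def snr_db_to_linear(snr_db):
    """Convert an SNR in decibels to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)


def normalize_snr_db(snr_db, low_db=-5.0, high_db=5.0):
    """Map an SNR in dB linearly from [low_db, high_db] to [0, 1], clipping outside.

    The default bounds are placeholders, not values taken from the library.
    """
    if high_db <= low_db:
        raise ValueError("high_db must be greater than low_db")
    clipped = min(max(snr_db, low_db), high_db)
    return (clipped - low_db) / (high_db - low_db)


print(snr_db_to_linear(10.0))
print(normalize_snr_db(0.0))
```

Whichever scaling is used, it has to match between training and inference; feeding raw dB values to a model trained on normalized CSI (or the reverse) changes what the conditioning layers see.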