kaira.losses.audio.MelSpectrogramLoss

Inheritance diagram for MelSpectrogramLoss

class kaira.losses.audio.MelSpectrogramLoss(sample_rate=22050, n_fft=1024, hop_length=256, n_mels=80, f_min=0.0, f_max=8000.0, log_mel=True)

Bases: BaseLoss

Mel-Spectrogram Loss Module.

This module calculates the loss between mel-spectrograms of input and target audio.

Methods

  • __init__ – Initialize the MelSpectrogramLoss module.

  • forward – Forward pass through the MelSpectrogramLoss module.

__init__(sample_rate=22050, n_fft=1024, hop_length=256, n_mels=80, f_min=0.0, f_max=8000.0, log_mel=True)

Initialize the MelSpectrogramLoss module.

Parameters:
  • sample_rate (int) – Audio sample rate in Hz. Default is 22050.

  • n_fft (int) – FFT size in samples. Default is 1024.

  • hop_length (int) – Hop size between successive frames, in samples. Default is 256.

  • n_mels (int) – Number of mel bands. Default is 80.

  • f_min (float) – Minimum frequency in Hz. Default is 0.0.

  • f_max (float) – Maximum frequency in Hz. Default is 8000.0.

  • log_mel (bool) – Whether to use a log-mel spectrogram. Default is True.
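To get a feel for what the default values imply, the sketch below derives the time-frequency grid from them using only the standard library. It assumes the HTK-style mel formula and evenly spaced triangular filter edges; the library may use a different mel variant, so treat the band frequencies as illustrative. The helper names `hz_to_mel` and `mel_to_hz` are hypothetical and not part of kaira.

```python
import math

# Defaults copied from the signature above.
sample_rate, n_fft, hop_length = 22050, 1024, 256
n_mels, f_min, f_max = 80, 0.0, 8000.0


def hz_to_mel(f):
    # HTK-style mel formula (an assumption; the library may use another variant).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


nyquist = sample_rate / 2            # highest representable frequency in Hz
n_bins = n_fft // 2 + 1              # number of one-sided FFT bins
bin_width_hz = sample_rate / n_fft   # spacing between adjacent FFT bins
hop_ms = 1000.0 * hop_length / sample_rate  # time step between frames

# n_mels triangular filters need n_mels + 2 edge points, evenly spaced in mel.
lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
edges_hz = [mel_to_hz(lo + (hi - lo) * i / (n_mels + 1)) for i in range(n_mels + 2)]
centers_hz = edges_hz[1:-1]

print(f"Nyquist: {nyquist} Hz, f_max within Nyquist: {f_max <= nyquist}")
print(f"{n_bins} FFT bins, {bin_width_hz:.2f} Hz apart; hop = {hop_ms:.2f} ms")
print(f"first/last band centre: {centers_hz[0]:.1f} Hz / {centers_hz[-1]:.1f} Hz")
```

If sample_rate is lowered, it is worth checking that f_max still lies at or below sample_rate / 2, since frequencies above the Nyquist limit cannot be represented.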

forward(x: Tensor, target: Tensor) → Tensor

Compute the loss between the mel-spectrograms of x and target.

Parameters:
  • x (Tensor) – Input audio.

  • target (Tensor) – Target audio.

Returns:

The mel-spectrogram loss.

Return type:

torch.Tensor
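The page does not say which distance is applied to the two mel-spectrograms. As a rough illustration only, the sketch below assumes a mean absolute error (L1) and a small stabilising constant `eps` inside the logarithm; both are assumptions, and the function `log_mel_l1` is hypothetical rather than part of kaira. It operates on plain nested lists (frames of mel-band energies) so that it runs without any dependencies.

```python
import math


def log_mel_l1(mel_x, mel_target, log_mel=True, eps=1e-5):
    """Mean absolute difference between two mel-spectrograms.

    Each argument is a list of frames; each frame is a list of
    non-negative mel-band energies. The L1 distance and eps are
    assumptions made for illustration, not the library's documented choice.
    """
    total, count = 0.0, 0
    for frame_x, frame_t in zip(mel_x, mel_target):
        for a, b in zip(frame_x, frame_t):
            if log_mel:
                # Log compression reduces the dominance of high-energy bands.
                a, b = math.log(a + eps), math.log(b + eps)
            total += abs(a - b)
            count += 1
    return total / count


# Toy data: two frames with three mel bands each.
reference = [[1.0, 0.5, 0.1], [0.8, 0.4, 0.2]]
estimate = [[0.9, 0.6, 0.1], [0.8, 0.3, 0.25]]

print(log_mel_l1(estimate, reference))                  # log-mel comparison
print(log_mel_l1(estimate, reference, log_mel=False))   # linear comparison
```

With log_mel enabled, a given relative error contributes about the same amount in a quiet band as in a loud one, which is one common reason for comparing spectrograms in the log domain.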