Spectrogram

class paddle.audio.features.Spectrogram(n_fft: int = 512, hop_length: Optional[int] = 512, win_length: Optional[int] = None, window: str = 'hann', power: float = 1.0, center: bool = True, pad_mode: str = 'reflect', dtype: str = 'float32')

Compute the spectrogram of the given signals, typically audio waveforms. The spectrogram is defined as the complex norm (magnitude) of the short-time Fourier transform (STFT), raised to the exponent power.

Parameters
  • n_fft (int, optional) – The number of frequency components of the discrete Fourier transform. Defaults to 512.

  • hop_length (Optional[int], optional) – The hop length of the short-time FFT. If None, it is set to win_length//4. Defaults to 512.

  • win_length (Optional[int], optional) – The window length of the short-time FFT. If None, it is set to n_fft. Defaults to None.

  • window (str, optional) – The window function applied to the signal before the Fourier transform. Supported window functions: ‘hamming’, ‘hann’, ‘kaiser’, ‘gaussian’, ‘exponential’, ‘triang’, ‘bohman’, ‘blackman’, ‘cosine’, ‘tukey’, ‘taylor’. Defaults to ‘hann’.

  • power (float, optional) – Exponent for the magnitude spectrogram. Defaults to 1.0.

  • center (bool, optional) – Whether to pad x so that the t-th frame is centered at sample \(t \times hop\_length\). Defaults to True.

  • pad_mode (str, optional) – The padding mode applied to x when center is True. Defaults to ‘reflect’.

  • dtype (str, optional) – Data type of input and window. Defaults to ‘float32’.
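
To make the interplay of these parameters concrete, here is a minimal NumPy sketch of a magnitude spectrogram. It is not Paddle's implementation: the function name spectrogram_sketch is hypothetical, only a Hann window is handled, and the window is assumed to be periodic and zero-padded symmetrically to n_fft when win_length is smaller than n_fft.

```python
import numpy as np


def spectrogram_sketch(x, n_fft=512, hop_length=512, win_length=None,
                       power=1.0, center=True, pad_mode="reflect"):
    """Illustrative |STFT(x)| ** power for waveforms of shape (N, T)."""
    x = np.asarray(x, dtype=np.float32)
    win_length = n_fft if win_length is None else win_length
    hop_length = win_length // 4 if hop_length is None else hop_length

    # Periodic Hann window, zero-padded symmetrically to n_fft (assumption).
    n = np.arange(win_length)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / win_length)
    left = (n_fft - win_length) // 2
    window = np.pad(window, (left, n_fft - win_length - left))

    if center:
        # Pad both ends so that frame t is centered at sample t * hop_length.
        x = np.pad(x, ((0, 0), (n_fft // 2, n_fft // 2)), mode=pad_mode)

    # Slice overlapping frames: shape (N, num_frames, n_fft).
    num_frames = 1 + (x.shape[-1] - n_fft) // hop_length
    starts = hop_length * np.arange(num_frames)
    frames = np.stack([x[:, s:s + n_fft] for s in starts], axis=1)

    # One-sided FFT of each windowed frame: (N, num_frames, n_fft//2 + 1).
    stft = np.fft.rfft(frames * window, n=n_fft, axis=-1)

    # Magnitude raised to `power`, arranged as (N, n_fft//2 + 1, num_frames).
    return (np.abs(stft) ** power).transpose(0, 2, 1)
```

For a waveform of shape (1, 8000) and the default arguments, this sketch returns an array of shape (1, 257, 16). Passing power=2.0 yields the element-wise square of the power=1.0 result.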

Returns

Layer. An instance of Spectrogram.

Examples

>>> import paddle
>>> from paddle.audio.features import Spectrogram

>>> sample_rate = 16000
>>> wav_duration = 0.5
>>> num_channels = 1
>>> num_frames = int(sample_rate * wav_duration)
>>> wav_data = paddle.linspace(-1.0, 1.0, num_frames) * 0.1
>>> waveform = wav_data.tile([num_channels, 1])

>>> feature_extractor = Spectrogram(n_fft=512, window = 'hann', power = 1.0)
>>> feats = feature_extractor(waveform)

forward(x: paddle.Tensor) → paddle.Tensor

Compute the spectrogram of the input waveforms.

Parameters

x (Tensor) – Tensor of waveforms with shape (N, T)

Returns

Spectrograms with shape (N, n_fft//2 + 1, num_frames).

Return type

Tensor
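
The output shape can be estimated ahead of time from the input length. The helper below is a sketch under the usual STFT framing convention (with center=True, n_fft//2 samples are padded on each side); the name spectrogram_shape is hypothetical and not part of paddle.audio.

```python
def spectrogram_shape(batch, num_samples, n_fft=512, hop_length=512, center=True):
    """Expected (N, n_fft//2 + 1, num_frames) under standard STFT framing."""
    # Centering pads n_fft//2 samples on each side of the waveform.
    padded = num_samples + 2 * (n_fft // 2) if center else num_samples
    # One frame per hop, counting only frames that fit entirely in the signal.
    num_frames = 1 + (padded - n_fft) // hop_length
    return (batch, n_fft // 2 + 1, num_frames)


# The 0.5 s, 16 kHz mono waveform from the example has 8000 samples.
print(spectrogram_shape(1, 8000))  # (1, 257, 16)
```

With center=False the same input gives 15 frames instead of 16, because no padding is added at the edges.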