audio_to_mel

Convert Audio to Mel Spectrogram

Description

Main preprocessing function that converts audio to the mel spectrogram format expected by Whisper.

Usage

audio_to_mel(file, n_mels = 80L, device = "auto", dtype = "auto")

Arguments

file: Path to audio file, or numeric vector of audio samples
n_mels: Number of mel bins (80 for most models, 128 for large-v3)
device: torch device for output tensor
dtype: torch dtype for output tensor
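
A minimal sketch of a call, assuming the package that provides audio_to_mel is already attached. The helper n_mels_for_model and the path "speech.wav" are hypothetical and used only for illustration; the helper encodes nothing beyond the n_mels rule in the Arguments list above.

```r
# Hypothetical helper (not part of the package): choose n_mels from a model
# name using the rule in the Arguments list. Exact match only; other model
# variants are not covered by this page.
n_mels_for_model <- function(model) {
  if (identical(model, "large-v3")) 128L else 80L
}

n_mels <- n_mels_for_model("large-v3")

# "speech.wav" is a placeholder path. The call is guarded so that the snippet
# still runs when the package or the file is not available.
if (exists("audio_to_mel") && file.exists("speech.wav")) {
  mel <- audio_to_mel("speech.wav", n_mels = n_mels)
  print(mel$shape)
}
```

Leaving device and dtype at "auto" keeps the sketch independent of the hardware it runs on.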

Value

A torch tensor of shape (1, n_mels, 3000) for 30 s of audio.
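
The 3000 in the shape can be sanity-checked with a little arithmetic. The sketch below assumes the constants used by the reference Whisper implementation (a 16 kHz sampling rate and a hop of 160 samples between frames); neither value is stated on this page, so treat both as assumptions rather than as documented behavior of audio_to_mel.

```r
# Assumed constants (from the reference Whisper implementation, not this page).
sample_rate   <- 16000L  # samples per second
hop_length    <- 160L    # samples between consecutive spectrogram frames
chunk_seconds <- 30L     # length of the audio window

n_samples <- sample_rate * chunk_seconds  # samples in a 30 s window
n_frames  <- n_samples %/% hop_length     # frames along the last dimension

# Expected output shape for the default n_mels = 80L.
expected_shape <- c(1L, 80L, n_frames)
print(expected_shape)
```

Under these assumptions each frame covers 10 ms of audio, which is where the 3000 frames for 30 s come from.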