whisper

Native R torch implementation of OpenAI Whisper for speech-to-text transcription.

Last updated: 2026-01-27

Installation

```r
# Install dependencies
install.packages(c("torch", "hfhub", "safetensors", "av", "jsonlite"))

# Install whisper from GitHub
remotes::install_github("cornball-ai/whisper")
```

Quick Start

```r
library(whisper)

# Transcribe the bundled JFK "Ask not" speech (prompts to download model on first use)
jfk <- system.file("audio", "jfk.mp3", package = "whisper")
result <- transcribe(jfk)
result$text
#> "Ask not what your country can do for you, ask what you can do for your country."
```

On first use, you’ll be prompted to download the model:

```
Download 'tiny' model (~151 MB) from HuggingFace? (Yes/no/cancel)
```

Model Management

```r
# Download a model explicitly
download_whisper_model("tiny")

# List available models
list_whisper_models()
#> [1] "tiny" "base" "small" "medium" "large-v3"

# Check which models are downloaded
list_downloaded_models()

# Check if a specific model exists locally
model_exists("tiny")
```
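These functions compose naturally for scripted use: fetch a model up front so later `transcribe()` calls never hit the interactive download prompt. A minimal sketch (the `ensure_model` helper is not part of the package; it assumes `model_exists()` returns `TRUE`/`FALSE` as its name suggests):

```r
# Assumes the whisper package is attached: library(whisper)

# Hypothetical helper: download the model only if it is not cached yet,
# so non-interactive scripts avoid the Yes/no/cancel prompt entirely.
ensure_model <- function(model = "tiny") {
  if (!model_exists(model)) {
    download_whisper_model(model)
  }
  invisible(model)
}
```

Call `ensure_model("base")` once at the top of a batch job, then transcribe freely.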

Usage

```r
# Basic transcription
result <- transcribe("audio.wav")
print(result$text)

# Specify model size
result <- transcribe("audio.wav", model = "small")

# Force CPU (useful if CUDA has issues)
result <- transcribe("audio.wav", device = "cpu")

# Non-English audio (specify language for better accuracy)
allende <- system.file("audio", "allende.mp3", package = "whisper")
result <- transcribe(allende, language = "es")

# Translate to English (quality is model-dependent; larger models work better)
result <- transcribe(allende, task = "translate", language = "es", model = "small")
```
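Since `transcribe()` takes one file at a time, batch jobs are a matter of plain R iteration. A sketch of a directory-level wrapper (`transcribe_dir` is hypothetical, built only on `transcribe()` and base R):

```r
# Assumes the whisper package is attached: library(whisper)

# Hypothetical helper: transcribe every .wav/.mp3 file in a directory
# and collect the results in a data frame.
transcribe_dir <- function(dir, model = "tiny") {
  files <- list.files(dir, pattern = "\\.(wav|mp3)$", full.names = TRUE)
  texts <- vapply(files,
                  function(f) transcribe(f, model = model)$text,
                  character(1))
  data.frame(file = basename(files), text = texts, row.names = NULL)
}
```

Larger models make this slower per file, so for long batches it can be worth running once with `"tiny"` to triage, then re-running hard files with `"small"` or above.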

Models

| Model    | Parameters | Size   | English WER |
|----------|------------|--------|-------------|
| tiny     | 39M        | 151 MB | ~9%         |
| base     | 74M        | 290 MB | ~7%         |
| small    | 244M       | 967 MB | ~5%         |
| medium   | 769M       | 3.0 GB | ~4%         |
| large-v3 | 1550M      | 6.2 GB | ~3%         |

Models are downloaded from HuggingFace and cached under ~/.cache/huggingface/ unless a different cache directory is configured.
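To relocate the cache (e.g. onto a larger disk), the relevant knob is an environment variable; this sketch assumes the hfhub package honors `HUGGINGFACE_HUB_CACHE`, as the Python hub client does, with `~/.cache/huggingface/hub` as the default:

```r
# Hypothetical helper: resolve the model cache directory, assuming hfhub
# reads HUGGINGFACE_HUB_CACHE and falls back to ~/.cache/huggingface/hub.
whisper_cache_dir <- function() {
  Sys.getenv("HUGGINGFACE_HUB_CACHE",
             unset = path.expand("~/.cache/huggingface/hub"))
}

# Redirect the cache before downloading any models:
# Sys.setenv(HUGGINGFACE_HUB_CACHE = "/data/hf-cache")
```

Set the variable before the first `download_whisper_model()` call so all models land in the new location.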

License

MIT
