Last updated: 2026-01-27
Native R torch implementation of OpenAI Whisper for speech-to-text transcription.
Installation
# Install dependencies
install.packages(c("torch", "hfhub", "safetensors", "av", "jsonlite"))

# Install whisper from GitHub (requires the remotes package)
remotes::install_github("cornball-ai/whisper")
Quick Start
library(whisper)

# Transcribe the bundled JFK "Ask not" speech (prompts to download model on first use)
jfk <- system.file("audio", "jfk.mp3", package = "whisper")
result <- transcribe(jfk)
result$text
#> "Ask not what your country can do for you, ask what you can do for your country."
On first use, you’ll be prompted to download the model:
Download 'tiny' model (~151 MB) from HuggingFace? (Yes/no/cancel)
Model Management
# Download a model explicitly
download_whisper_model("tiny")

# List available models
list_whisper_models()
#> [1] "tiny" "base" "small" "medium" "large-v3"

# Check which models are downloaded
list_downloaded_models()

# Check if a specific model exists locally
model_exists("tiny")
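The check-then-download pattern suggested by these functions can be wrapped in a small helper. This is a minimal sketch, not part of the package: `ensure_model()` is a hypothetical name, and it assumes that `model_exists()` returns a single `TRUE`/`FALSE` and that `download_whisper_model()` takes the model name as its first argument, as in the calls above. The two functions are passed in as arguments so the helper can be tried with stand-ins before any real download happens.

```r
# Hypothetical helper (not part of whisper): download a model only if it is
# not already cached. Assumes model_exists() returns TRUE or FALSE.
ensure_model <- function(model = "tiny",
                         exists_fn = whisper::model_exists,
                         download_fn = whisper::download_whisper_model) {
  if (isTRUE(exists_fn(model))) {
    message("Model '", model, "' is already available locally.")
    return(invisible(FALSE))  # nothing was downloaded
  }
  download_fn(model)
  invisible(TRUE)             # a download was triggered
}
```

Calling `ensure_model("small")` ahead of `transcribe(..., model = "small")` may help keep scripts from stopping at the first-use prompt, though whether `download_whisper_model()` itself asks for confirmation is not stated here.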
Usage
# Basic transcription
result <- transcribe("audio.wav")
print(result$text)

# Specify model size
result <- transcribe("audio.wav", model = "small")

# Force CPU (useful if CUDA has issues)
result <- transcribe("audio.wav", device = "cpu")

# Non-English audio (specify language for better accuracy)
allende <- system.file("audio", "allende.mp3", package = "whisper")
result <- transcribe(allende, language = "es")

# Translate to English (quality is model-dependent; larger models work better)
result <- transcribe(allende, task = "translate", language = "es", model = "small")
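The single-file calls above extend naturally to a batch of files. The following is a sketch under stated assumptions rather than a package feature: `transcribe_files()` is a hypothetical helper, it relies only on the `$text` element shown above, and it forwards extra arguments such as `model` or `device` unchanged to `transcribe()`. A file that fails is recorded as `NA` so one bad input does not abort the batch.

```r
# Hypothetical helper (not part of whisper): transcribe several files and
# collect the text. Assumes each result has a length-one $text element.
transcribe_files <- function(files, ..., transcribe_fn = whisper::transcribe) {
  texts <- vapply(
    files,
    function(f) {
      # Record NA instead of stopping the whole batch when one file fails
      tryCatch(transcribe_fn(f, ...)$text, error = function(e) NA_character_)
    },
    character(1),
    USE.NAMES = FALSE
  )
  data.frame(file = files, text = texts, stringsAsFactors = FALSE)
}

# Example call (file names are placeholders):
# transcripts <- transcribe_files(c("a.wav", "b.wav"), model = "small", device = "cpu")
```

Whether the model is reloaded on every call to `transcribe()` is not documented in this section, so long batches may be slower than the per-file timing suggests.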
Models
| Model | Parameters | Size | English WER |
|---|---|---|---|
| tiny | 39M | 151 MB | ~9% |
| base | 74M | 290 MB | ~7% |
| small | 244M | 967 MB | ~5% |
| medium | 769M | 3.0 GB | ~4% |
| large-v3 | 1550M | 6.2 GB | ~3% |
Models are downloaded from HuggingFace and cached in ~/.cache/huggingface/ unless otherwise specified.
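To choose a model against a download budget, the table can be encoded as a data frame. This is an illustrative sketch in base R only: `whisper_models` and `pick_model()` are hypothetical names, the figures are copied from the table above, and the GB entries are converted assuming 1 GB = 1000 MB.

```r
# Sizes and approximate English WER copied from the table above.
# Assumption: 1 GB = 1000 MB for the medium and large-v3 rows.
whisper_models <- data.frame(
  model   = c("tiny", "base", "small", "medium", "large-v3"),
  size_mb = c(151, 290, 967, 3000, 6200),
  wer_pct = c(9, 7, 5, 4, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: largest model that fits within a download budget (MB).
# Returns NA when even the smallest model is too large.
pick_model <- function(max_mb, models = whisper_models) {
  fits <- models[models$size_mb <= max_mb, , drop = FALSE]
  if (nrow(fits) == 0) return(NA_character_)
  fits$model[which.max(fits$size_mb)]
}

pick_model(1000)
#> [1] "small"
```

The returned name can be passed straight to the `model` argument of `transcribe()`.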
License
MIT
Functions
audio_to_mel
download_whisper_model
list_downloaded_models
list_whisper_models
load_audio
load_whisper_model
model_exists
transcribe
whisper_config
whisper_device
whisper_dtype
whisper_tokenizer