Last updated: 2026-03-03
A code formatter for R, built on R’s parser. Formatting logic is implemented in both base R and C++ (via Rcpp). The C++ path runs automatically and is ~85x faster; the R implementation serves as a readable reference and fallback.
rformat uses parse() and getParseData() to make formatting decisions
from the token stream and expression structure, not from regex or
indentation heuristics. All transforms operate on an enriched token
vector (C++) or data frame (R).
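To see the kind of data these decisions rest on, the token stream can be inspected with base R alone. The snippet below is a minimal sketch that uses only parse() and utils::getParseData(); it is not rformat's internal code, and the variable names are illustrative.

```r
# Parse a snippet, keeping source references so that parse data is recorded
expr <- parse(text = "x<-1+2", keep.source = TRUE)

# One row per node of the parse tree; terminal rows are the actual tokens
pd <- utils::getParseData(expr)
tokens <- pd[pd$terminal, c("line1", "col1", "token", "text")]

# Each token carries its position, its parser-assigned type, and its text
print(tokens, row.names = FALSE)
```

The assignment arrow shows up as a LEFT_ASSIGN token regardless of the spacing around it, which is why spacing can be normalized without guessing from the raw text.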
Installation
remotes::install_github("cornball-ai/rformat")
Usage
library(rformat)

# Format a string
rformat("x<-1+2")
#> x <- 1 + 2

# Format a file (overwrites in place)
rformat_file("script.R")

# Format all R files in a directory
rformat_dir("R/")

# Dry run
rformat_file("script.R", dry_run = TRUE)
Example
rformat("f=function(x,y){
if(x>0)
y=mean(x,na.rm=TRUE)
else y=NA
}")

f <- function(x, y) {
    if (x > 0)
        y <- mean(x, na.rm = TRUE)
    else y <- NA
}
What it does
- Normalizes spacing around operators, commas, and keywords
- Indents by syntactic nesting depth
- Converts = to <- for assignment (where the parser confirms EQ_ASSIGN, not EQ_SUB)
- Wraps long lines at logical operators and commas
- Wraps long function signatures with continuation indent
- Collapses short multi-line calls back to one line
- Preserves comments and strings exactly
- Removes trailing whitespace and excess blank lines
- Optionally adds braces to bare control-flow bodies
- Optionally expands inline if-else to multi-line
Options
| Parameter | Default | Description |
|---|---|---|
| indent | 4L | Spaces per level, or a string like "\t" |
| line_limit | 80L | Line width before wrapping |
| wrap | "paren" | "paren" aligns to (, "fixed" uses 8-space continuation |
| brace_style | "kr" | "kr": ){ same line. "allman": { on its own line |
| control_braces | FALSE | Add braces to bare control-flow bodies |
| expand_if | FALSE | Expand all inline if-else to multi-line |
| else_same_line | TRUE | Repair top-level }\nelse parse error |
| function_space | FALSE | Space before ( in function(x) |
| join_else | TRUE | Move else to same line as } |
Defaults are derived from analysis of the 30 packages that ship with R.
Correctness
Parse preservation. If input parses, output parses. Token types and ordering are preserved. Strings and comments are never modified.
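One way to spot-check the claim about strings and comments is to compare those tokens before and after formatting. This is a rough sketch in base R, not part of rformat: literal_tokens() is a hypothetical helper, and the "after" string is written by hand as a stand-in for formatter output.

```r
# Collect the text of all string and comment tokens in a piece of code
literal_tokens <- function(code) {
    pd <- utils::getParseData(parse(text = code, keep.source = TRUE))
    pd$text[pd$terminal & pd$token %in% c("STR_CONST", "COMMENT")]
}

before <- "x='a  b'   # keep   me"
after  <- "x <- 'a  b' # keep   me"  # hand-written stand-in for formatted output

# TRUE when every string and comment survived byte for byte, in order
identical(literal_tokens(before), literal_tokens(after))
```

The inner double spaces in the string and the comment are untouched in both versions, so the comparison holds even though the surrounding spacing changed.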
Semantic preservation. Only whitespace and style tokens change.
Assignment conversion and brace insertion are guided by parser token
types (EQ_ASSIGN vs EQ_SUB, structural body detection), so they
never change meaning.
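The EQ_ASSIGN versus EQ_SUB distinction can be observed directly in R's parse data. The following sketch uses only base R and is meant to illustrate the token types, not to reproduce rformat's conversion logic.

```r
code <- "y = mean(x, na.rm = TRUE)"
pd <- utils::getParseData(parse(text = code, keep.source = TRUE))

# Both "=" characters are terminals, but the parser labels them differently:
# the first is an assignment, the second names a call argument
eq <- pd[pd$terminal & pd$text == "=", c("col1", "token", "text")]
print(eq, row.names = FALSE)
```

Only the first = (token EQ_ASSIGN) is a candidate for rewriting to <-; the second (token EQ_SUB) binds na.rm inside the call and has to stay as it is.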
Idempotency. rformat(rformat(x)) == rformat(x). Verified across
126 CRAN and base R packages with randomized parameter combinations
(indent, wrap, brace_style, control_braces, line_limit, etc.):
0 failures, 0 idempotency exceptions.
Stress testing
The stress test suite
formats every .R file from 126 packages (base, recommended, and
popular CRAN), checking that formatted code parses and that formatting
twice produces identical output. Tests run with randomized style
parameters to exercise all option combinations.
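The two checks described above can be sketched as a small harness. Everything here is illustrative: check_formatter(), random_options(), and toy_formatter() are hypothetical names, the option values are drawn from the Options table, and the toy formatter stands in for rformat so the snippet runs without the package. Passing rformat itself assumes it accepts the code as its first argument and the options as named arguments.

```r
# Check one piece of code against a formatter: the output must parse, and
# formatting the output again must change nothing.
check_formatter <- function(code, formatter, ...) {
    once  <- formatter(code, ...)
    twice <- formatter(once, ...)
    parses <- !inherits(try(parse(text = once), silent = TRUE), "try-error")
    list(parses = parses, idempotent = identical(once, twice))
}

# Draw a random option set (names and values taken from the Options table)
random_options <- function() {
    list(
        indent = sample(c(2L, 4L), 1),
        line_limit = sample(c(60L, 80L, 100L), 1),
        wrap = sample(c("paren", "fixed"), 1),
        brace_style = sample(c("kr", "allman"), 1),
        control_braces = sample(c(TRUE, FALSE), 1)
    )
}

# Stand-in formatter: strips trailing whitespace and ignores the style options
toy_formatter <- function(code, ...) gsub("[ \t]+(\n|$)", "\\1", code)

result <- do.call(
    check_formatter,
    c(list("x <- 1   \ny <- 2  ", toy_formatter), random_options())
)
str(result)
```

A full run would loop this over every .R file in a package and collect any case where either field is FALSE.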
Architecture
The formatting pipeline has two implementations that produce identical output:
- R (R/ast_*.R): Pure base R reference implementation. No compilation needed; readable source for understanding the algorithms.
- C++ (src/*.cpp): Rcpp fast path. Same algorithms, ~85x faster on typical files. Used automatically.
Both operate on the same token stream from parse() + getParseData():
enrich terminals with nesting depth, run transforms (collapse, wrap,
braces, etc.), then serialize back to text.
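As a rough illustration of the "enrich terminals with nesting depth" step, the sketch below annotates each terminal token with a bracket depth using base R only. It is a simplified assumption about what such enrichment could look like, not rformat's actual implementation, and the column name depth is made up for the example.

```r
# Terminal tokens for a small expression
pd <- utils::getParseData(parse(text = "f(g(x), {y})", keep.source = TRUE))
tok <- pd[pd$terminal, c("token", "text")]

# +1 for each opening bracket, -1 for each closing one ("[[" opens two levels)
delta <- ifelse(tok$text %in% c("(", "{", "["), 1L,
         ifelse(tok$text == "[[", 2L,
         ifelse(tok$text %in% c(")", "}", "]"), -1L, 0L)))

# Depth of a token = number of brackets still open around it; an opener is
# counted at the depth outside itself, and a closer lands on the same depth
running <- cumsum(delta)
tok$depth <- ifelse(delta > 0L, running - delta, running)

print(tok, row.names = FALSE)
```

With a depth attached to every token, indentation becomes a lookup (depth times the indent width) instead of a guess based on the existing whitespace.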
License
GPL-3
Reference
See Function Reference for complete API documentation.