How to Use modify() in R
Introduction
The modify() function transforms each element of a list, vector, or data frame while keeping the same structure and type as the input. It’s the shape-preserving cousin of map().
The key difference: map() always returns a list. modify() returns the same type you gave it — a data frame stays a data frame, an atomic vector stays an atomic vector.
When to use modify():
- Transform every column of a data frame in place
- Apply a function across an atomic vector and keep it atomic
- Update only some elements (with modify_if() or modify_at())
- Avoid the noise of map() |> as_tibble() round-trips
modify() vs map()
| Function | Always Returns | Use When |
|---|---|---|
| map() | List | You want a list back |
| modify() | Same type as input | You want to preserve structure |
A data frame in, a data frame out. A numeric vector in, a numeric vector out.
Getting Started
library(tidyverse)
library(palmerpenguins)
Basic modify(): Transform a Vector
When given an atomic vector, modify() returns an atomic vector of the same type:
x <- c(1, 2, 3, 4, 5)
modify(x, \(i) i * 10)
# 10 20 30 40 50 (numeric)
Compare with map(), which would return a list of length 5. modify() keeps it as a plain numeric vector.
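To make the contrast concrete, here is a minimal sketch that runs both functions on the same vector and inspects what comes back. It assumes only that purrr is installed; the names as_list and as_vector are illustrative.

```r
library(purrr)

x <- c(1, 2, 3, 4, 5)

as_list   <- map(x, \(i) i * 10)     # list of five length-1 doubles
as_vector <- modify(x, \(i) i * 10)  # plain double vector

is.list(as_list)    # TRUE
is.list(as_vector)  # FALSE
typeof(as_vector)   # "double"
```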
modify() on a Data Frame
This is where modify() really earns its keep. A data frame is technically a list of columns, so modify() applies the function to each column and returns a data frame.
Square every column
df <- tibble(a = 1:3, b = 4:6, c = 7:9)
modify(df, \(col) col^2)
| a | b | c |
|---|---|---|
| 1 | 16 | 49 |
| 4 | 25 | 64 |
| 9 | 36 | 81 |
The output is still a tibble. With map() you’d have to call as_tibble() afterwards.
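The round-trip that modify() saves can be sketched as follows. This is an illustration, not the only way to do it; via_map and via_modify are names made up for the comparison.

```r
library(purrr)
library(tibble)

df <- tibble(a = 1:3, b = 4:6, c = 7:9)

via_map    <- map(df, \(col) col^2)     # named list of three vectors
via_modify <- modify(df, \(col) col^2)  # tibble with three columns

is.data.frame(via_map)  # FALSE
is_tibble(via_modify)   # TRUE

# Rebuilding a tibble from the map() result should give the same data
all.equal(as_tibble(via_map), via_modify)
```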
Round all numeric columns
penguins |>
  drop_na() |>
  select(where(is.numeric)) |>
  modify(round)
Every column gets rounded; the data frame structure is preserved.
modify_if(): Conditional Transformation
Often you only want to transform some columns. modify_if() takes a predicate and only modifies elements where it returns TRUE.
Convert character columns to factors
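penguins |>
  modify_if(is.character, as.factor)
Numeric columns are left alone; only character columns are touched, and the data frame stays a data frame. Note that in penguins itself, species, island, and sex are already stored as factors, so this particular call returns the data unchanged; the pattern pays off on data frames that do contain character columns.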
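A minimal sketch on data that does have character columns may make the effect easier to see. The pets tibble is hypothetical and exists only for this illustration.

```r
library(purrr)
library(tibble)

# Hypothetical data: two character columns and one numeric column
pets <- tibble(
  name   = c("Rex", "Tom", "Ivy"),
  kind   = c("dog", "cat", "cat"),
  weight = c(30.5, 4.2, 3.9)
)

converted <- modify_if(pets, is.character, as.factor)

# First class of each column: name and kind become factors,
# weight stays numeric
map_chr(converted, \(col) class(col)[1])
```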
Scale numeric columns
penguins |>
  drop_na() |>
  modify_if(is.numeric, \(x) (x - mean(x)) / sd(x))
This standardizes every numeric column (including year) without disturbing the species, island, or sex columns.
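The drop_na() step matters here because mean() and sd() return NA when a column contains missing values. As a quick check of the standardization itself, the sketch below uses a small hypothetical tibble (scores) and confirms that each numeric column ends up with mean near 0 and standard deviation 1.

```r
library(purrr)
library(tibble)

# Hypothetical data with one character column and two numeric columns
scores <- tibble(
  group  = c("a", "b", "c", "d"),
  height = c(150, 160, 170, 180),
  weight = c(50, 60, 70, 80)
)

scaled <- modify_if(scores, is.numeric, \(x) (x - mean(x)) / sd(x))

# Each standardized column should have mean ~0 and sd 1
map_dbl(scaled[c("height", "weight")], mean)
map_dbl(scaled[c("height", "weight")], sd)

# The character column is passed through as-is
scaled$group
```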
modify_at(): Modify Specific Positions or Names
When you know exactly which columns to change, use modify_at() and pass names or positions.
By name
penguins |>
  modify_at(c("body_mass_g", "flipper_length_mm"), \(x) x / 10)
Only those two columns are divided by 10. Everything else passes through untouched.
By position
modify_at(df, c(1, 3), \(x) x * 100)
The first and third columns get multiplied by 100; the second is unchanged.
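Selecting by position and selecting by name are interchangeable when they point at the same columns. A small self-contained sketch, with by_position and by_name as illustrative names:

```r
library(purrr)
library(tibble)

df <- tibble(a = 1:3, b = 4:6, c = 7:9)

by_position <- modify_at(df, c(1, 3), \(x) x * 100)
by_name     <- modify_at(df, c("a", "c"), \(x) x * 100)

by_position$a  # 100 200 300
by_position$b  # 4 5 6 (unchanged)

# Both selections should produce the same tibble
all.equal(by_position, by_name)
```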
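With a selection function
Besides names and positions, modify_at() accepts a function that receives the vector of element names and returns the ones to select (purrr 1.0.0 or later), so you can match column patterns. Bare tidyselect helpers such as starts_with() cannot be passed directly because they only work inside a selecting function, and wrapping them in vars() is deprecated in recent purrr releases.
penguins |>
  modify_at(\(nms) str_subset(nms, "^bill_"), \(x) round(x, 1))
modify_in(): Update a Nested Element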
For deeply nested lists, modify_in() updates a value at a specific path:
config <- list(
db = list(host = "localhost", port = 5432),
cache = list(ttl = 60)
)
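modify_in(config, list("db", "port"), \(p) p + 1)
# the returned list has db$port == 5433; everything else is identical
This is much cleaner than reassigning config$db$port <- config$db$port + 1, especially in functional pipelines. modify_in() returns a modified copy rather than changing config itself, so assign the result if you want to keep it.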
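A short sketch of what the call returns and what it leaves alone; updated is an illustrative name for the result.

```r
library(purrr)

config <- list(
  db    = list(host = "localhost", port = 5432),
  cache = list(ttl = 60)
)

updated <- modify_in(config, list("db", "port"), \(p) p + 1)

updated$db$port  # 5433
config$db$port   # 5432, the original list is not modified

# Branches off the path are carried over unchanged
identical(updated$cache, config$cache)  # TRUE
```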
Practical Example: Clean a Messy Data Frame
A common cleanup task: trim whitespace from all character columns, round all numerics. modify_if() handles both in two short lines.
messy <- tibble(
  name = c(" Alice ", " Bob"),
  age = c(30.456, 25.789),
  city = c("NYC ", " LA")
)
clean <- messy |>
  modify_if(is.character, str_trim) |>
  modify_if(is.numeric, round)
clean
| name | age | city |
|---|---|---|
| Alice | 30 | NYC |
| Bob | 26 | LA |
The result is still a tibble with the same columns — no as_tibble() reconstruction needed.
modify() vs mutate(): Which to Use?
Both transform columns, so when do you reach for which?
| Use mutate() when… | Use modify() when… |
|---|---|
| You name a few columns | You apply the same function to many columns |
| You compute new columns | You’re transforming in place |
| Each column gets a different formula | Every (or every matching) column gets the same logic |
For most data-frame work mutate(across(...)) is idiomatic. modify() shines when you’re working with a list-like structure or want a quick “apply to every element” without thinking about scoping.
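For comparison, the cleanup example can be written both ways. The sketch below is one possible across() formulation, not the only one; with_modify and with_across are illustrative names.

```r
library(dplyr)
library(purrr)
library(stringr)
library(tibble)

messy <- tibble(
  name = c("  Alice ", " Bob"),
  age  = c(30.456, 25.789),
  city = c("NYC ", " LA")
)

# purrr style: one modify_if() per rule
with_modify <- messy |>
  modify_if(is.character, str_trim) |>
  modify_if(is.numeric, round)

# dplyr style: one across() per rule inside a single mutate()
with_across <- messy |>
  mutate(
    across(where(is.character), str_trim),
    across(where(is.numeric), round)
  )

# Both approaches should yield the same cleaned tibble
all.equal(with_modify, with_across)
```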
Common Mistakes
1. Expecting modify() to change the type
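modify() preserves the type of the input container. On an atomic vector, that means the function must return values of the same type; otherwise you’ll get an error:
x <- c(1, 2, 3)
# Error - can't put characters into a numeric vector
# modify(x, as.character)
# Use map_chr() instead
map_chr(x, as.character)
Lists and data frames are more permissive: their elements or columns may change type (as in the character-to-factor example), because the container itself is still a list or data frame.
2. Confusing modify() with mutate()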
modify() operates on each column of a data frame independently. It doesn’t see other columns. If you need cross-column logic, use mutate().
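As an illustration of cross-column logic, the sketch below computes a value that needs two columns at once. The people tibble and the bmi column are hypothetical.

```r
library(dplyr)
library(tibble)

# Hypothetical data: the new column depends on two existing columns
people <- tibble(
  weight_kg = c(70, 85),
  height_m  = c(1.75, 1.80)
)

# mutate() can refer to several columns in one expression;
# a function passed to modify() only ever receives one column at a time
with_bmi <- people |>
  mutate(bmi = weight_kg / height_m^2)

with_bmi
```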
3. Forgetting modify_if() needs a predicate function
# Wrong - is.numeric() called instead of passed
# modify_if(df, is.numeric(), round)
# Right - pass the function itself
modify_if(df, is.numeric, round)
Summary
| Function | Selects | Use Case |
|---|---|---|
| modify() | All elements | Transform everything in place |
| modify_if() | Predicate is TRUE | Conditional transformation |
| modify_at() | Names or positions | Specific columns |
| modify_in() | Nested path | Deep list updates |
Key points:
- modify() returns the same type as its input: that’s the whole point
- Use modify_if() to transform only matching columns (e.g., numeric ones)
- Use modify_at() with names, positions, or a function that selects names
- Reach for mutate() instead when you need cross-column logic