How to use rename_with() in R
Introduction
The rename_with() function in dplyr renames multiple columns at once by applying a function to their names. It is useful when column names need a consistent transformation, such as converting to lowercase, adding prefixes, or cleaning up messy names, which makes it a good fit for data cleaning workflows that standardize column names.
Getting Started
library(tidyverse)
library(palmerpenguins)
Example 1: Basic Usage
The Problem
Column names in datasets often have inconsistent formatting - some uppercase, some lowercase, or mixed case. We need a way to standardize all column names to follow a consistent naming convention.
Step 1: Examine the original data
Let’s look at the penguins dataset and its column names.
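# Preview the first rows of the penguins data
penguins |>
  head(3)
# List the column names
colnames(penguins)
This shows us the original column names. The penguins names are already lowercase snake_case, so they serve as a clean baseline for the transformations below.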
Step 2: Convert all names to lowercase
We’ll use rename_with() to apply the tolower() function to all column names.
# Convert all column names to lowercase
penguins_lower <- penguins |>
  rename_with(tolower)
colnames(penguins_lower)
Because the penguins names were already lowercase, the output is unchanged here. On a dataset with mixed-case names, the same call would lowercase every column name.
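To see the effect on names that actually contain capitals, here is a minimal sketch. The `survey` data frame and its columns are made up for illustration and are not part of the original example.

```r
library(dplyr)

# Hypothetical data frame with mixed-case column names (illustration only)
survey <- data.frame(
  RespondentID = 1:3,
  AGE = c(34, 51, 28),
  homeCity = c("Oslo", "Lima", "Pune")
)

# rename_with() passes the current names to tolower() and uses the result
survey_lower <- survey |>
  rename_with(tolower)

colnames(survey_lower)
# "respondentid" "age" "homecity"
```

Only the names change; the data in each column is untouched.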
Step 3: Apply function to specific columns
We can target specific columns using selection helpers.
# Convert only the names of numeric columns to uppercase
penguins_selective <- penguins |>
  rename_with(toupper, where(is.numeric))
colnames(penguins_selective)
Only the numeric columns (BILL_LENGTH_MM, BILL_DEPTH_MM, FLIPPER_LENGTH_MM, BODY_MASS_G, YEAR) are now uppercase, while species, island, and sex remain unchanged.
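Selection can also be based on the names themselves rather than on column types. The sketch below uses starts_with() on a made-up `scores` data frame (an assumption for illustration, not data from this tutorial).

```r
library(dplyr)

# Hypothetical data frame (illustration only)
scores <- data.frame(
  id = 1:2,
  test_math = c(90, 75),
  test_art = c(88, 92),
  grade = c("A", "B")
)

# Uppercase only the columns whose names start with "test_"
scores_upper <- scores |>
  rename_with(toupper, starts_with("test_"))

colnames(scores_upper)
# "id" "TEST_MATH" "TEST_ART" "grade"
```

The second argument accepts the same tidyselect syntax as select(), so helpers such as ends_with() or contains() can be swapped in the same way.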
Example 2: Practical Application
The Problem
You’ve received a messy dataset where column names have spaces, inconsistent capitalization, and need a uniform prefix for analysis. This is common when importing data from Excel files or external databases where naming conventions weren’t enforced.
Step 1: Create a messy dataset
Let’s simulate this problem using the mtcars dataset.
# Create dataset with messy column names
messy_cars <- mtcars |>
  rename("Miles Per Gallon" = mpg,
         "CYLINDERS" = cyl,
         "horse.power" = hp)
colnames(messy_cars)
Now we have inconsistent naming with spaces, uppercase, and mixed formats.
Step 2: Clean column names
We’ll create a custom function to clean these names.
# Function to clean column names: lowercase, then replace spaces and dots with underscores
clean_names <- function(x) {
  x |>
    str_to_lower() |>
    str_replace_all(" ", "_") |>
    str_replace_all("\\.", "_")
}
cars_clean <- messy_cars |>
  rename_with(clean_names)
colnames(cars_clean)
The column names are now consistently formatted with underscores and lowercase letters.
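Since rename_with() simply hands the column names to the function as a character vector and expects a character vector of the same length back, a cleaning function can be checked on plain strings before it touches a data frame. A minimal sketch, repeating the same cleaning steps so the block runs on its own:

```r
library(stringr)

# Same cleaning steps as above, repeated here so this block is self-contained
clean_names <- function(x) {
  x |>
    str_to_lower() |>
    str_replace_all(" ", "_") |>
    str_replace_all("\\.", "_")
}

# Try the function on the three messy names directly
clean_names(c("Miles Per Gallon", "CYLINDERS", "horse.power"))
# "miles_per_gallon" "cylinders" "horse_power"
```

Testing on a vector first makes it easier to spot a pattern that misses a case before the names are overwritten.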
Step 3: Add prefixes to specific columns
Let’s add descriptive prefixes to measurement columns.
# Add "measure_" prefix to specific columns
cars_final <- cars_clean |>
  rename_with(~paste0("measure_", .),
              c(miles_per_gallon, horse_power, wt))
colnames(cars_final)
Selected columns now have the “measure_” prefix, making their purpose clear in analysis.
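When the columns to rename are stored as strings, for example in a configuration vector, all_of() can be used for the selection. The sketch below also swaps the formula shorthand for a base R anonymous function (available from R 4.1). The vector `measure_cols` is a hypothetical name introduced for illustration.

```r
library(dplyr)

# Hypothetical character vector of columns to prefix (illustration only)
measure_cols <- c("mpg", "hp", "wt")

# \(x) is base R shorthand for function(x); all_of() selects by string name
cars_prefixed <- mtcars |>
  rename_with(\(x) paste0("measure_", x), all_of(measure_cols))

colnames(cars_prefixed)[1:4]
# "measure_mpg" "cyl" "disp" "measure_hp"
```

all_of() raises an error if any listed column is missing, which helps catch typos in the vector.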
Step 4: Combine multiple transformations
We can chain multiple rename_with() operations for complex transformations.
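# Apply multiple renaming operations
cars_complete <- mtcars |>
  rename_with(tolower) |>
  rename_with(~str_replace_all(., "_", "."),
              contains("a")) |>
  rename_with(~paste0("car_", .),
              c(mpg, cyl, hp))
head(cars_complete, 2)
This demonstrates how to apply different transformations to different column groups in sequence. Note that with mtcars only the final step has a visible effect: its names are already lowercase and contain no underscores, so the first two calls leave them unchanged, and the result differs from mtcars only in the car_mpg, car_cyl, and car_hp columns.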
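For a chain in which every step changes something, here is a minimal sketch on a made-up data frame. The `raw` data and the "env_" prefix are assumptions for illustration only.

```r
library(dplyr)
library(stringr)

# Hypothetical data frame; check.names = FALSE keeps the space in "Sample ID"
raw <- data.frame(
  "Sample ID" = 1:2,
  "Temp.C" = c(21.5, 19.0),
  "Humidity" = c(40, 55),
  check.names = FALSE
)

tidy <- raw |>
  rename_with(str_to_lower) |>                           # "sample id", "temp.c", "humidity"
  rename_with(~ str_replace_all(.x, "[ .]", "_")) |>     # "sample_id", "temp_c", "humidity"
  rename_with(~ paste0("env_", .x), c(temp_c, humidity)) # prefix the two measurements

colnames(tidy)
# "sample_id" "env_temp_c" "env_humidity"
```

Each call sees the names produced by the previous one, which is why the last step can refer to temp_c rather than Temp.C.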
Summary
- rename_with() applies functions to column names, enabling bulk renaming operations
- Use it with built-in functions like tolower() and toupper() for simple transformations
- Combine with selection helpers (where(), contains(), starts_with()) to target specific columns
- Create custom functions for complex naming transformations like cleaning messy imported data
- Chain multiple rename_with() calls to apply different transformations to different column groups