How to Change a Column Name with a Variable Name

dplyr rename()
Learn how to change a column name using a name stored in a variable in this R tutorial, with practical examples and code snippets.
Published July 15, 2023

Introduction

When working with dplyr, you sometimes need to rename columns using names stored in variables rather than names hard-coded in the call. This is particularly useful when writing functions or when column names are generated programmatically. We’ll explore several methods to accomplish this using modern dplyr syntax.

Getting Started

library(tidyverse)
library(palmerpenguins)

Example 1: Basic Usage with rename()

The Problem

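You have a column name stored in a variable and want to rename a column in your dataset. Standard rename() syntax doesn’t work directly with variables: in rename(new_name = old_name), the left-hand side is taken literally, so the column ends up named “new_name” rather than the value stored in the variable.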
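To make the problem concrete, the sketch below tries the obvious call. It assumes the palmerpenguins data used throughout this tutorial; literal_attempt is a name made up for this illustration. Depending on the installed tidyselect version, the call may also emit a deprecation warning about using an external vector in a selection, so it is wrapped in suppressWarnings().

```r
library(dplyr)
library(palmerpenguins)

old_name <- "bill_length_mm"
new_name <- "bill_length"

# The left-hand side is taken literally, so the new column is called
# "new_name", not "bill_length". suppressWarnings() hides a possible
# tidyselect deprecation warning about the external vector old_name.
literal_attempt <- suppressWarnings(
  penguins |>
    select(species, bill_length_mm, bill_depth_mm) |>
    rename(new_name = old_name)
)

names(literal_attempt)
```

The value of new_name is never consulted here, which is why the injection techniques in the following steps are needed.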

Step 1: Create sample data and variables

Let’s set up our data and define the column names we want to work with.

# Load sample data
data <- penguins |> 
  select(species, bill_length_mm, bill_depth_mm) |> 
  slice_head(n = 5)

# Define our variable names
old_name <- "bill_length_mm"
new_name <- "bill_length"

This creates a subset of penguin data with the columns we need for demonstration.

Step 2: Use rename() with !! operator

The !! (bang-bang) operator injects the value of a variable into a dplyr call, so the function sees the stored column name instead of the variable’s own name.

# Rename using variables with !! operator
renamed_data <- data |> 
  rename(!!new_name := !!sym(old_name))

print(renamed_data)

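The sym() function converts the string in old_name to a symbol, and !! injects both the old and new names into the call. The := operator replaces = here because R’s parser does not allow !! on the left-hand side of =.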
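If you would rather avoid !! and sym() altogether, one alternative is to pass rename() a named character vector wrapped in all_of(), where the names are the new column names and the values are the old ones. The following is a minimal sketch of that approach, not a replacement for the method above; lookup and renamed_alt are names chosen for this illustration.

```r
library(dplyr)
library(palmerpenguins)

old_name <- "bill_length_mm"
new_name <- "bill_length"

# Build c(bill_length = "bill_length_mm"): name = new, value = old
lookup <- setNames(old_name, new_name)

renamed_alt <- penguins |>
  select(species, bill_length_mm, bill_depth_mm) |>
  slice_head(n = 5) |>
  rename(all_of(lookup))

names(renamed_alt)
```

all_of() raises an error if the old column does not exist, which is usually the behaviour you want when a name comes from a variable.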

Step 3: Verify the column name change

Let’s check that our renaming worked correctly.

# Check column names before and after
cat("Original columns:", names(data), "\n")
cat("New columns:", names(renamed_data), "\n")

You can see that “bill_length_mm” has been successfully renamed to “bill_length”.
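Inside a script or function, you may prefer a check that fails loudly instead of one you read by eye. The sketch below rebuilds the objects from Steps 1 and 2 so that it runs on its own, then compares the names with setdiff() and asserts the result with stopifnot().

```r
library(dplyr)
library(palmerpenguins)

old_name <- "bill_length_mm"
new_name <- "bill_length"

data <- penguins |>
  select(species, bill_length_mm, bill_depth_mm) |>
  slice_head(n = 5)

renamed_data <- data |>
  rename(!!new_name := !!sym(old_name))

# Names present only before, and only after, the rename
removed <- setdiff(names(data), names(renamed_data))
added <- setdiff(names(renamed_data), names(data))

# Stop with an error if the rename did not happen or the values changed
stopifnot(
  identical(removed, old_name),
  identical(added, new_name),
  identical(renamed_data[[new_name]], data[[old_name]])
)
```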

Example 2: Practical Application with Multiple Renames

The Problem

In real-world scenarios, you often need to rename multiple columns programmatically, perhaps when cleaning data or preparing it for analysis. Let’s create a function that renames columns based on a lookup table.

Step 1: Create a renaming function

We’ll build a reusable function that accepts a data frame and a named vector of column mappings.

# Create function for multiple renames
rename_columns <- function(data, name_mapping) {
  for (i in seq_along(name_mapping)) {
    old_col <- names(name_mapping)[i]
    new_col <- name_mapping[[i]]
    data <- data |> rename(!!new_col := !!sym(old_col))
  }
  return(data)
}

This function loops over the mapping and calls rename() once per entry, using each element’s name as the old column name and its value as the new one.

Step 2: Define column mappings

Let’s create a named vector that maps old column names to new ones.

# Define our column name mappings
column_mappings <- c(
  "bill_length_mm" = "beak_length",
  "bill_depth_mm" = "beak_depth",
  "species" = "penguin_type"
)

print(column_mappings)

Each element maps an old name (the name) to a new name (the value).

Step 3: Apply the function to our data

Now we can use our function to rename multiple columns at once.

# Apply multiple renames
final_data <- penguins |> 
  select(species, bill_length_mm, bill_depth_mm) |> 
  slice_head(n = 3) |> 
  rename_columns(column_mappings)

print(final_data)

All three columns have been renamed according to our mapping in a single function call.
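The loop is not strictly required. Because rename() accepts a named character vector through all_of(), the same mapping can be applied in a single call once its names and values are swapped. The sketch below assumes the same mapping format as above (names are old columns, values are new names); rename_columns_vec is a hypothetical helper name, not part of dplyr.

```r
library(dplyr)
library(palmerpenguins)

# Same mapping format as above: names are old columns, values are new names
column_mappings <- c(
  "bill_length_mm" = "beak_length",
  "bill_depth_mm" = "beak_depth",
  "species" = "penguin_type"
)

# Hypothetical loop-free variant of rename_columns()
rename_columns_vec <- function(data, name_mapping) {
  # rename() expects new = old, so swap the names and the values
  lookup <- setNames(names(name_mapping), unname(name_mapping))
  rename(data, all_of(lookup))
}

final_data_vec <- penguins |>
  select(species, bill_length_mm, bill_depth_mm) |>
  slice_head(n = 3) |>
  rename_columns_vec(column_mappings)

names(final_data_vec)
```

With all_of(), a mapping entry that refers to a missing column is an error. Swapping in any_of() skips missing columns instead, which can be convenient when one mapping is reused across several datasets.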

Step 4: Alternative approach with rename_with()

For pattern-based renaming, rename_with() applies a function to the names of the selected columns, so no explicit old-to-new mapping is needed.

# Pattern-based renaming
pattern_renamed <- penguins |> 
  select(contains("bill")) |> 
  rename_with(~str_replace(.x, "bill_", "beak_"), 
              contains("bill")) |> 
  slice_head(n = 3)

print(pattern_renamed)

This approach uses a function to transform column names that match a specific pattern.
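The pattern and its replacement can themselves live in variables, which ties this approach back to the theme of the tutorial. The sketch below is one way to do that, using base R’s sub() with fixed = TRUE so the prefix is matched literally; old_prefix, new_prefix and prefix_renamed are names assumed for this example.

```r
library(dplyr)
library(palmerpenguins)

old_prefix <- "bill_"
new_prefix <- "beak_"

prefix_renamed <- penguins |>
  rename_with(
    # Replace the first literal occurrence of old_prefix in each name
    \(nm) sub(old_prefix, new_prefix, nm, fixed = TRUE),
    # Only touch columns whose names start with old_prefix
    starts_with(old_prefix)
  ) |>
  slice_head(n = 3)

names(prefix_renamed)
```

Because starts_with() takes an ordinary character argument, no !! or sym() is needed here.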

Summary

  • Use !! and sym() to work with column names stored in variables within dplyr functions
  • The := operator is required whenever the new name on the left-hand side is injected with !!, because R does not allow !! to the left of =
  • Create reusable functions for complex renaming operations to avoid code repetition
  • rename_with() is ideal for pattern-based transformations across multiple columns
  • These techniques are particularly valuable when building functions or working with programmatically generated column names
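As a closing illustration of the function-building point, here is a minimal sketch of a wrapper that takes the old column unquoted and the new name as a string. rename_to is a hypothetical helper written for this example; it combines !! with := for the new name and the embrace operator {{ }} for the column argument.

```r
library(dplyr)
library(palmerpenguins)

# Hypothetical helper: col is passed unquoted, new_name is a string
rename_to <- function(data, col, new_name) {
  rename(data, !!new_name := {{ col }})
}

wrapped <- penguins |>
  select(species, bill_length_mm) |>
  rename_to(bill_length_mm, "bill_length")

names(wrapped)
```

If callers will pass the old column as a string instead, replacing {{ col }} with all_of(col) is a reasonable variation.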