dplyr rename(): How to Rename Columns in R with dplyr

Learn how to rename columns in R with dplyr's rename(). A complete R tutorial with worked examples.
Published

December 3, 2022

Introduction

The rename() function from dplyr is the standard way to change column names in R data frames. It’s particularly useful when you need to make column names more descriptive, fix naming inconsistencies, or prepare data for analysis with cleaner variable names.

Getting Started

library(tidyverse)
library(palmerpenguins)

Example 1: Basic Usage

The Problem

Let’s say we want to rename some columns in the penguins dataset to make them more readable. The current names like “bill_length_mm” could be simplified to “bill_length” for easier reference.

Step 1: View the original column names

First, let’s examine the current structure of our data.

# Check current column names
names(penguins)
head(penguins, 3)

This shows us the original column names including “bill_length_mm”, “bill_depth_mm”, and “flipper_length_mm”.

Step 2: Rename a single column

We’ll start by renaming just one column using the basic rename syntax.

# Rename one column: new_name = old_name
penguins_renamed <- penguins |>
  rename(bill_length = bill_length_mm)

names(penguins_renamed)

The column “bill_length_mm” is now renamed to “bill_length” while all other columns remain unchanged.
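To check this behaviour, the sketch below runs the same kind of rename on a small stand-in data frame. The data frame df and its values are made up for illustration and are not part of palmerpenguins.

```r
library(dplyr)

# Small stand-in for penguins; the values are illustrative only
df <- data.frame(
  species = c("Adelie", "Gentoo"),
  bill_length_mm = c(39.1, 47.5),
  body_mass_g = c(3750, 5200)
)

# Same pattern as above: new_name = old_name
renamed <- df |>
  rename(bill_length = bill_length_mm)

names(renamed)                                     # column order is kept
identical(renamed$bill_length, df$bill_length_mm)  # values are untouched
names(df)                                          # the input keeps its old name
```

rename() returns a new data frame, so the change is only kept if the result is assigned to a name.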

Step 3: Rename multiple columns at once

Now let’s rename several columns simultaneously for consistency.

# Rename multiple columns in one operation
penguins_clean <- penguins |>
  rename(
    bill_length = bill_length_mm,
    bill_depth = bill_depth_mm,
    flipper_length = flipper_length_mm
  )

All three measurement columns now have cleaner names without the “_mm” suffix.
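When many columns share a pattern, typing each pair by hand gets tedious. One possible alternative is dplyr's rename_with(), which applies a function to the selected column names. The sketch below assumes dplyr 1.0.0 or later and uses a one-row stand-in data frame (df, with made-up values) that reuses the penguins column names.

```r
library(dplyr)

# One-row stand-in using the penguins column names; values are illustrative
df <- data.frame(
  species = "Adelie",
  bill_length_mm = 39.1,
  bill_depth_mm = 18.7,
  flipper_length_mm = 181,
  body_mass_g = 3750
)

# Strip a trailing "_mm" from every column whose name ends with it
cleaned <- df |>
  rename_with(~ sub("_mm$", "", .x), .cols = ends_with("_mm"))

names(cleaned)
```

Columns that do not match ends_with("_mm"), such as body_mass_g, keep their names.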

Example 2: Practical Application

The Problem

You’ve received a dataset from a colleague with poorly named columns that contain spaces, special characters, or unclear abbreviations. You need to standardize these names before beginning your analysis to ensure your code is readable and follows naming conventions.

Step 1: Create a dataset with problematic names

Let’s simulate a real-world scenario with messy column names.

# Create sample data with problematic names
messy_data <- data.frame(
  `Customer ID` = 1:5,
  `First Name` = c("John", "Jane", "Bob", "Alice", "Tom"),
  `$Revenue` = c(1000, 1500, 800, 2000, 1200),
  `Date.Purchased` = Sys.Date() + 1:5
)

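The names we typed contain spaces, a special character, and inconsistent patterns. Because data.frame() uses check.names = TRUE by default, R converts them to syntactic names as soon as the data frame is created: “Customer.ID”, “First.Name”, “X.Revenue”, and “Date.Purchased”. These converted names are the ones we will rename in the next step.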
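The names actually stored in a data frame can differ from the ones typed in the data.frame() call. This short sketch compares the default behaviour with check.names = FALSE; the object names converted and as_typed are hypothetical.

```r
# Default: data.frame() passes names through make.names()
converted <- data.frame(
  `Customer ID` = 1:2,
  `$Revenue` = c(1000, 1500)
)
names(converted)  # "Customer.ID" "X.Revenue"

# check.names = FALSE keeps the names exactly as typed
as_typed <- data.frame(
  `Customer ID` = 1:2,
  `$Revenue` = c(1000, 1500),
  check.names = FALSE
)
names(as_typed)   # "Customer ID" "$Revenue"
```

Running names() before writing a rename() call is a quick way to confirm which old names to put on the right-hand side.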

Step 2: Rename columns to follow best practices

We’ll rename these columns to use snake_case and remove special characters.

# Clean up all problematic column names
clean_data <- messy_data |>
  rename(
    customer_id = Customer.ID,
    first_name = First.Name,
    revenue = X.Revenue,
    purchase_date = Date.Purchased
  )

Now all column names follow a consistent snake_case convention and are easier to work with in analysis.
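If the data arrives with names that really do contain spaces or symbols (for example, a data frame built with check.names = FALSE), rename() can still refer to them by wrapping the old name in backticks. The following is a minimal sketch with a hypothetical raw_data object and made-up values.

```r
library(dplyr)

# Keep the non-syntactic names exactly as typed
raw_data <- data.frame(
  `Customer ID` = 1:3,
  `$Revenue` = c(1000, 1500, 800),
  check.names = FALSE
)

# Backticks let rename() reference names with spaces or symbols
clean <- raw_data |>
  rename(
    customer_id = `Customer ID`,
    revenue = `$Revenue`
  )

names(clean)
```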

Step 3: Verify the changes and use in analysis

Let’s confirm our changes worked and demonstrate using the renamed columns.

# Check the new structure and use renamed columns
glimpse(clean_data)

# Now we can easily reference columns in analysis
clean_data |>
  filter(revenue > 1000) |>
  select(customer_id, first_name, revenue)

The renamed columns are now much easier to reference in subsequent data manipulation and analysis steps.

Step 4: Combine rename with other dplyr functions

Rename works seamlessly with other dplyr functions in a pipeline.

# Rename and transform in one pipeline
result <- penguins |>
  rename(species_name = species, body_mass = body_mass_g) |>
  filter(!is.na(body_mass)) |>
  group_by(species_name) |>
  summarise(avg_mass = mean(body_mass))

This shows how rename() fits into a dplyr pipeline: once a column is renamed, later steps such as filter(), group_by(), and summarise() refer to it by its new name.

Summary

  • Use rename(new_name = old_name) syntax to change column names in dplyr
  • Multiple columns can be renamed in a single rename() call by separating them with commas
  • The rename() function preserves all data and only changes the specified column names
  • Renamed columns work seamlessly with other dplyr functions like filter(), select(), and group_by()
  • Always use consistent naming conventions like snake_case to improve code readability and maintainability