How to Add a Prefix or Suffix to Column Names of a Data Frame in R

Learn how to add a prefix or suffix to the column names of a data frame in R. A step-by-step tutorial with examples.
Published

January 11, 2023

Introduction

Adding prefixes or suffixes to column names is a common data manipulation task when preparing datasets for analysis or when combining multiple dataframes. This technique helps create descriptive column names, avoid naming conflicts, and organize your data more effectively.

Getting Started

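# The native pipe |> used below requires R 4.1 or later
library(tidyverse)       # loads dplyr, which provides rename_with()
library(palmerpenguins)  # provides the penguins dataset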

Example 1: Basic Usage

The Problem

We need to add prefixes and suffixes to column names in the penguins dataset. This is useful when you want to make column names more descriptive or prepare data for merging.

Step 1: Add Prefix to All Columns

Let’s start by adding a prefix to all column names in our dataset.

# Add "penguin_" prefix to all columns
penguins_prefixed <- penguins |>
  rename_with(~ paste0("penguin_", .))

# View the new column names
names(penguins_prefixed)

The rename_with() function applies the transformation to all columns, creating names like “penguin_species” and “penguin_bill_length_mm”.
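
As a quick check of the idea on something small, here is a minimal sketch that uses a made-up two-column tibble (the name toy and its values are hypothetical, not part of the penguins data). It writes the same transformation as an explicit anonymous function instead of the ~ shorthand; both forms should behave the same way here.

```r
library(dplyr)

# Hypothetical toy data frame, used only for illustration
toy <- tibble(
  species = c("Adelie", "Gentoo"),
  bill_length_mm = c(39.1, 46.1)
)

# Same technique as above, written with an explicit anonymous function
toy_prefixed <- toy |>
  rename_with(\(x) paste0("penguin_", x))

names(toy_prefixed)
# "penguin_species" "penguin_bill_length_mm"
```

The function receives the full vector of column names and must return a character vector of the same length, which is why the vectorized paste0() fits well.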

Step 2: Add Suffix to All Columns

Now we’ll add a suffix to all column names using a similar approach.

# Add "_measured" suffix to all columns
penguins_suffixed <- penguins |>
  rename_with(~ paste0(., "_measured"))

# Check the first few column names
head(names(penguins_suffixed), 4)

This creates column names like “species_measured” and “bill_length_mm_measured” by appending the suffix to each original name.
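
rename_with() also forwards extra arguments to the renaming function, so a suffix can be added without writing a lambda at all. The sketch below assumes a small made-up tibble (toy is hypothetical) and passes "_measured" through to paste0().

```r
library(dplyr)

# Hypothetical toy data frame, used only for illustration
toy <- tibble(
  species = c("Adelie", "Gentoo"),
  bill_length_mm = c(39.1, 46.1)
)

# Arguments after .cols are passed on to the function,
# so this calls paste0(<column names>, "_measured")
toy_suffixed <- toy |>
  rename_with(paste0, .cols = everything(), "_measured")

names(toy_suffixed)
# "species_measured" "bill_length_mm_measured"
```

This form only works for suffixes, because the column names are always the first argument to the function; for a prefix, the lambda form shown earlier is still needed.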

Step 3: Target Specific Columns

We can also apply prefixes or suffixes to only selected columns using column selection helpers.

# Add "metric_" prefix only to numeric columns
penguins_selective <- penguins |>
  rename_with(~ paste0("metric_", .), where(is.numeric))

# View all column names to see the change
names(penguins_selective)

Only the numeric columns now have the “metric_” prefix, while factor columns like “species” remain unchanged. Note that the integer “year” column is numeric too, so it becomes “metric_year”.
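
Selecting by type can catch more columns than intended. The following sketch, on a made-up tibble (toy is hypothetical), compares where(is.numeric) with the name-based helper ends_with() to show the difference.

```r
library(dplyr)

# Hypothetical toy data frame with a factor, a double, and an integer column
toy <- tibble(
  species = factor(c("Adelie", "Gentoo")),
  bill_length_mm = c(39.1, 46.1),
  year = c(2007L, 2008L)
)

# where(is.numeric) also matches the integer `year` column
by_type <- toy |>
  rename_with(\(x) paste0("metric_", x), where(is.numeric))
names(by_type)
# "species" "metric_bill_length_mm" "metric_year"

# ends_with("_mm") restricts the prefix to the millimetre measurements
by_name <- toy |>
  rename_with(\(x) paste0("metric_", x), ends_with("_mm"))
names(by_name)
# "species" "metric_bill_length_mm" "year"
```

Which selector is appropriate depends on whether the naming or the type of a column is the more reliable signal in your data.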

Example 2: Practical Application

The Problem

Imagine you’re combining survey data from different years and need to distinguish between measurements. You have penguin data that needs year-specific prefixes for numeric measurements while keeping identifying columns unchanged.

Step 1: Create Sample Multi-Year Data

First, let’s simulate having data from different survey years.

# Create 2020 survey data
penguins_2020 <- penguins |>
  slice_head(n = 100) |>
  select(species, island, bill_length_mm, bill_depth_mm)

head(penguins_2020, 3)

We’ve created a subset representing 2020 survey data with key measurement columns.

Step 2: Add Year-Specific Prefixes to Measurements

Now we’ll add year prefixes to only the measurement columns while preserving the identifying columns.

# Add "y2020_" prefix to measurement columns only
penguins_2020_labeled <- penguins_2020 |>
  rename_with(~ paste0("y2020_", .), 
              c(bill_length_mm, bill_depth_mm))

names(penguins_2020_labeled)

The measurement columns now have year-specific prefixes while “species” and “island” remain unchanged for easy joining.
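
If the same labeling is needed for several years, the step can be wrapped in a small function. This is only a sketch: label_year() is a hypothetical helper, not a dplyr function, and the toy tibble is made up for illustration. It uses all_of() so the column names can be supplied as a character vector.

```r
library(dplyr)

# Hypothetical helper: prefix the columns named in `cols` with "y<year>_"
label_year <- function(data, year, cols) {
  data |>
    rename_with(\(x) paste0("y", year, "_", x), all_of(cols))
}

# Hypothetical toy data frame, used only for illustration
toy <- tibble(
  species = c("Adelie", "Gentoo"),
  bill_length_mm = c(39.1, 46.1),
  bill_depth_mm = c(18.7, 13.2)
)

measure_cols <- c("bill_length_mm", "bill_depth_mm")
toy_2020 <- label_year(toy, 2020, measure_cols)

names(toy_2020)
# "species" "y2020_bill_length_mm" "y2020_bill_depth_mm"
```

Because all_of() raises an error when a requested column is missing, a typo in measure_cols fails loudly instead of silently renaming nothing.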

Step 3: Combine with Another Year’s Data

Let’s create data for another year and combine them to see the practical benefit.

# Create and label 2021 data
penguins_2021 <- penguins |>
  slice_tail(n = 100) |>
  select(species, island, bill_length_mm, bill_depth_mm) |>
  rename_with(~ paste0("y2021_", .), 
              c(bill_length_mm, bill_depth_mm))

Now we have clearly labeled measurement columns that won’t conflict when joining datasets.

Step 4: Demonstrate the Clean Join

Finally, let’s join the datasets to show how the prefixes prevent naming conflicts.

# Join the datasets by species and island
combined_data <- penguins_2020_labeled |>
  full_join(penguins_2021, by = c("species", "island"))

# Check the structure
glimpse(combined_data)

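The resulting dataset keeps the measurements from each year in separate, clearly named columns. Because species and island do not uniquely identify a penguin, any key present in both tables would be matched many-to-many, so treat this join as a demonstration of the naming scheme rather than a row-level pairing of individuals. In this particular sample the first 100 rows are all Adelie and the last 100 contain no Adelie penguins, so no keys match and each row has NA in the other year’s columns.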
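
For comparison, dplyr's join functions can also resolve clashing names at join time through their suffix argument. The sketch below assumes two made-up tables that share a unique id key (survey_2020, survey_2021, and id are hypothetical) and leaves the measurement column unrenamed before joining.

```r
library(dplyr)

# Hypothetical survey tables keyed by a unique `id`
survey_2020 <- tibble(id = 1:3, bill_length_mm = c(39.1, 39.5, 40.3))
survey_2021 <- tibble(id = 2:4, bill_length_mm = c(39.9, 40.8, 36.7))

# `suffix` is appended to non-key columns whose names clash
joined <- full_join(
  survey_2020, survey_2021,
  by = "id",
  suffix = c("_2020", "_2021")
)

names(joined)
# "id" "bill_length_mm_2020" "bill_length_mm_2021"
```

The suffix argument only touches columns that actually collide and cannot add a prefix, so renaming with rename_with() beforehand remains the more flexible option when you want every measurement column labeled consistently.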

Summary

  • Use rename_with() with paste0() to add prefixes (paste0("prefix_", .)) or suffixes (paste0(., "_suffix"))
  • Apply transformations to all columns or use where() and column selection helpers for targeted renaming
  • Column name modifications are useful when preparing data for joins or merges, where shared non-key names would otherwise collide
  • Year-specific or source-specific prefixes help maintain data lineage and prevent naming conflicts
  • The pipe operator |> makes these operations clean and readable in data processing workflows