How to split a column into multiple columns in R

tidyr

tidyr separate

Learn split a column into multiple columns in r with clear examples and explanations.

Published

March 26, 2026

Introduction

The separate_wider_delim() function from tidyr helps you split a single column containing delimited text into multiple columns. This is particularly useful when working with data where multiple pieces of information are stored together in one field, separated by characters like hyphens, underscores, or commas. You’ll commonly encounter this when cleaning imported data or working with concatenated identifiers.

Loading Required Packages

We’ll use the tidyverse package, which includes tidyr for data manipulation functions.

library(tidyverse)

Basic Column Separation

Let’s start with a simple example where we have full names separated by hyphens that we want to split into first and last names.

df1 <- tibble(
  id = 1:3, 
  Name = c("John-Nash", "Hadley-Wickham", "Richard-Feynman")
)
df1

Now we’ll separate the Name column into two new columns using the hyphen as our delimiter:

df1 |>
  separate_wider_delim(
    Name, 
    delim = "-", 
    names = c("First_Name", "Last_Name")
  )

The original Name column is replaced with two new columns: First_Name and Last_Name. Each person’s name is now properly separated into individual components.

Separating Into Multiple Columns

When your data contains more than two pieces of information, you can specify additional column names to capture all the separated values.

df2 <- tibble(
  id = 1:3, 
  Name = c("John-Nash-Math", "Hadley-Wickham-Stat", "Richard-Feynman-Physics")
)
df2

Here we’ll create three columns from our delimited data:

df2 |>
  separate_wider_delim(
    Name, 
    delim = "-", 
    names = c("First_Name", "Last_Name", "Field")
  )

This creates three separate columns, making it easy to analyze or filter by any of these individual components.

Using Automatic Column Naming

When you don’t want to specify individual column names, you can use names_sep to automatically generate column names with a consistent pattern.

df2 |>
  separate_wider_delim(
    Name, 
    delim = "-", 
    names_sep = "_"
  )

This approach creates columns named Name_1, Name_2, Name_3, etc. It’s particularly useful when you have many columns to create or when the exact number of resulting columns might vary.

Chaining Multiple Separations

Sometimes you need to perform multiple separation operations when your data has different delimiters for different information levels.

df3 <- tibble(
  id = 1:3, 
  Name = c("John-Nash_Math", "Hadley-Wickham_Stat", "Richard-Feynman_Physics")
)

We can chain multiple separate_wider_delim() operations to handle mixed delimiters:

df3 |>
  separate_wider_delim(
    Name, 
    delim = "-", 
    names = c("First_Name", "Temp_Name")
  ) |>
  separate_wider_delim(
    Temp_Name, 
    delim = "_", 
    names = c("Last_Name", "Field")
  )

This approach first separates on the hyphen, then separates the remaining combined text on the underscore. The result is a clean dataset with each piece of information in its own column.

Summary

The separate_wider_delim() function is an essential tool for data cleaning and preparation. It allows you to split delimited text into organized columns, whether you need to separate names, parse identifiers, or extract multiple data points from concatenated strings. Remember to specify your delimiter carefully and provide meaningful column names to make your data analysis more intuitive and maintainable.

--- title: "How to split a column into multiple columns in R" description: "Learn split a column into multiple columns in r with clear examples and explanations." date: 2026-03-26 categories: ['tidyr', 'tidyr separate'] format: html: code-fold: false code-tools: true --- ## Introduction The `separate_wider_delim()` function from tidyr helps you split a single column containing delimited text into multiple columns. This is particularly useful when working with data where multiple pieces of information are stored together in one field, separated by characters like hyphens, underscores, or commas. You'll commonly encounter this when cleaning imported data or working with concatenated identifiers. ## Loading Required Packages We'll use the tidyverse package, which includes tidyr for data manipulation functions. ```r library(tidyverse) ``` ## Basic Column Separation Let's start with a simple example where we have full names separated by hyphens that we want to split into first and last names. ```r df1 <- tibble( id = 1:3, Name = c("John-Nash", "Hadley-Wickham", "Richard-Feynman") ) df1 ``` Now we'll separate the Name column into two new columns using the hyphen as our delimiter: ```r df1 |> separate_wider_delim( Name, delim = "-", names = c("First_Name", "Last_Name") ) ``` The original Name column is replaced with two new columns: First_Name and Last_Name. Each person's name is now properly separated into individual components. ## Separating Into Multiple Columns When your data contains more than two pieces of information, you can specify additional column names to capture all the separated values. ```r df2 <- tibble( id = 1:3, Name = c("John-Nash-Math", "Hadley-Wickham-Stat", "Richard-Feynman-Physics") ) df2 ``` Here we'll create three columns from our delimited data: ```r df2 |> separate_wider_delim( Name, delim = "-", names = c("First_Name", "Last_Name", "Field") ) ``` This creates three separate columns, making it easy to analyze or filter by any of these individual components. ## Using Automatic Column Naming When you don't want to specify individual column names, you can use `names_sep` to automatically generate column names with a consistent pattern. ```r df2 |> separate_wider_delim( Name, delim = "-", names_sep = "_" ) ``` This approach creates columns named Name_1, Name_2, Name_3, etc. It's particularly useful when you have many columns to create or when the exact number of resulting columns might vary. ## Chaining Multiple Separations Sometimes you need to perform multiple separation operations when your data has different delimiters for different information levels. ```r df3 <- tibble( id = 1:3, Name = c("John-Nash_Math", "Hadley-Wickham_Stat", "Richard-Feynman_Physics") ) ``` We can chain multiple `separate_wider_delim()` operations to handle mixed delimiters: ```r df3 |> separate_wider_delim( Name, delim = "-", names = c("First_Name", "Temp_Name") ) |> separate_wider_delim( Temp_Name, delim = "_", names = c("Last_Name", "Field") ) ``` This approach first separates on the hyphen, then separates the remaining combined text on the underscore. The result is a clean dataset with each piece of information in its own column. ## Summary The `separate_wider_delim()` function is an essential tool for data cleaning and preparation. It allows you to split delimited text into organized columns, whether you need to separate names, parse identifiers, or extract multiple data points from concatenated strings. Remember to specify your delimiter carefully and provide meaningful column names to make your data analysis more intuitive and maintainable.