How to use separate_wider_delim() in R

tidyr
separate_wider_delim()
Learn how to use separate_wider_delim() in R with practical examples. Step-by-step guide with code you can copy and run immediately.
Published

February 21, 2026

Introduction

The separate_wider_delim() function from the tidyr package splits character columns into multiple columns based on a delimiter. This function is particularly useful when you have data stored in a single column that contains multiple pieces of information separated by characters like commas, semicolons, or pipes.

Getting Started

library(tidyverse)
library(palmerpenguins)

Example 1: Basic Usage

The Problem

Imagine you have penguin data where species and island information are combined into a single column with a hyphen separator. We need to split this combined information into separate columns for better analysis.

Step 1: Create sample data with combined information

Let’s create a dataset that mimics this common data problem.

penguin_combined <- penguins |>
  filter(!is.na(species), !is.na(island)) |>
  slice_head(n = 6) |>
  mutate(species_island = paste(species, island, sep = "-")) |>
  select(species_island, body_mass_g)

This creates a new column where species and island are combined with a hyphen delimiter.

Step 2: Apply separate_wider_delim()

Now we’ll separate the combined column back into individual components.

penguin_separated <- penguin_combined |>
  separate_wider_delim(
    cols = species_island,
    delim = "-",
    names = c("species", "island")
  )

The function successfully splits the species_island column into two new columns using the hyphen as the delimiter.

Step 3: View the results

Let’s examine our separated data to confirm it worked correctly.

print(penguin_separated)

The data now has clean, separate columns for species and island, making it ready for analysis.

Example 2: Practical Application

The Problem

You receive a dataset where car information is stored in a single column with pipe separators, containing make, model, and year. This format makes filtering and analysis difficult, so we need to separate it into individual columns.

Step 1: Create realistic automotive data

We’ll simulate this scenario using the mtcars dataset with combined car information.

car_data <- mtcars |>
  rownames_to_column("car_name") |>
  slice_head(n = 8) |>
  mutate(
    make_model_cyl = paste(
      word(car_name, 1), 
      word(car_name, 2), 
      cyl, 
      sep = "|"
    )
  ) |>
  select(make_model_cyl, mpg, hp)

This creates a realistic scenario where car make, model, and cylinder count are combined with pipe delimiters.

Step 2: Separate the combined automotive information

Now we’ll split the combined column into its component parts.

car_separated <- car_data |>
  separate_wider_delim(
    cols = make_model_cyl,
    delim = "|",
    names = c("make", "model", "cylinders")
  )

The pipe delimiter successfully separates our automotive data into three distinct columns.

Step 3: Convert data types and verify results

Let’s clean up the data types and examine our results.

car_final <- car_separated |>
  mutate(cylinders = as.numeric(cylinders)) |>
  glimpse()

Our automotive data is now properly structured with separate columns for make, model, and cylinders, with appropriate data types.

Step 4: Handle potential issues with too many separators

Sometimes data has inconsistent delimiters. Let’s see how to handle extra separators.

car_data_messy <- car_data |>
  mutate(make_model_cyl = str_replace(make_model_cyl, "\\|", "||")) |>
  separate_wider_delim(
    cols = make_model_cyl,
    delim = "|",
    names = c("make", "model", "cylinders", "extra"),
    too_few = "align_start"
  )

The too_few parameter helps handle rows with insufficient delimiters by aligning data to the start and filling missing values with NA.

Summary

  • separate_wider_delim() splits character columns into multiple columns using specified delimiters
  • The cols parameter specifies which column to separate, while delim sets the separator character
  • The names parameter defines the new column names for the separated data
  • Use too_few and too_many parameters to handle inconsistent numbers of delimiters in your data
  • This function is essential for cleaning imported data where multiple values are stored in single columns