How to use nest() in R

tidyr
nest()
Learn how to use nest() in R with practical examples. Step-by-step guide with code you can copy and run immediately.
Published February 21, 2026

Introduction

The tidyr::nest() function creates nested data frames: it groups rows and stores each group as a tibble inside a list-column of an outer tibble. You name either the columns to nest or the columns to nest by, and each cell of the new column then holds the rows for one group. This is useful when you want to run the same operation on each subset of your data, fit a model per group, or keep hierarchical data together in a single object.

Typical uses are grouped analyses, fitting one model per subset, and applying a function to each category in a dataset while keeping the inputs and results together in one tidy tibble.
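To make the structure concrete before moving to real data, here is a minimal sketch on a made-up tibble. The names toy, group, x, and y are hypothetical and exist only for illustration.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: three rows in two groups
toy <- tibble(
  group = c("a", "a", "b"),
  x = c(1, 2, 3),
  y = c(10, 20, 30)
)

# Nest every column except `group` (the .by argument needs tidyr >= 1.3.0)
toy_nested <- nest(toy, .by = group)

toy_nested            # one row per group, plus a list-column named `data`
toy_nested$data[[1]]  # the tibble of x and y values for group "a"
```

Each element of the data column is itself a tibble, so anything that works on a data frame can be applied to it.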

Getting Started

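# Attaches tidyr, dplyr, and purrr. broom is installed with the tidyverse
# but not attached, so the examples call it as broom::glance() etc.
library(tidyverse)
# Provides the penguins dataset
library(palmerpenguins)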
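The examples below rely on the .by argument of nest(), which was added in tidyr 1.3.0. A quick check of the installed version, as a sketch (the variable name has_by is just illustrative):

```r
# TRUE if the installed tidyr is new enough for nest(.by = ...)
has_by <- utils::packageVersion("tidyr") >= "1.3.0"
has_by
```

On an older tidyr, group_by(species) |> nest() gives a similar result, although it returns a grouped tibble.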

Example 1: Basic Usage

Let’s start with a simple example using the Palmer penguins dataset, nesting data by species:

penguins_nested <- penguins |>
  drop_na() |>
  nest(.by = species)

penguins_nested

After drop_na() removes every row that has a missing value, this creates a tibble with one row per penguin species. Each row holds a nested tibble with all the remaining columns (island, bill length and depth, flipper length, body mass, sex, and year) for that species. The .by argument names the grouping variable, and every other column is nested into a list-column called data, which is the default name.
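To look inside the result, you can index the list-column directly. The sketch below repeats the nesting step so it runs on its own; first_group, group_sizes, and penguins_nested_alt are illustrative names. It also shows an alternative spelling that names the columns to nest instead of the column to nest by.

```r
library(tidyr)
library(purrr)
library(palmerpenguins)

penguins_nested <- penguins |>
  drop_na() |>
  nest(.by = species)

# Pull out the nested tibble for the first species
first_group <- penguins_nested$data[[1]]
names(first_group)  # every column except species

# Row counts per species; together they should cover every complete row
group_sizes <- map_int(penguins_nested$data, nrow)
sum(group_sizes) == nrow(drop_na(penguins))

# Alternative spelling: `!species` selects everything except species
penguins_nested_alt <- penguins |>
  drop_na() |>
  nest(data = !species)
```

The second form also works on tidyr versions that predate .by, which can matter if the code has to run on older installations.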

You can also specify which columns to nest explicitly:

penguins_custom_nest <- penguins |>
  drop_na() |>
  nest(measurements = c(bill_length_mm, bill_depth_mm, 
                       flipper_length_mm, body_mass_g),
       .by = c(species, island))

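This approach gives you more control: it creates a list-column called measurements that contains only the four specified numeric variables, with one row per combination of species and island. Columns named in neither argument (here sex and year) are dropped from the result.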
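When ... and .by are combined, it is worth checking which columns end up at which level. The sketch below repeats the call above and inspects the names; as far as the tidyr documentation describes it, columns named in neither argument are dropped, which the last line is meant to confirm.

```r
library(tidyr)
library(palmerpenguins)

penguins_custom_nest <- penguins |>
  drop_na() |>
  nest(measurements = c(bill_length_mm, bill_depth_mm,
                        flipper_length_mm, body_mass_g),
       .by = c(species, island))

names(penguins_custom_nest)                    # outer columns
names(penguins_custom_nest$measurements[[1]])  # inner columns

# sex and year were named in neither argument, so they should be absent
c("sex", "year") %in% names(unnest(penguins_custom_nest, measurements))
```

If you need those columns, add them to either the nested set or .by.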

Example 2: Practical Application

Here’s a more realistic scenario: fitting a separate linear model for each penguin species to predict body mass from flipper length:

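penguin_models <- penguins |>
  drop_na() |>
  nest(.by = species) |>
  mutate(
    # Fit one linear model per species
    model = map(data, ~ lm(body_mass_g ~ flipper_length_mm, data = .x)),
    # One-row tibble of fit statistics for each model
    model_summary = map(model, broom::glance),
    # Each species' rows with a .fitted column added
    predictions = map2(model, data, ~ broom::augment(.x, newdata = .y))
  ) |>
  # Drop the raw data and model objects, then flatten the fit statistics
  select(species, model_summary, predictions) |>
  unnest(model_summary)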

This workflow shows why nested data frames suit per-group modeling. We nest the data by species, then use purrr::map() and map2() to work on each nested tibble: map() fits the linear model to each species’ data, broom::glance() returns a one-row tibble of fit statistics for each model, and broom::augment() adds fitted values to the rows each model was trained on. select() then drops the raw data and the model objects, and unnest() spreads the fit statistics into ordinary columns while predictions stays nested for further analysis.
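One possible follow-up is to flatten the nested predictions and compute an error measure per species. The sketch below rebuilds a trimmed version of the pipeline so it runs on its own; fits, flat_predictions, and mae_by_species are illustrative names, and mean absolute error is just one reasonable choice of metric.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(palmerpenguins)

# Trimmed version of the modeling pipeline above
fits <- penguins |>
  drop_na() |>
  nest(.by = species) |>
  mutate(
    model = map(data, \(df) lm(body_mass_g ~ flipper_length_mm, data = df)),
    predictions = map2(model, data, \(m, df) broom::augment(m, newdata = df))
  )

# Flatten the predictions: one row per penguin, with .fitted next to
# the original columns
flat_predictions <- fits |>
  select(species, predictions) |>
  unnest(predictions)

# Mean absolute error of the fitted body mass, per species
mae_by_species <- flat_predictions |>
  group_by(species) |>
  summarise(mae = mean(abs(body_mass_g - .fitted)))

mae_by_species
```

Because each model is evaluated on the same rows it was fitted to, these errors describe in-sample fit, not predictive accuracy on new penguins.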

Another practical application involves calculating summary statistics by group:

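penguin_summaries <- penguins |>
  drop_na() |>
  nest(.by = c(species, island)) |>
  mutate(
    # Group size, returned as an integer vector rather than a list
    n_penguins = map_int(data, nrow),
    # One-row tibble of averages for each species-island group
    avg_measurements = map(data, ~ .x |>
                          summarise(
                            avg_bill_length = mean(bill_length_mm),
                            avg_bill_depth = mean(bill_depth_mm),
                            avg_flipper_length = mean(flipper_length_mm),
                            avg_body_mass = mean(body_mass_g)
                          ))
  ) |>
  # Spread the averages into ordinary columns; data stays nested
  unnest(avg_measurements)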

The result has one row per species-island combination, with the group size and the four averages as ordinary columns and the raw rows still nested in data. Because drop_na() ran first, mean() needs no na.rm = TRUE.
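If you only need the summary numbers and not the nested raw data, a plain group_by() and summarise() gives the same values with less machinery. The sketch below computes two of the statistics both ways as a cross-check; nested_way and grouped_way are illustrative names.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(palmerpenguins)

complete <- drop_na(penguins)

# Nested approach, reduced to two statistics
nested_way <- complete |>
  nest(.by = c(species, island)) |>
  mutate(
    n_penguins = map_int(data, nrow),
    avg_body_mass = map_dbl(data, \(df) mean(df$body_mass_g))
  ) |>
  select(-data) |>
  arrange(species, island)

# Grouped approach without nesting
grouped_way <- complete |>
  group_by(species, island) |>
  summarise(
    n_penguins = n(),
    avg_body_mass = mean(body_mass_g),
    .groups = "drop"
  ) |>
  arrange(species, island)

# Compare the two results column by column
identical(nested_way$n_penguins, grouped_way$n_penguins)
all.equal(nested_way$avg_body_mass, grouped_way$avg_body_mass)
```

Nesting earns its keep when each group needs an object that does not fit in a cell of an ordinary column, such as a fitted model, or when the raw rows should travel alongside the results.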

Summary

  • nest() collapses each group of rows into a tibble stored in a list-column, giving one row per group that you can analyze independently
  • It pairs naturally with the purrr::map() family, which applies a function to each nested tibble for modeling, summarizing, or custom transformations
  • Use nest() when group-wise results need to stay next to the rows they came from, and use unnest() to flatten results back into a standard tibble
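The nest() and unnest() round trip from the last point can be sketched on a small made-up tibble. The names toy and round_trip are hypothetical. The values all come back, but note the row order.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data with the groups interleaved
toy <- tibble(
  group = c("a", "b", "a"),
  x = 1:3
)

round_trip <- toy |>
  nest(.by = group) |>
  unnest(data)

round_trip
# The rows for group "a" are now adjacent, so x comes back as 1, 3, 2
```

If the original row order matters, add an explicit row identifier before nesting and sort by it afterwards.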