How to create a nested dataframe with lists
In this tutorial, we will learn how to create a nested dataframe using the nest() function from the tidyverse. A nested dataframe is a dataframe in which one or more columns are list columns. In a simple dataframe, every column is an atomic vector. However, a column can also hold other data structures, such as lists or dataframes. Such columns are called list columns.
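To make the idea concrete, here is a minimal sketch of a list column built by hand with tibble(). The object name `toy` and its columns `id` and `scores` are made up for illustration and are not part of the tutorial's data.

```r
library(tibble)

# Hypothetical toy data: `scores` is a list column, so each cell can hold
# a vector of a different length, unlike an atomic column such as `id`.
toy <- tibble(
  id = c("x", "y"),
  scores = list(c(90, 85), c(70, 60, 75))
)

# The column itself is a list ...
is.list(toy$scores)

# ... and each cell stores its own vector, here of different lengths.
lengths(toy$scores)
```

An atomic column like `id` holds exactly one value per row, while `scores` holds a whole vector per row. That is the property the rest of the tutorial relies on.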
library(tidyverse)
packageVersion("dplyr")
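[1] '1.1.4'
Let us create a dataframe with group id and group member as its two columns.
# One row per (group, member) pair
data <- tibble(
  group_id = c("A", "A", "A", "B", "B", "C", "C"),
  member = c("John", "Paul", "Stella", "Paul", "Jake", "John", "Mary")
)
We then group by group_id and use summarize() to collect each group's members into a list.
nested <- data |>
  group_by(group_id) |>
  summarize(members = list(member))
Our nested dataframe looks like this.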
nested
# A tibble: 3 × 2
  group_id members
  <chr>    <list>
1 A        <chr [3]>
2 B        <chr [2]>
3 C        <chr [2]>
Here is a way to access the values in the list column.
nested$members[[1]]
[1] "John" "Paul" "Stella"
nested$members[[2]]
[1] "Paul" "Jake"
We can unnest the nested dataframe and get back the original rows using the unnest() function.
nested |> unnest()
Warning: `cols` is now required when using `unnest()`.
ℹ Please use `cols = c(members)`.
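# A tibble: 7 × 2
  group_id members
  <chr>    <chr>
1 A        John
2 A        Paul
3 A        Stella
4 B        Paul
5 B        Jake
6 C        John
7 C        Mary
Calling unnest() without arguments still works, but it warns that the cols argument is now required. Here we explicitly specify which column to unnest.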
nested |> unnest(members)
# A tibble: 7 × 2
  group_id members
  <chr>    <chr>
1 A        John
2 A        Paul
3 A        Stella
4 B        Paul
5 B        Jake
6 C        John
7 C        Mary
Note that we have not used the nest() function to create the nested dataframe. With tidyr's nest() function, we can easily create list columns that contain tibbles.
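As a minimal sketch of that approach, the example below rebuilds the same group and member pairs shown in the output above and nests them with nest(). The object name `nested_tbl` and the choice to call the new list column `members` are assumptions made for this illustration.

```r
library(tibble)
library(tidyr)

# Same group/member pairs as in the unnested output above.
data <- tibble(
  group_id = c("A", "A", "A", "B", "B", "C", "C"),
  member = c("John", "Paul", "Stella", "Paul", "Jake", "John", "Mary")
)

# Pack the `member` column into one tibble per group.
# The remaining column, `group_id`, defines the groups.
nested_tbl <- data |>
  nest(members = member)

# Each cell of the list column is itself a tibble with a `member` column.
nested_tbl$members[[1]]

# unnest() expands the list column back into one row per member.
nested_tbl |>
  unnest(members)
```

Unlike summarize() with list(), which stores a plain character vector in each cell, nest() stores a tibble in each cell. Because of that, several columns can be packed into the same list column at once.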