expand_grid(): Create all possible combinations of variables

Published August 26, 2024

In this tutorial, we will learn how to create all possible combinations of two variables using tidyr’s expand_grid() function. Given two variables of interest, expand_grid() returns a dataframe with one row for every combination of their values.

Let us get started by loading the tidyverse.

library(tidyverse)

Let us say we have two variables, each with 5 elements.

var1 <- letters[1:5]
var1
## [1] "a" "b" "c" "d" "e"
var2 <- LETTERS[1:5]
var2
## [1] "A" "B" "C" "D" "E"

tidyr’s expand_grid() Example

To create all possible combinations of these two variables, 25 combinations in total, we can pass them to tidyr’s expand_grid() function.

combination_df <- expand_grid(var1, var2)
combination_df
## # A tibble: 25 × 2
##    var1  var2 
##    <chr> <chr>
##  1 a     A    
##  2 a     B    
##  3 a     C    
##  4 a     D    
##  5 a     E    
##  6 b     A    
##  7 b     B    
##  8 b     C    
##  9 b     D    
## 10 b     E    
## # ℹ 15 more rows
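
Naming the arguments sets the column names of the result, and the number of rows is the product of the input lengths. Here is a minimal sketch; the column names lower and upper are made up for illustration.

```r
library(tidyr)

var1 <- letters[1:5]
var2 <- LETTERS[1:5]

# Named arguments become the column names of the result
named_df <- expand_grid(lower = var1, upper = var2)

# One row per combination: 5 * 5 = 25
nrow(named_df) == length(var1) * length(var2)

# The first input varies slowest, so all rows for "a" come first
head(named_df$lower, 5)
```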

base R’s expand.grid() Example

tidyr’s expand_grid() function is inspired by base R’s expand.grid() function, which likewise creates a dataframe with all possible combinations of vectors or factors, such as the two variables in the example above.
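
For comparison, here is a short sketch of the base R call on the same two vectors. Two differences are worth noting: expand.grid() returns a plain data.frame in which the first input varies fastest, while expand_grid() returns a tibble in which the first input varies slowest. Var1 and Var2 are the default names expand.grid() gives to unnamed arguments, and stringsAsFactors = FALSE is set here to keep the columns as characters.

```r
var1 <- letters[1:5]
var2 <- LETTERS[1:5]

# Base R: returns a data.frame; stringsAsFactors = FALSE keeps characters
base_df <- expand.grid(var1, var2, stringsAsFactors = FALSE)

# Same 25 combinations as expand_grid(), in a different row order
nrow(base_df)

# Unnamed inputs get the default column names
names(base_df)

# The first input varies fastest here
head(base_df$Var1)
```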

Now suppose we have a dataframe, df, with one row per quarter of 2021.

df
##    year quarter
## 1  2021 Q1     
## 2  2021 Q2     
## 3  2021 Q3     
## 4  2021 Q4
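
One way to build a dataframe like df is sketched below. The column types are assumptions (year as a number, quarter as character); only the printed values above are given.

```r
library(tibble)

# Hypothetical construction of df: the column types are assumed
df <- tibble(
  year = 2021,                  # recycled to match the four quarters
  quarter = paste0("Q", 1:4)    # "Q1" "Q2" "Q3" "Q4"
)
df
```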

With tidyr’s expand_grid(), we can also combine a dataframe with a factor or character variable. Each row of the dataframe is treated as a unit and paired with every value of the variable.

expand_grid(df, companies = c("GOOG", "MSFT", "NVDA"))

## # A tibble: 12 × 3
##     year quarter companies
##            
##  1  2021 Q1      GOOG     
##  2  2021 Q1      MSFT     
##  3  2021 Q1      NVDA     
##  4  2021 Q2      GOOG     
##  5  2021 Q2      MSFT     
##  6  2021 Q2      NVDA     
##  7  2021 Q3      GOOG     
##  8  2021 Q3      MSFT     
##  9  2021 Q3      NVDA     
## 10  2021 Q4      GOOG     
## 11  2021 Q4      MSFT     
## 12  2021 Q4      NVDA
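
As a quick sanity check, the result should have one row for every pairing of a df row with a company. The sketch below rebuilds df under the assumption that year is numeric and quarter is character, then verifies the row count.

```r
library(tidyr)
library(tibble)

# Assumed construction of df (the types are a guess)
df <- tibble(year = 2021, quarter = paste0("Q", 1:4))
companies <- c("GOOG", "MSFT", "NVDA")

result <- expand_grid(df, companies = companies)

# Rows of df times number of companies: 4 * 3 = 12
nrow(result) == nrow(df) * length(companies)

# Each quarter appears once per company
table(result$quarter)
```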