tidyverse all_of(): select columns from a vector
In this tutorial, we will learn how to select multiple columns from a dataframe at once, using a character vector of column names.
The tidyverse’s tidyselect package has numerous helpers for selecting columns from a dataframe. all_of() is one of the tidyselect functions that helps us select multiple columns using a character vector.
Let us see an example of why we should use all_of() to select columns from a vector. First, we will load tidyverse, the meta R package, and look at the first few rows of the starwars dataset.
library(tidyverse)
starwars %>% head()
# A tibble: 6 × 14
name height mass hair_color skin_color eye_color birth_year sex gender
1 Luke Sky… 172 77 blond fair blue 19 male mascu…
2 C-3PO 167 75 <NA> gold yellow 112 none mascu…
3 R2-D2 96 32 <NA> white, bl… red 33 none mascu…
4 Darth Va… 202 136 none white yellow 41.9 male mascu…
5 Leia Org… 150 49 brown light brown 19 fema… femin…
6 Owen Lars 178 120 brown, gr… light blue 52 male mascu…
# … with 5 more variables: homeworld <chr>, species <chr>, films <list>,
#   vehicles <list>, starships <list>
The names of the columns that we want to select are stored in a character vector.
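column_name_vector <- c("name", "height", "skin_color", "gender")
We might be tempted to pass the vector directly to select().
starwars %>%
  select(column_name_vector)
The code runs, but the result may not be correct: if the dataframe had a column that is itself named column_name_vector, select() would pick that column instead of the names stored in the vector. We also get the following note.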
Note: Using an external vector in selections is ambiguous.
ℹ Use `all_of(column_name_vector)` instead of `column_name_vector` to silence this message.
ℹ See .
This message is displayed once per session.
In our example, it gives us what we needed.
# A tibble: 87 × 4
name height skin_color gender
1 Luke Skywalker 172 fair masculine
2 C-3PO 167 gold masculine
3 R2-D2 96 white, blue masculine
4 Darth Vader 202 white masculine
5 Leia Organa 150 light feminine
6 Owen Lars 178 light masculine
7 Beru Whitesun lars 165 light feminine
8 R5-D4 97 white, red masculine
9 Biggs Darklighter 183 light masculine
10 Obi-Wan Kenobi 182 fair masculine
# … with 77 more rows
tidyselect’s all_of(): select columns from a vector
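The ambiguity that the note warns about can be made concrete. The following is a minimal sketch, assuming a hypothetical dataframe df (not part of the original example) in which one column happens to share its name with the vector. A bare name inside select() is matched against the columns first, so it picks that column rather than the names stored in the vector.

```r
library(tidyverse)

# Hypothetical dataframe (an assumption for illustration): one column
# shares its name with the character vector defined below
df <- tibble(
  name = c("a", "b"),
  height = c(1, 2),
  column_name_vector = c(TRUE, FALSE)
)
column_name_vector <- c("name", "height")

# The bare name resolves to the column called `column_name_vector`,
# not to the columns listed inside the vector
ambiguous <- df %>% select(column_name_vector)
names(ambiguous)

# all_of() refers to the vector in the calling environment,
# so the columns `name` and `height` are selected
explicit <- df %>% select(all_of(column_name_vector))
names(explicit)
```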
The right approach is to pass all_of(column_name_vector) as the argument to select(). Now we get the result without the note.
starwars %>%
select(all_of(column_name_vector))
# A tibble: 87 × 4
name height skin_color gender
1 Luke Skywalker 172 fair masculine
2 C-3PO 167 gold masculine
3 R2-D2 96 white, blue masculine
4 Darth Vader 202 white masculine
5 Leia Organa 150 light feminine
6 Owen Lars 178 light masculine
7 Beru Whitesun lars 165 light feminine
8 R5-D4 97 white, red masculine
9 Biggs Darklighter 183 light masculine
10 Obi-Wan Kenobi 182 fair masculine
# … with 77 more rows
Note that all_of() is for strict selection: if any of the variables in the character vector is missing from the dataframe, an error is thrown.
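# a vector containing a name, "actor", that is not present in the dataframe
# (the other names are assumed to be the same as in the earlier vector)
column_name_vector <- c("name", "height", "skin_color", "gender", "actor")
starwars %>%
  select(all_of(column_name_vector))
Since the column actor is not present in the dataframe, all_of() throws the following error and stops.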
Quitting from lines 37-41 (select_columns_from_vectors.qmd)
Error in `select()`:
! Can't subset columns that don't exist.
✖ Column `actor` doesn't exist.
Backtrace:
In situations where you do not need every column in the vector, only whichever of them exist in the dataframe, use any_of() instead of all_of().
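As a minimal sketch of the any_of() behaviour described here, the example below uses an illustrative vector (the exact names are an assumption, chosen so that "actor" is missing from starwars). any_of() keeps the names that exist and skips the rest without raising an error.

```r
library(tidyverse)

# Illustrative vector (assumed for this sketch): "actor" is not a column of starwars
column_name_vector <- c("name", "height", "actor")

# any_of() selects the columns that exist and silently ignores "actor"
selected <- starwars %>% select(any_of(column_name_vector))
names(selected)

# any_of() can also be negated to drop columns that may or may not exist
dropped <- starwars %>% select(-any_of(column_name_vector))
ncol(dropped)
```

Because missing names are ignored silently, any_of() is best reserved for cases where an absent column is genuinely acceptable; all_of() remains the safer choice when every name is expected to be present.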