dplyr row_number(): Add unique row number to a dataframe
In this tutorial, we will learn how to add a unique row number to each row of a dataframe/tibble. We will first use dplyr’s row_number() function to add a unique row number as a column to a dataframe using tidyverse. Then we will also see an example of adding row numbers to a dataframe using base R.
How to add row number to a dataframe in R
Let us load tidyverse, the suite of R packages from RStudio, which includes dplyr. We also verify dplyr’s version.
library(tidyverse)
packageVersion("dplyr")
[1] ‘1.0.7’
To illustrate how to add a unique row number to a dataframe, we will use the “faithfuld” dataset, the classic waiting and eruptions data from faithful, but with a 2D density estimate. faithfuld is one of the datasets built into the ggplot2 package.
Let us take a look at the faithfuld dataset using head() function.
faithfuld %>%
head()
## # A tibble: 6 × 3
## eruptions waiting density
##
## 1 1.6 43 0.00322
## 2 1.65 43 0.00384
## 3 1.69 43 0.00444
## 4 1.74 43 0.00498
## 5 1.79 43 0.00542
## 6 1.84 43 0.00574
How to add unique row number to a dataframe in R using tidyverse
In order to add a unique row number as one of the variables or columns of the dataset, we will use the row_number() function with the mutate() function from dplyr as shown below. Here we assign the row number to a column named “row_id”.
faithfuld %>%
mutate(row_id=row_number())
## # A tibble: 5,625 × 4
## eruptions waiting density row_id
##
## 1 1.6 43 0.00322 1
## 2 1.65 43 0.00384 2
## 3 1.69 43 0.00444 3
## 4 1.74 43 0.00498 4
## 5 1.79 43 0.00542 5
## 6 1.84 43 0.00574 6
## 7 1.88 43 0.00592 7
## 8 1.93 43 0.00594 8
## 9 1.98 43 0.00581 9
## 10 2.03 43 0.00554 10
## # … with 5,615 more rows
Move unique row number column to the front with relocate() function in dplyr
Notice that the new column “row_id” is the last column in the dataframe. That is because, by default, mutate() creates new columns at the end of all existing columns in the dataframe.
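As an aside, mutate() in dplyr 1.0.0 and later also accepts .before and .after arguments, so the position of the new column can be chosen in the same step. Here is a minimal sketch on a small made-up tibble; the name toy_df and its values are hypothetical and used only for illustration.

```r
library(dplyr)

# toy_df is a hypothetical example tibble, not part of the faithfuld data
toy_df <- tibble(
  eruptions = c(1.6, 1.65, 1.69),
  waiting   = c(43, 43, 43)
)

# .before = 1 places the new column ahead of the first existing column
toy_with_id <- toy_df %>%
  mutate(row_id = row_number(), .before = 1)

names(toy_with_id)
```

This can be handy when the column is created and positioned in one place, while a separate step is easier to read when several columns need to be moved.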
To move a column to the first position in the dataframe, we can use the relocate() function with the column name of interest. In this example, we relocate the row_id column from the last position to the first.
faithfuld %>%
mutate(row_id=row_number()) %>%
relocate(row_id)
## # A tibble: 5,625 × 4
## row_id eruptions waiting density
##
## 1 1 1.6 43 0.00322
## 2 2 1.65 43 0.00384
## 3 3 1.69 43 0.00444
## 4 4 1.74 43 0.00498
## 5 5 1.79 43 0.00542
## 6 6 1.84 43 0.00574
## 7 7 1.88 43 0.00592
## 8 8 1.93 43 0.00594
## 9 9 1.98 43 0.00581
## 10 10 2.03 43 0.00554
## # … with 5,615 more rows
Adding row number using base R
We can also add row numbers to the dataframe the base R way. First we create a variable containing the row numbers. Here we use the seq() function to create a vector containing a sequence of numbers whose length equals the number of rows in the dataframe.
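# create a sequence of numbers whose length equals
# the number of rows in the dataframe
row_id <- seq(nrow(faithfuld))
# add the sequence as a new column and look at the first rows
faithfuld$row_id <- row_id
head(faithfuld)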
## eruptions waiting density row_id
## 1 1.6 43 0.00322 1
## 2 1.65 43 0.00384 2
## 3 1.69 43 0.00444 3
## 4 1.74 43 0.00498 4
## 5 1.79 43 0.00542 5
## 6 1.84 43 0.00574 6
You might also want to check out this post on adding row numbers per group using row_number().
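For a rough idea of what per-group numbering looks like, here is a minimal sketch that assumes a small made-up tibble (toy_df and its group and value columns are hypothetical). Inside a grouped mutate(), row_number() restarts at 1 for each group.

```r
library(dplyr)

# toy_df is a hypothetical example tibble with two groups, "a" and "b"
toy_df <- tibble(
  group = c("a", "a", "b", "b", "b"),
  value = c(10, 20, 30, 40, 50)
)

# number the rows within each group, then drop the grouping
toy_numbered <- toy_df %>%
  group_by(group) %>%
  mutate(row_id = row_number()) %>%
  ungroup()

toy_numbered$row_id
```

Calling ungroup() at the end is a design choice here: it keeps later steps from silently operating per group.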