dplyr row_number(): Add unique row number to a dataframe

dplyr row_number()

Published

January 23, 2022

In this tutorial, we will learn how to add a unique row number to each row of a dataframe/tibble. First we will use dplyr's row_number() function to add the row number as a column using tidyverse. Then we will see an example of adding a row number to a dataframe using base R.

Let us load tidyverse, the suite of R packages from RStudio, which includes dplyr. Let us also check the version of dplyr.

library(tidyverse)
packageVersion("dplyr")

[1] ‘1.0.7’

To illustrate how to add a unique row number to a dataframe, we will use the “faithfuld” dataset: the classic faithful waiting and eruptions data, but with a 2d density estimate. faithfuld is one of the datasets built into the ggplot2 package.

Let us take a look at the faithfuld dataset using the head() function.

faithfuld %>% 
  head()

## # A tibble: 6 × 3
##   eruptions waiting density
##       <dbl>   <dbl>   <dbl>
## 1      1.6       43 0.00322
## 2      1.65      43 0.00384
## 3      1.69      43 0.00444
## 4      1.74      43 0.00498
## 5      1.79      43 0.00542
## 6      1.84      43 0.00574

How to add unique row number to a dataframe in R using tidyverse

To add a unique row number as one of the columns of the dataset, we use the row_number() function inside dplyr's mutate() function, as shown below. Here we assign the row number to a new column named “row_id”.

faithfuld %>% 
  mutate(row_id=row_number())

## # A tibble: 5,625 × 4
##    eruptions waiting density row_id
##        <dbl>   <dbl>   <dbl>  <int>
##  1      1.6       43 0.00322      1
##  2      1.65      43 0.00384      2
##  3      1.69      43 0.00444      3
##  4      1.74      43 0.00498      4
##  5      1.79      43 0.00542      5
##  6      1.84      43 0.00574      6
##  7      1.88      43 0.00592      7
##  8      1.93      43 0.00594      8
##  9      1.98      43 0.00581      9
## 10      2.03      43 0.00554     10
## # … with 5,615 more rows
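To see the same pattern on data small enough to check by eye, here is a minimal sketch on a hypothetical three-row tibble. The name `toy` and its values are made up for illustration and are not part of the faithfuld example; the sketch assumes dplyr is installed.

```r
library(dplyr)

# hypothetical toy data, used only for illustration
toy <- tibble(
  name  = c("a", "b", "c"),
  value = c(10, 20, 30)
)

# row_number() called with no arguments inside mutate()
# numbers the rows 1, 2, 3, ... in their current order
toy_with_id <- toy %>%
  mutate(row_id = row_number())

toy_with_id
```

Because the numbering follows the current row order, sorting the data before calling mutate() would change which row receives which number.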

Move unique row number column to the front with relocate() function in dplyr

Notice that the new column “row_id” is the last column in the dataframe. That is because, by default, the mutate() function adds a new column after all existing columns in the dataframe.

To move a column to the first position in the dataframe, we can use the relocate() function with the name of the column of interest. In this example, we move the row_id column from the last position to the first.

faithfuld %>% 
  mutate(row_id=row_number()) %>%
  relocate(row_id)

## # A tibble: 5,625 × 4
##    row_id eruptions waiting density
##     <int>     <dbl>   <dbl>   <dbl>
##  1      1      1.6       43 0.00322
##  2      2      1.65      43 0.00384
##  3      3      1.69      43 0.00444
##  4      4      1.74      43 0.00498
##  5      5      1.79      43 0.00542
##  6      6      1.84      43 0.00574
##  7      7      1.88      43 0.00592
##  8      8      1.93      43 0.00594
##  9      9      1.98      43 0.00581
## 10     10      2.03      43 0.00554
## # … with 5,615 more rows
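As an alternative sketch, mutate() in dplyr 1.0.0 and later also accepts a `.before` argument that places the new column at a given position, so the separate relocate() step is optional. The toy data below is hypothetical and only serves to compare the two approaches.

```r
library(dplyr)

# hypothetical toy data, used only for illustration
toy <- tibble(
  name  = c("a", "b", "c"),
  value = c(10, 20, 30)
)

# two steps: add the column at the end, then move it to the front
two_step <- toy %>%
  mutate(row_id = row_number()) %>%
  relocate(row_id)

# one step: ask mutate() to place the new column before column 1
one_step <- toy %>%
  mutate(row_id = row_number(), .before = 1)

# both results have row_id as the first column
names(two_step)
names(one_step)
```

Which form to use is a matter of taste: relocate() is handy when the column already exists, while `.before` keeps creation and placement in a single call.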

Adding row number using base R

We can also add a row number to the dataframe using base R. First we create a variable containing the row numbers. Here we use the seq() function to create a vector containing a sequence of numbers, with the same length as the number of rows in the dataframe. We then add this vector to the dataframe as a new column.

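# create a sequence of numbers whose length equals
# the number of rows of the dataframe
row_id <- seq(nrow(faithfuld))
# add the sequence as a new column and view the first rows
faithfuld$row_id <- row_id
head(faithfuld)

## # A tibble: 6 × 4
##   eruptions waiting density row_id
##       <dbl>   <dbl>   <dbl>  <int>
## 1      1.6       43 0.00322      1
## 2      1.65      43 0.00384      2
## 3      1.69      43 0.00444      3
## 4      1.74      43 0.00498      4
## 5      1.79      43 0.00542      5
## 6      1.84      43 0.00574      6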
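A hedged variation on the base R approach: seq_len(nrow(df)) gives the same numbers as seq() for a non-empty dataframe, but it returns an empty vector when the dataframe has zero rows, whereas seq(0) counts down from 1 to 0. The data frame `df` below is hypothetical, and the column reordering is just one of several ways to do it in base R.

```r
# hypothetical toy data, used only for illustration
df <- data.frame(
  name  = c("a", "b", "c"),
  value = c(10, 20, 30)
)

# seq_len(n) yields 1, 2, ..., n and is empty when n is 0
df$row_id <- seq_len(nrow(df))

# move row_id to the front using base R column indexing
df <- df[, c("row_id", setdiff(names(df), "row_id"))]
df

# the zero-row edge case: seq_len() is empty, seq() is not
empty <- df[0, ]
length(seq_len(nrow(empty)))  # 0
length(seq(nrow(empty)))      # 2, because seq(0) is c(1, 0)
```

For the faithfuld data the two functions behave identically; the difference only matters in code that may receive an empty dataframe.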

You might also want to check out this post on adding a row number within each group using row_number() (https://rstats101.com/add-row-number-within-each-group-in-dplyr/).
