How to convert a matrix to a tidy table
In this tutorial, we will learn how to convert a data matrix to tidy long table. We will use tidyr’s pivot_longer() function to reshape the matrix into a tidy table.
First, let us start with loading the tidyverse suit of packages.
library(tidyverse)Create a simulated data matrix
Let us create a data matrix using rnorm() function generating random numbers from normal distribution.
set.seed(42)
# Simulate data matrix with 10 rows and 6 columns
mat
1 1.37 1.30 -0.307 0.455 0.206 0.322
2 -0.565 2.29 -1.78 0.705 -0.361 -0.784
3 0.363 -1.39 -0.172 1.04 0.758 1.58
4 0.633 -0.279 1.21 -0.609 -0.727 0.643
5 0.404 -0.133 1.90 0.505 -1.37 0.0898
6 -0.106 0.636 -0.430 -1.72 0.433 0.277
7 1.51 -0.284 -0.257 -0.784 -0.811 0.679
8 -0.0947 -2.66 -1.76 -0.851 1.44 0.0898
9 2.02 -2.44 0.460 -2.41 -0.431 -2.99
10 -0.0627 1.32 -0.640 0.0361 0.656 0.285And then add a column containing row number as a factor variable. We use dplyr’s row_number() function.
as_tibble(mat) %>%
mutate(row_id=factor(row_number()))
# A tibble: 10 × 7
C1 C2 C3 C4 C5 C6 row_id
1 1.37 1.30 -0.307 0.455 0.206 0.322 1
2 -0.565 2.29 -1.78 0.705 -0.361 -0.784 2
3 0.363 -1.39 -0.172 1.04 0.758 1.58 3
4 0.633 -0.279 1.21 -0.609 -0.727 0.643 4
5 0.404 -0.133 1.90 0.505 -1.37 0.0898 5
6 -0.106 0.636 -0.430 -1.72 0.433 0.277 6
7 1.51 -0.284 -0.257 -0.784 -0.811 0.679 7
8 -0.0947 -2.66 -1.76 -0.851 1.44 0.0898 8
9 2.02 -2.44 0.460 -2.41 -0.431 -2.99 9
10 -0.0627 1.32 -0.640 0.0361 0.656 0.285 10mutate() function creates a new variable at the end of the dataframe, as last column. Here we use relocate() function to move the row ID column to the front.
mat_df %
mutate(row_id=factor(row_number())) %>%
relocate(row_id)
mat_df
# A tibble: 10 × 7
row_id C1 C2 C3 C4 C5 C6
1 1 1.37 1.30 -0.307 0.455 0.206 0.322
2 2 -0.565 2.29 -1.78 0.705 -0.361 -0.784
3 3 0.363 -1.39 -0.172 1.04 0.758 1.58
4 4 0.633 -0.279 1.21 -0.609 -0.727 0.643
5 5 0.404 -0.133 1.90 0.505 -1.37 0.0898
6 6 -0.106 0.636 -0.430 -1.72 0.433 0.277
7 7 1.51 -0.284 -0.257 -0.784 -0.811 0.679
8 8 -0.0947 -2.66 -1.76 -0.851 1.44 0.0898
9 9 2.02 -2.44 0.460 -2.41 -0.431 -2.99
10 10 -0.0627 1.32 -0.640 0.0361 0.656 0.285pivot_longer(): Reshape the data matrix to tidy dataframe
Now we have the data as a dataframe and ready to reshape using pivot_longer() function to a tidy long format. Here we specify where names should go to and values of the matrix should go to.
mat_df %>%
pivot_longer(-row_id,
names_to="sample_id",
values_to="vals")
# A tibble: 60 × 3
row_id sample_id vals
1 1 C1 1.37
2 1 C2 1.30
3 1 C3 -0.307
4 1 C4 0.455
5 1 C5 0.206
6 1 C6 0.322
7 2 C1 -0.565
8 2 C2 2.29
9 2 C3 -1.78
10 2 C4 0.705
# … with 50 more rowsWe have now converted to a data matrix into a long tidy dataframe.