How to Randomly Replace Values in a Matrix with NAs
In this tutorial, we will see how to randomly replace values in a matrix with NAs (missing values).
We will first create a data matrix by simulation. Here we create a matrix with 20 rows and 5 columns.
data_mat <- matrix(round(rnorm(100, mean=5, sd=4), 1),
                   ncol=5)
dim(data_mat)
## [1] 20 5
The data matrix is complete without any missing values.
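One way to confirm that a matrix has no missing values is with base R's anyNA() and is.na(). The sketch below rebuilds a stand-in matrix of the same shape. The set.seed() call is an assumption added here for reproducibility, so the values differ from the ones printed in this tutorial.

```r
# Stand-in for the tutorial's matrix: same shape, different random draws.
set.seed(42)  # assumption: fixed seed so the sketch is reproducible
data_mat <- matrix(round(rnorm(100, mean = 5, sd = 4), 1), ncol = 5)

# anyNA() returns TRUE if at least one element is NA
has_na <- anyNA(data_mat)

# is.na() gives a logical matrix; summing it counts the missing cells
n_missing <- sum(is.na(data_mat))

has_na     # FALSE
n_missing  # 0
```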
data_mat
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1.2 3.8 0.4 2.1 2.3
## [2,] 0.0 1.6 5.1 1.9 6.9
## [3,] 6.5 -0.4 8.8 6.2 12.0
## [4,] 3.0 5.0 5.1 1.6 2.8
## [5,] 4.8 8.4 7.6 8.9 3.6
## [6,] 0.1 4.6 4.4 7.0 7.4
## [7,] 5.6 5.7 4.3 0.8 1.2
## [8,] 3.6 9.6 2.0 0.5 3.2
## [9,] 6.0 5.9 -0.1 6.5 4.3
## [10,] 2.7 5.3 4.6 3.5 2.0
## [11,] 7.0 7.0 1.5 1.9 2.0
## [12,] 2.7 0.7 4.1 3.7 1.6
## [13,] 4.7 -5.5 9.4 -3.5 9.5
## [14,] 6.6 4.9 -2.0 -0.7 -0.3
## [15,] 8.3 8.8 5.0 1.0 7.9
## [16,] 6.2 8.0 2.6 2.1 5.7
## [17,] 1.8 6.6 16.6 4.6 3.3
## [18,] 9.4 8.8 11.6 12.3 9.1
## [19,] 2.2 5.6 -1.3 9.9 3.4
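## [20,] 4.6 4.2 1.7 3.2 4.3
To randomly introduce NAs, we first randomly select the rows that will receive missing values using the sample() function. Because sample() draws without replacement by default, each selected row is distinct, so n_NAs cannot exceed the number of rows.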
n_NAs <- 15
na_ind_rows <- sample(1:nrow(data_mat), n_NAs)
na_ind_rows
## [1] 16 13 7 14 6 2 17 10 3 15 11 9 18 5 19
Then we randomly select the columns in which to introduce NAs. Columns are sampled with replacement because there are only 5 columns but 15 NAs to place.
na_ind_cols <- sample(1:ncol(data_mat), n_NAs, replace=TRUE)
na_ind_cols
## [1] 1 2 1 5 5 1 1 5 5 3 3 5 4 3 4
By combining the row indices and the column indices, we get the exact locations whose values we need to replace with NAs. For example, the first row of the combined index tells us that the element in the 16th row and the first column should become an NA, and so on.
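As a small illustration of how R interprets a two-column index matrix, here is a sketch on a toy matrix. The names m and idx are hypothetical and not part of the tutorial's data. Each row of idx is read as one (row, column) pair.

```r
# Hypothetical toy matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

# Each row of idx is a (row, column) pair: (1, 3) and (2, 1)
idx <- cbind(c(1, 2), c(3, 1))

# Indexing with a two-column matrix picks out exactly those cells
vals <- m[idx]
vals  # 5 2

# The same index works on the left-hand side of an assignment
m[idx] <- NA
```

Only the two addressed cells change. The rest of the matrix is left as it was.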
na_inds <- cbind(na_ind_rows, na_ind_cols)
na_inds
## na_ind_rows na_ind_cols
## [1,] 16 1
## [2,] 13 2
## [3,] 7 1
## [4,] 14 5
## [5,] 6 5
## [6,] 2 1
## [7,] 17 1
## [8,] 10 5
## [9,] 3 5
## [10,] 15 3
## [11,] 11 3
## [12,] 9 5
## [13,] 18 4
## [14,] 5 3
## [15,] 19 4
Now we can use the index matrix to replace those values with NAs.
data_mat[na_inds] <- NA
We can check the new data matrix with random NAs.
data_mat
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1.2 3.8 0.4 2.1 2.3
## [2,] NA 1.6 5.1 1.9 6.9
## [3,] 6.5 -0.4 8.8 6.2 NA
## [4,] 3.0 5.0 5.1 1.6 2.8
## [5,] 4.8 8.4 NA 8.9 3.6
## [6,] 0.1 4.6 4.4 7.0 NA
## [7,] NA 5.7 4.3 0.8 1.2
## [8,] 3.6 9.6 2.0 0.5 3.2
## [9,] 6.0 5.9 -0.1 6.5 NA
## [10,] 2.7 5.3 4.6 3.5 NA
## [11,] 7.0 7.0 NA 1.9 2.0
## [12,] 2.7 0.7 4.1 3.7 1.6
## [13,] 4.7 NA 9.4 -3.5 9.5
## [14,] 6.6 4.9 -2.0 -0.7 NA
## [15,] 8.3 8.8 NA 1.0 7.9
## [16,] NA 8.0 2.6 2.1 5.7
## [17,] NA 6.6 16.6 4.6 3.3
## [18,] 9.4 8.8 11.6 NA 9.1
## [19,] 2.2 5.6 -1.3 NA 3.4
## [20,] 4.6 4.2 1.7 3.2 4.3
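The approach above places at most one NA per row. If that restriction is not wanted, one possible alternative (not the method used in this tutorial) is to sample cell positions directly, since a matrix can also be indexed by a single linear position. The helper name add_random_nas() below is hypothetical, and the fixed seed is an assumption for reproducibility.

```r
# Hypothetical helper (not from the tutorial): set n randomly chosen cells to NA.
add_random_nas <- function(mat, n) {
  stopifnot(is.matrix(mat), n >= 0, n <= length(mat))
  # sample.int() draws n distinct linear (column-major) cell positions
  cells <- sample.int(length(mat), n)
  mat[cells] <- NA
  mat
}

set.seed(1)  # assumption: fixed seed for reproducibility
data_mat <- matrix(round(rnorm(100, mean = 5, sd = 4), 1), ncol = 5)
data_na <- add_random_nas(data_mat, 15)

# Every sampled cell is distinct, so exactly 15 values are missing
sum(is.na(data_na))  # 15
```

Because the positions are drawn without replacement from all 100 cells, any number of NAs up to the size of the matrix can be requested, and several of them may fall in the same row.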