This content is taken from the Partnership for Advanced Computing in Europe (PRACE)'s online course, Managing Big Data with R and Hadoop.

Basic matrix operations in R


In this article we present the basic matrix operations using R with a particular focus on those operations that have the potential for parallelisation using map-reduce.


Note that we assume that you have:

  • started RStudio (by executing rstudio & in the terminal) and
  • opened a new R script file.

If you have not, press ctrl+shift+n to start a new script file and save it to a local folder first. Once you have typed (or copied) the R code into the script file, you can run it by, e.g., selecting the part of the code you want to run and pressing ctrl+enter.


Let us consider again the data from Article 3.3. We will work with the following matrix, which has 50 rows and 3 columns of standard normal random numbers:

M1 <- matrix(rnorm(150,0,1), ncol=3)              

Matrix operations

Identity matrix

An identity matrix is a square matrix with diagonal elements equal to 1 and other elements equal to 0, e.g., we can create an identity matrix with dimensions 3x3 by:
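In base R, calling `diag()` with a single positive integer returns an identity matrix of that size. A minimal sketch (the name `I3` is chosen here for illustration only):

```r
# diag(k) with a single integer k returns the k x k identity matrix
I3 <- diag(3)
I3
```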


or by the number of columns in M1 (which is 3):
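A minimal sketch of the same construction driven by the shape of M1. The `set.seed(1)` call and the name `I` are additions for reproducibility and illustration, not part of the course code:

```r
set.seed(1)                               # assumption: fixed seed for reproducibility
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)  # 50 x 3 matrix, as in the article

# Identity matrix whose size is the number of columns of M1
I <- diag(ncol(M1))
I
```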


Matrix multiplication

Suppose we want to compute the product of the transpose of M1 by M1. This can be done by:
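A minimal sketch using the matrix product operator `%*%`. The base R function `crossprod()` is shown as an equivalent alternative; the seed and the names `MtM` and `MtM2` are assumptions made for this example:

```r
set.seed(1)                               # assumption: fixed seed for reproducibility
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)

# Transpose of M1 (3 x 50) times M1 (50 x 3) gives a 3 x 3 matrix
MtM <- t(M1) %*% M1

# crossprod(M1) computes the same product without forming t(M1) explicitly
MtM2 <- crossprod(M1)

all.equal(MtM, MtM2)
```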


Covariance matrix

The covariance matrix of M1 we compute directly as:
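A minimal sketch using the base R function `cov()`, which treats the columns of a matrix as variables. The seed and the name `S` are assumptions made for this example:

```r
set.seed(1)                               # assumption: fixed seed for reproducibility
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)

# Sample covariance matrix of the three columns (denominator n - 1)
S <- cov(M1)
S
```

The diagonal of `S` holds the sample variances of the individual columns.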


SS matrix

Likewise, we compute the sum-of-squares and cross-products matrix (SS matrix) of M1 by:

n=nrow(M1)         #number of rows in M1

          X1         X2         X3
X1 42.485852 -6.7437071 -7.3797835
X2 -6.743707 54.7612372 -0.8058014
X3 -7.379783 -0.8058014 40.5334042
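A matrix of this form can be obtained as sketched here, assuming the SS matrix is the sum of squares and cross-products of the column-centred data, which equals (n-1) times the sample covariance matrix. The seed is an addition for reproducibility, so the numbers will differ from those printed above:

```r
set.seed(1)                               # assumption: fixed seed, values differ from the article
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)

n <- nrow(M1)                             # number of rows in M1

# cov() divides the centred cross-products by n - 1, so multiplying back recovers them
SS <- (n - 1) * cov(M1)
SS
```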

If we centre the data (subtract the centroid, i.e. the vector of column means, from each row):
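One way to do this in base R is with `colMeans()` and `sweep()`. This is a sketch; the name `M1c` for the centred matrix is an assumption made here, and the seed is added for reproducibility:

```r
set.seed(1)                               # assumption: fixed seed for reproducibility
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)

centr <- colMeans(M1)                     # centroid: one mean per column

# sweep() subtracts centr[j] from every entry of column j (default FUN is "-")
M1c <- sweep(M1, 2, centr)

colMeans(M1c)                             # numerically zero after centring
```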


then the SS matrix can also be computed as:
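A minimal sketch of this computation, assuming the centred matrix is stored under the hypothetical name `M1c` and the result under `SS1`:

```r
set.seed(1)                               # assumption: fixed seed for reproducibility
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)
n <- nrow(M1)

M1c <- sweep(M1, 2, colMeans(M1))         # centred data

# Cross-product of the centred data is the SS matrix
SS1 <- t(M1c) %*% M1c

all.equal(SS1, (n - 1) * cov(M1))
```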


But later we will use the fact that the SS matrix can also be computed from the uncentred data as:

SS2 = t(M1)%*%M1-n*outer(centr,centr)

where (recall):
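Here `centr` is taken to be the vector of column means of M1 and `n` the number of rows, which is what the formula requires. A short sketch that defines both and checks the identity numerically (the seed is an addition for reproducibility):

```r
set.seed(1)                               # assumption: fixed seed for reproducibility
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)

n <- nrow(M1)                             # number of rows
centr <- colMeans(M1)                     # centroid (column means)

# Uncentred cross-products minus n times the outer product of the centroid
SS2 <- t(M1) %*% M1 - n * outer(centr, centr)

all.equal(SS2, (n - 1) * cov(M1))
```

The identity holds because summing (x_i - centr)(x_i - centr)^T over all rows x_i expands to the sum of x_i x_i^T minus n times centr centr^T.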


This is particularly useful for big-data computations since t(M1)%*%M1 can be computed for each data chunk separately via a map function and then summed up via a reduce step. The column sums and row counts needed for centr and n can be accumulated in the same way.
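The idea can be imitated in plain R without Hadoop. The sketch below is only an illustration under stated assumptions: the data are split into 5 arbitrary chunks, `lapply()` plays the role of the map step and `Reduce()` the role of the reduce step, and all names (`chunks`, `mapped`, `total`, `SS_mr`) are hypothetical:

```r
set.seed(1)                               # assumption: fixed seed for reproducibility
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)

# Split the rows of M1 into 5 chunks (the chunk count is arbitrary)
chunk_id <- rep(1:5, length.out = nrow(M1))
chunks <- lapply(split(seq_len(nrow(M1)), chunk_id),
                 function(idx) M1[idx, , drop = FALSE])

# Map: each chunk contributes its cross-product, column sums and row count
mapped <- lapply(chunks, function(X) {
  list(xtx = crossprod(X), s = colSums(X), n = nrow(X))
})

# Reduce: add the partial results element by element
total <- Reduce(function(a, b) {
  list(xtx = a$xtx + b$xtx, s = a$s + b$s, n = a$n + b$n)
}, mapped)

# Combine the totals into the SS matrix
centr <- total$s / total$n
SS_mr <- total$xtx - total$n * outer(centr, centr)

all.equal(SS_mr, (nrow(M1) - 1) * cov(M1))
```

Only three small objects per chunk travel to the reduce step, regardless of how many rows each chunk holds.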

Correlation matrix

Once we have an SS matrix (i.e., SS2) we can easily obtain the corresponding correlation matrix R by dividing each entry by the square roots of the two corresponding diagonal entries:
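A minimal sketch of this scaling, assuming `SS2` is computed as earlier in the article. The helper name `d` and the seed are additions for illustration:

```r
set.seed(1)                               # assumption: fixed seed for reproducibility
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)

n <- nrow(M1)
centr <- colMeans(M1)
SS2 <- t(M1) %*% M1 - n * outer(centr, centr)

# Divide entry (i, j) by sqrt(SS2[i, i]) * sqrt(SS2[j, j])
d <- sqrt(diag(SS2))
R <- SS2 / outer(d, d)
R
```

The built-in function computes the same matrix directly from the data: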

R1 = cor(M1,method=c("pearson"))

We can see that R1 is equal to R, the correlation matrix obtained from SS2.

Eigenvalue decomposition

Note that SS is symmetric and hence has 3 real eigenvalues and 3 corresponding eigenvectors. We can compute them by:

ev = eigen(SS)
ev$values
[1] 58.08193 46.86827 32.83029
ev$vectors
            [,1]       [,2]      [,3]
[1,]  0.4511108 -0.5672369 0.6890147
[2,] -0.8798904 -0.4118343 0.2370348
[3,] -0.1493050  0.7131864 0.6848892
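The decomposition can be checked numerically. The sketch below assumes SS is computed as (n-1) times the covariance matrix and uses a fixed seed, so its values differ from those printed above; the name `resid` is hypothetical:

```r
set.seed(1)                               # assumption: fixed seed, values differ from the article
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)
n <- nrow(M1)
SS <- (n - 1) * cov(M1)

ev <- eigen(SS)
ev$values                                 # eigenvalues, in decreasing order
ev$vectors                                # eigenvectors, one per column

# Each column v of ev$vectors should satisfy SS %*% v = lambda * v
resid <- SS %*% ev$vectors - ev$vectors %*% diag(ev$values)
max(abs(resid))                           # close to zero

# The eigenvalues sum to the trace of SS
all.equal(sum(ev$values), sum(diag(SS)))
```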
