# Basic matrix operations in R

## Introduction

In this article we present the basic matrix operations in `R`, with a particular focus on those operations that have the potential for parallelisation using `map-reduce`.

## Remark

Note that we assume that you have:

- started `RStudio` (by executing `rstudio &` in the terminal), and
- opened a new `R` script file.

If you have not, press `ctrl+shift+n` to start a new script file, which you first have to save to a local folder. Once you have typed (or copied) the `R` code into the script file, you can run it by, e.g., selecting the part of the code you want to run and pressing `ctrl+enter`.

## Data

Let us consider again the data from Article 3.3. We will work with the following matrix:

```
library(plyr)
set.seed(1000)
M1 <- matrix(rnorm(150,0,1), ncol=3)
colnames(M1)<-c("X1","X2","X3")
```

## Matrix operations

### Identity matrix

An identity matrix is a square matrix with diagonal elements equal to 1 and other elements equal to 0, e.g., we can create an identity matrix with dimensions 3x3 by:

```
diag(3)
```

or by the number of columns in `M1` (which is 3):

```
diag(ncol(M1))
```
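As a quick check of the defining property, multiplying a matrix by an identity matrix of matching size should leave it unchanged. The following is only a minimal sketch; the matrix `A` is a made-up example and not part of the article's data.

```r
# A is a hypothetical 3x3 example matrix (not from the article)
A  <- matrix(as.numeric(1:9), nrow = 3)
I3 <- diag(3)                 # 3x3 identity matrix

left  <- I3 %*% A             # multiply by the identity from the left
right <- A %*% I3             # multiply by the identity from the right

# Both products reproduce A
identity_ok <- isTRUE(all.equal(left, A)) && isTRUE(all.equal(right, A))
identity_ok
```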

### Matrix multiplication

Suppose we want to compute the product of the transpose of `M1` by `M1`. This can be done by:

```
t(M1)%*%M1
```
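Base R also provides `crossprod()`, which computes the same product without forming the transpose explicitly. A minimal sketch, assuming `M1` is created as in the Data section (the `plyr` package is not needed for this check):

```r
set.seed(1000)
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)
colnames(M1) <- c("X1", "X2", "X3")

P1 <- t(M1) %*% M1     # explicit transpose, as in the text
P2 <- crossprod(M1)    # same product via base R's crossprod()

dim(P1)                # 3 x 3, since M1 has 3 columns

# Compare numerically, with a tolerance for rounding error
same_product <- isTRUE(all.equal(P1, P2))
same_product
```

The result is a square matrix whose size equals the number of columns of `M1`, regardless of how many rows `M1` has.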

### Covariance matrix

We compute the covariance matrix of `M1` directly as:

```
cov(M1)
```

### SS matrix

Likewise, we compute the sum-of-squares and cross-products matrix (SS matrix) of `M1` by:

```
n=nrow(M1) #number of rows in M1
SS=(n-1)*cov(M1)
SS
#           X1         X2         X3
# X1 42.485852 -6.7437071 -7.3797835
# X2 -6.743707 54.7612372 -0.8058014
# X3 -7.379783 -0.8058014 40.5334042
```

If we centre the data (subtract the centroid, i.e. the vector of column means, from each row):

```
M1s=scale(M1,scale=FALSE)
```

then the `SS` matrix can also be computed as:

```
SS1=t(M1s)%*%M1s
```
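The two routes should agree up to floating-point rounding. A minimal check, assuming `M1` is generated as in the Data section:

```r
set.seed(1000)
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)
colnames(M1) <- c("X1", "X2", "X3")

n   <- nrow(M1)
SS  <- (n - 1) * cov(M1)              # SS matrix via the covariance matrix
M1s <- scale(M1, scale = FALSE)       # centre the columns, do not rescale
SS1 <- t(M1s) %*% M1s                 # SS matrix via the centred data

# Compare numerically, with a tolerance for rounding error
ss_agree <- isTRUE(all.equal(SS, SS1))
ss_agree
```

Using `all.equal()` rather than `==` avoids spurious mismatches caused by rounding in the last digits.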

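Later we will use the fact that the SS matrix can also be computed from the uncentred data as:

```
centr=colMeans(M1) #centroid (column means) of M1
SS2 = t(M1)%*%M1-n*outer(centr,centr)
```

where (recall) `centr` is the centroid of `M1`, i.e. the vector of its column means.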

This is particularly useful for big-data computations, since `t(M1)%*%M1` can be computed for each data chunk separately via a `map` function and then summed up via a `reduce` step. The same holds for the row count `n` and the column sums from which `centr` is obtained.
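The chunk-wise idea can be sketched with base R's `Map()` and `Reduce()` standing in for a real map-reduce framework. This is only an illustration under assumptions: the split into 5 chunks of 10 rows and the names `chunk_id`, `chunks`, `partial` and `total` are hypothetical choices, not taken from the article. Each chunk contributes its cross-product, its column sums and its row count, and the reduce step adds these up.

```r
set.seed(1000)
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)
colnames(M1) <- c("X1", "X2", "X3")

# Hypothetical chunking: 5 chunks of 10 consecutive rows each
chunk_id <- rep(1:5, each = 10)
chunks <- lapply(split(seq_len(nrow(M1)), chunk_id),
                 function(idx) M1[idx, , drop = FALSE])

# "Map" step: per-chunk cross-product, column sums and row count
partial <- Map(function(X) list(xtx = t(X) %*% X,
                                sums = colSums(X),
                                n = nrow(X)),
               chunks)

# "Reduce" step: add the partial results component by component
total <- Reduce(function(a, b) list(xtx = a$xtx + b$xtx,
                                    sums = a$sums + b$sums,
                                    n = a$n + b$n),
                partial)

centr <- total$sums / total$n                       # centroid from the totals
SS2 <- total$xtx - total$n * outer(centr, centr)    # SS matrix

# Should agree with the direct computation up to rounding error
mr_agree <- isTRUE(all.equal(SS2, (nrow(M1) - 1) * cov(M1)))
mr_agree
```

Because every partial result is a small 3x3 matrix, a length-3 vector or a single number, only these summaries need to be passed between the map and reduce steps, never the raw rows.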

### Correlation matrix

Once we have an `SS` matrix (i.e., `SS2`), we can easily obtain the corresponding correlation matrix by:

```
D=diag(1/sqrt(diag(SS2)))
R=D%*%SS2%*%D
R1 = cor(M1,method=c("pearson"))
```

We can see that `R1` is numerically equal to `R`.
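One way to confirm this is to compare the two matrices with a numerical tolerance. The sketch below assumes `M1` is generated as in the Data section and compares values only, ignoring attributes such as row and column names:

```r
set.seed(1000)
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)
colnames(M1) <- c("X1", "X2", "X3")

n     <- nrow(M1)
centr <- colMeans(M1)
SS2   <- t(M1) %*% M1 - n * outer(centr, centr)

D  <- diag(1 / sqrt(diag(SS2)))     # inverse square roots of the diagonal
R  <- D %*% SS2 %*% D               # correlation matrix from the SS matrix
R1 <- cor(M1, method = "pearson")   # built-in Pearson correlation

# Compare the values only; attributes such as dimnames are ignored
cor_agree <- isTRUE(all.equal(R, R1, check.attributes = FALSE))
cor_agree
```

The factor `n-1` that separates the SS matrix from the covariance matrix cancels in the rescaling by `D`, which is why either matrix leads to the same correlations.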

### Eigenvalue decomposition

Note that `SS` is symmetric and hence has 3 real eigenvalues and 3 corresponding eigenvectors. We can compute them by:

```
ev = eigen(SS)
ev$values
# [1] 58.08193 46.86827 32.83029
ev$vectors
#            [,1]       [,2]      [,3]
# [1,]  0.4511108 -0.5672369 0.6890147
# [2,] -0.8798904 -0.4118343 0.2370348
# [3,] -0.1493050  0.7131864 0.6848892
```
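The decomposition can be checked against its defining relations. The following is a minimal sketch, assuming `M1` is generated as in the Data section; the names `V`, `L` and the `*_ok` flags are illustrative only.

```r
set.seed(1000)
M1 <- matrix(rnorm(150, 0, 1), ncol = 3)
SS <- (nrow(M1) - 1) * cov(M1)

ev <- eigen(SS, symmetric = TRUE)
V  <- ev$vectors                 # eigenvectors stored in columns
L  <- diag(ev$values)            # eigenvalues on the diagonal

# Defining relation: SS %*% V equals V %*% L
relation_ok <- isTRUE(all.equal(SS %*% V, V %*% L, check.attributes = FALSE))

# Eigenvectors of a symmetric matrix are orthonormal: t(V) %*% V = I
orthonormal_ok <- isTRUE(all.equal(t(V) %*% V, diag(3)))

# Spectral reconstruction: V %*% L %*% t(V) gives back SS
rebuild_ok <- isTRUE(all.equal(V %*% L %*% t(V), SS, check.attributes = FALSE))

c(relation_ok, orthonormal_ok, rebuild_ok)
```

Each eigenvector is determined only up to its sign, so the columns of `ev$vectors` may come out with flipped signs on a different platform or `R` version; the checks above are unaffected by this.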

© PRACE and University of Ljubljana