# How to Create and Manipulate Variables and Vectors in R

## What is a Variable and what is a Vector

**Definition of a variable**. A variable is an object in R that you can use to store values. It can hold a single value or the result of a basic or complex arithmetic operation, or it can be something more complex, such as a column in a data matrix or a data frame. We will see these more complex forms in later steps of this course.

**Definition of a vector**. A vector is an ordered collection of values and the simplest data structure in R. It can hold numbers, the results of arithmetic expressions, logical values or character strings, for example. However, all components of a vector must have the same **mode**; the available modes are **numeric**, **logical**, **character**, **complex** and **raw**.
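As a rough illustration of the five modes, the sketch below builds one small vector of each mode and asks R for its mode with `mode()`. The variable names are only illustrative. It also shows what happens when values of different modes are combined with `c()`: R converts them to a single common mode rather than keeping them mixed.

```r
# One example vector per mode (names are illustrative only)
num_vec <- c(1.5, 2, 3)          # numeric
lgl_vec <- c(TRUE, FALSE)        # logical
chr_vec <- c("a", "b")           # character
cpl_vec <- c(1+2i, 3-1i)         # complex
raw_vec <- as.raw(c(1, 255))     # raw (bytes)

modes <- c(mode(num_vec), mode(lgl_vec), mode(chr_vec),
           mode(cpl_vec), mode(raw_vec))
print(modes)

# Mixing modes in c() coerces every element to one common mode
mixed <- c(1, TRUE, "a")
print(mode(mixed))   # "character"
print(mixed)         # "1" "TRUE" "a"
```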

## How to create and manipulate Variables

**Step 1**. We recommend that you work in the same working sub-directory that you created previously for analyses conducted with R. If the sub-directory has not been created yet, or was removed by mistake, create it again and then launch R:

```
$ mkdir exerciseR
$ cd exerciseR
$ R
```

**Step 2**. If you forgot to move into that directory before launching R, you can still **set** the correct **working directory** from within R using the “**setwd()**” command, and then check your position using the “**getwd()**” command (this will also be helpful in RStudio). You can use “**getwd()**” in R just as you used “**pwd**” in Unix. Launch R:

```
$ R
```

```
> setwd("/Users/imac/Desktop/exerciseR")
> getwd()
[1] "/Users/imac/Desktop/exerciseR"
```
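`setwd()` stops with an error if the target directory does not exist. The sketch below is one way to create the directory from within R before moving into it. The path is an assumption made only so that the example runs on any machine: it uses a sub-directory of R's temporary directory instead of the Desktop path shown above.

```r
# Hypothetical location: a sub-directory of R's temporary directory,
# used here only so the example runs on any machine
work_dir <- file.path(tempdir(), "exerciseR")

# Create the directory only if it is missing
if (!dir.exists(work_dir)) {
  dir.create(work_dir)
}

old_dir <- setwd(work_dir)   # setwd() invisibly returns the previous directory
print(basename(getwd()))     # "exerciseR"

setwd(old_dir)               # go back to where we started
```

In your own session you would replace `work_dir` with the real path to your exerciseR directory.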

**Step 3**. Let’s create a simple variable called x. We need to assign a value to this variable. The **assignment** to a variable can be done in 2 different but equivalent ways, using either the “**<-**” or the “**=**” operator. You can retrieve the value of x simply by typing x:

```
> x <- 3 * 4 + 2 * 5 + 3
> x = 3 * 4 + 2 * 5 + 3
> x
[1] 25
```
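At the prompt the two operators give the same result. One place where they differ is inside a function call, where “=” names an argument while “<-” still performs an assignment. The sketch below shows this under the assumption that it runs inside `local()`, which is used here only to keep the example from touching your own variable x.

```r
result <- local({
  # "=" inside a call names the argument x of mean(); no variable is created
  m1 <- mean(x = c(2, 4, 6))
  before <- exists("x", inherits = FALSE)

  # "<-" inside a call creates the variable x, then passes its value to mean()
  m2 <- mean(x <- c(2, 4, 6))
  after <- exists("x", inherits = FALSE)

  list(m1 = m1, m2 = m2, x_exists_before = before, x_exists_after = after)
})
print(result)   # both means are 4; x exists only after the "<-" call
```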

**Step 4**. Let’s create another variable called **y**. It can either contain a new value or, for example, the result of a basic or more complex operation on the first variable **x**:

```
> y <- x^4 - 4*x + 5
> y
[1] 390530
```
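A point that often surprises newcomers is that y stores the result of the calculation, not the formula. The short sketch below, which reuses the same expression, suggests that changing x afterwards does not change y until the assignment is run again.

```r
x <- 25
y <- x^4 - 4*x + 5   # y is computed once, from the current value of x
print(y)             # 390530

x <- 2               # changing x afterwards...
print(y)             # ...leaves y unchanged: 390530

y <- x^4 - 4*x + 5   # re-run the assignment to refresh y
print(y)             # 13
```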

**Note 1**. Naming a variable is not trivial and must be done appropriately:

- Variable names **can contain** letters, numbers, underscores and periods

- Variable names **cannot start** with a number or an underscore

- Variable names **cannot contain** spaces at all

```
> x.length <- 3*2
> x.length
[1] 6
> _x.length <- 3*2
Error : unexpected input in "_"
> 3x.length <- 3*2
Error : unexpected symbol in "3x.length"
```
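If you are unsure whether a name is acceptable, one possible check is to compare it with what the base R function `make.names()` returns, since that function rewrites invalid names into valid ones and leaves valid names untouched. The helper `is_valid_name()` below is hypothetical and written only for this illustration. Note that it also rejects reserved words such as `if`.

```r
# Hypothetical helper: a name is treated as valid if make.names()
# leaves it unchanged
is_valid_name <- function(name) {
  name == make.names(name)
}

candidates <- c("x.length", "x_length", "xLength",
                "_x.length", "3x.length", "x length")
validity <- is_valid_name(candidates)
names(validity) <- candidates
print(validity)   # TRUE for the first three names, FALSE for the last three
```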

**Note 2**. Long variable names are allowed, but the words in them should be separated using one of these conventions:

- Periods to separate words: **x.y.z**

- Underscores to separate words: **x_y_z**

- Camel Case to separate words: **XxYyZz**

```
> x.length <- 3*2
> x.length
[1] 6
> x_length <- 3*2
> x_length
[1] 6
> xLength <- 3*2
> xLength
[1] 6
```

## How to create and manipulate Vectors

**Step 1**. A vector can be created using a built-in R function called **c()**. Elements must be comma-separated.

```
> c(10, 20, 30)
[1] 10 20 30
```

**Step 2**. A vector can be of different modes: numeric (and arithmetic), logical, or can consist of characters

```
> c(1.1, 2.2, 3.5) # numeric
[1] 1.1 2.2 3.5
>
> c(FALSE, TRUE, FALSE) # logical
[1] FALSE TRUE FALSE
>
> c("Darth Vader", "Luke Skywalker", "Han Solo") # character
[1] "Darth Vader" "Luke Skywalker" "Han Solo"
```

**Note**. When the value is of the **character** data type, quotation marks must be used around each value, as in "Han Solo".

**Step 3**. A vector can be assigned to a variable name using 3 methods: the “**<-**” operator, the “**=**” operator, or the **assign()** function. There is also a fourth form, which you will very rarely see, that reverses the direction of the assignment with the “**->**” operator:

```
> assign("x", c(10, 20, 30))
> x
[1] 10 20 30
>
> x <- c(10, 20, 30)
> x
[1] 10 20 30
>
> x = c(10, 20, 30)
> x
[1] 10 20 30
>
> c(10, 20, 30) -> x
> x
[1] 10 20 30
```
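One situation where `assign()` can be handy is when the variable name is only known while the code is running. The sketch below builds three names with `paste0()` and reads one back with `get()`. The names `sample_1` to `sample_3` are made up for this illustration.

```r
# Build variable names at run time and assign a value to each
for (i in 1:3) {
  var_name <- paste0("sample_", i)   # "sample_1", "sample_2", "sample_3"
  assign(var_name, i * 10)
}

print(sample_2)          # 20
print(get("sample_3"))   # 30
```

For everyday work the “<-” operator remains the usual choice.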

**Step 4**. In R, every object has intrinsic properties that describe its fundamental components, such as its mode, which can be retrieved with the function “**mode()**”, and its length, which can be retrieved with the function “**length()**”. An empty vector can be created and will still have a **mode**:

```
> v <- numeric()
> w <- character()
```

```
> mode(x)
[1] "numeric"
> mode(v)
[1] "numeric"
> mode(w)
[1] "character"
> length(x)
[1] 3
```
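To complement the calls above, the following sketch looks at the length of the empty vectors and at one way an empty vector can grow. It assumes nothing beyond base R, and the name `zeros` is illustrative.

```r
v <- numeric()
w <- character()
print(length(v))   # 0
print(length(w))   # 0

# vector() builds a vector of a given mode and length, filled with a default value
zeros <- vector("numeric", 3)
print(zeros)       # 0 0 0

# Assigning beyond the current length grows the vector; the gap is filled with NA
v[2] <- 7
print(v)           # NA 7
print(length(v))   # 2
print(mode(v))     # "numeric"
```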

**Step 5**. Basic operations with numeric vectors. Arithmetic is applied to each element in turn:

```
> x <- c(10, 20, 30)
> x
[1] 10 20 30
> 1/x
[1] 0.10000000 0.05000000 0.03333333
```

**Step 6**. A vector can be used in arithmetic expressions and/or as a combination of existing vectors

```
> x <- c(10, 20, 30)
> y <- x*3+4
> y
[1] 34 64 94
> z <- c(x, 0, 0, 0, x)
> z
[1] 10 20 30 0 0 0 10 20 30
> w <- 2*x + y + z
> w
[1] 64 124 184 54 104 154 64 124 184
```
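The last result has 9 elements even though x and y only have 3. This is because R recycles the shorter vectors, reusing their elements from the start until they match the longest vector in the expression. A minimal sketch of this recycling rule, with illustrative names `short` and `long`:

```r
short <- c(1, 2, 3)
long  <- c(10, 20, 30, 40, 50, 60)

# short is reused as 1 2 3 1 2 3 to match the length of long
print(short + long)      # 11 22 33 41 52 63

# A single number is a vector of length 1, recycled across every element
print(short * 2)         # 2 4 6

# If the longer length is not a multiple of the shorter one, R still
# recycles but issues a warning (silenced here with suppressWarnings())
res <- suppressWarnings(c(1, 2) + c(10, 20, 30))
print(res)               # 11 22 31
```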

**Step 7**. A vector can be passed to built-in functions in R, such as **mean()** to calculate the mean of an object (here x), **var()** to calculate its variance, and **sort()** to sort its content (here the content of object z).

```
> mean(x)
[1] 20
> var(x)
[1] 100
> sort(z)
[1] 0 0 0 10 10 20 20 30 30
```
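To see where the value 100 comes from, the sketch below recomputes the variance by hand, assuming the sample variance formula (division by n - 1) that `var()` uses. It also shows the `decreasing` argument of `sort()`. The name `manual_var` is illustrative.

```r
x <- c(10, 20, 30)
z <- c(x, 0, 0, 0, x)

# Sample variance: squared deviations from the mean, divided by n - 1
n <- length(x)
manual_var <- sum((x - mean(x))^2) / (n - 1)
print(manual_var)   # 100
print(var(x))       # 100

# The standard deviation is the square root of the variance
print(sd(x))        # 10

# sort() can also return the values from largest to smallest
print(sort(z, decreasing = TRUE))   # 30 30 20 20 10 10 0 0 0
```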

**Step 8**. R provides built-in functions and operators to generate regular sequences. Here are examples of how to use **rep()** to repeat items (arguments needed are the value to repeat and the number of repeats) and **seq()** to create a sequence of items (arguments needed are the start, the end, and the interval).

```
> a <- c(1:10)
> a
[1] 1 2 3 4 5 6 7 8 9 10
> b <- rep(a, times=2)
> b
[1] 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10
> b <- rep(a, each=2)
> b
[1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10
> c <- seq(-2, 2, by=.5)
> c
[1] -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0
```
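Both functions accept a few other arguments that may be useful. The sketch below shows some of them; the variable names are illustrative, and the choice of examples is ours rather than part of the original exercise.

```r
# Ask for a number of points instead of an interval
quarters <- seq(0, 1, length.out = 5)
print(quarters)     # 0.00 0.25 0.50 0.75 1.00

# Give one repeat count per element
letters_rep <- rep(c("a", "b"), times = c(2, 3))
print(letters_rep)  # "a" "a" "b" "b" "b"

# seq_len(n) counts from 1 to n
print(seq_len(4))   # 1 2 3 4

# The colon operator also counts downwards
print(10:7)         # 10 9 8 7
```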

**Step 9**. The content of a vector can be compared to another using basic operators

```
> x==x
[1] TRUE TRUE TRUE
> x==y
[1] FALSE FALSE FALSE
> x!=y
[1] TRUE TRUE TRUE
```
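These comparisons return one logical value per element. When a single answer is needed, functions such as `all()`, `any()` and `identical()` can summarise the result. The sketch below also includes a caution about comparing decimal numbers with “==”, for which `all.equal()` is a common alternative.

```r
x <- c(10, 20, 30)
y <- x * 3 + 4            # 34 64 94

print(all(x == x))        # TRUE: every element matches
print(any(x == y))        # FALSE: no element matches
print(identical(x, y))    # FALSE: the two objects are not exactly the same
print(x < y)              # TRUE TRUE TRUE

# Decimal numbers are stored approximately, so "==" can be misleading
print(0.1 + 0.2 == 0.3)                   # FALSE
print(isTRUE(all.equal(0.1 + 0.2, 0.3)))  # TRUE (equal within a small tolerance)
```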

**Step 10**. The content of a vector can easily be queried and modified. To do this, you can use index vectors inside square brackets to select some elements of an existing vector:

```
> x
[1] 10 20 30
> x[3]
[1] 30
> x[3] <- 50
> x
[1] 10 20 50
> length(x)
[1] 3
```
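An index vector does not have to be a single position. As a sketch of the other common forms, the example below selects several positions at once, excludes a position with a negative index, and selects by a logical condition.

```r
x <- c(10, 20, 50)

print(x[c(1, 3)])      # 10 50  (several positions)
print(x[-1])           # 20 50  (everything except position 1)
print(x[x > 15])       # 20 50  (elements that satisfy a condition)
print(which(x > 15))   # 2 3    (positions of those elements)

# The same index forms work on the left-hand side of an assignment
x[x > 15] <- 0
print(x)               # 10 0 0
```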

**Step 11**. The elements of a vector can also be given names. Once a “names” attribute is set, an index vector of character strings can be used to identify components and query the data.

```
> dairy <- c(10, 20, 1, 40)
> names(dairy) <- c("milk", "butter", "cream", "yogurt")
> breakfast <- dairy[c("milk","yogurt")]
> breakfast
  milk yogurt
    10     40
```
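Names can also be given directly when the vector is created, and they work for modifying elements as well as for querying them. The sketch below assumes the same dairy values as above; the extra element "cheese" is made up for this illustration.

```r
# Names can be set at creation time with name = value pairs
dairy <- c(milk = 10, butter = 20, cream = 1, yogurt = 40)
print(names(dairy))        # "milk" "butter" "cream" "yogurt"

# Modify an element by name
dairy["cream"] <- 5
print(dairy[["cream"]])    # 5

# Assigning to a name that does not exist yet appends a new element
dairy["cheese"] <- 15
print(length(dairy))       # 5

# Combine names with a logical condition
big_items <- names(dairy)[dairy > 10]
print(big_items)           # "butter" "yogurt" "cheese"
```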

## Discussion

Now try it yourself and discuss in the comment area below:

**Question 1**. Did you manage to create and manipulate Variables?

**Question 2**. Did you manage to create and manipulate Vectors?

## Exercise

Let’s try it!

**Question 1**. Could you create 3 vectors:

- a vector x containing the numbers 3, 10 and 30

- a vector m containing the content of x repeated twice

- a vector n containing two copies of x separated by a 0

**Question 2**. Is the content of m equal to the content of n?

**Question 3**. Note that you should also obtain a warning message because the 2 vectors are not of the same length. How can you check the length of both vectors?

#### Bioinformatics for Biologists: An Introduction to Linux, Bash Scripting, and R
