This content is taken from Coventry University's online course, Get ready for a Masters in Data Science and AI.

What if we need to store something other than a number?

A variable can store many types of different information, not just numbers. Knowing the different types of values, how we operate on them and how they interact with each other is very important in data science.

Mixing these data types in a single operation can cause errors, or quietly produce inaccurate results in any calculations you perform. For example, we can find the answer to 4.5 * 2 pretty easily, but trying 8 + "Hello" would stop your program with an error (in Python, a TypeError).
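Here is a minimal sketch of those two expressions in Python. The variable names are only for illustration, and the try/except is there so the example keeps running after the error instead of stopping.

```python
# Multiplying a float by an integer works and gives a float.
product = 4.5 * 2
print(product)  # 9.0

# Adding an integer to a string raises a TypeError.
try:
    result = 8 + "Hello"
except TypeError as error:
    result = None
    print("TypeError:", error)

# Converting the number to a string first makes our intent explicit.
joined = str(8) + "Hello"
print(joined)  # 8Hello
```

Python refuses to guess what 8 + "Hello" should mean, which is why it raises an error rather than returning a value.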

Integers (ints)

We saw an integer earlier when we stored our 12 apples into a variable. Integers are signed whole numbers. Signed means they can be either positive or negative. Whole numbers means they cannot be fractional like 4.223 or 1/2. Integers are mostly used when we need a whole-number count, e.g. storing the user’s age or the stock count of certain items in shops.

positiveInteger = 5
negativeInteger = -5062
reallyBigInteger = 123534232343
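As a small sketch of how integers behave in Python, the example below checks a variable's type and does some whole-number arithmetic. The names applesPerPerson and applesLeftOver are made up for this illustration.

```python
positiveInteger = 5
negativeInteger = -5062

# type() tells us what kind of value a variable holds.
print(type(positiveInteger))  # <class 'int'>

# Adding two integers gives another integer.
total = positiveInteger + negativeInteger
print(total)  # -5057

# Floor division (//) keeps the result whole; % gives the remainder.
applesPerPerson = 12 // 5
applesLeftOver = 12 % 5
print(applesPerPerson, applesLeftOver)  # 2 2
```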

Floating points (floats)

Floating point numbers are like integers, except that they can store fractional values. Floats are important in coding because they let us represent results that integers cannot. For example, when dividing, you will be able to see the fractional part of the answer. Bear in mind that floats are stored approximately, so very small rounding errors can appear in calculations.

positiveFloat = 1.2
negativeFloat = -3.354
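The sketch below shows division producing a float, and one well-known case where a float result is very slightly off. The variable names are illustrative; math.isclose is a standard-library function for comparing floats with a small tolerance.

```python
import math

# Ordinary division (/) returns a float, even for two integers.
halfOfSeven = 7 / 2
print(halfOfSeven)  # 3.5

# Floats are stored approximately, so tiny rounding errors can appear.
total = 0.1 + 0.2
print(total == 0.3)  # False

# Comparing with a tolerance is usually safer than using ==.
print(math.isclose(total, 0.3))  # True
```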

Strings

Strings store a sequence of characters, declared inside either single or double quotation marks. Strings are important as they give us a way of storing text and special characters within a variable. We can’t store these inside integers or floats! They also give us a fantastic way of storing qualitative data from our data sets. Some examples are listed below:

singleQuoteString = 'Hello World!'
doubleQuoteString = "DSAI is the Best!"
singleLetterString = "A"

Occasionally, you may want or need to format your string across multiple lines. There are two ways to do this: either by using the newline escape sequence (\n), or by declaring your string with triple quotation marks (""" or '''). So, if we wanted to store the sentence "I can be multiline!", and move each word to a new line, we could declare it like so:

multiLineString = "I\nCan\nBe\nMultiline!"

OR

tripleQuoteString = """I
Can
Be
Multiline!"""
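To check that the two approaches really store the same text, here is a short sketch. It is written without any spaces before \n, because a space placed there would be kept as part of the string. The name escapedString is made up for this example.

```python
escapedString = "I\nCan\nBe\nMultiline!"
tripleQuoteString = """I
Can
Be
Multiline!"""

# Both declarations produce exactly the same string.
print(escapedString == tripleQuoteString)  # True

# splitlines() breaks the text into a list with one entry per line.
lines = tripleQuoteString.splitlines()
print(lines)       # ['I', 'Can', 'Be', 'Multiline!']
print(len(lines))  # 4
```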

Booleans (Bools)

Booleans are the simplest variable type. They can only be either True or False. In Python, the integers 0 and 1 behave like False and True when used in a condition, so they are sometimes used as stand-ins. Note, however, that a variable assigned 0 or 1 holds an integer, not a Boolean. Using True and False also helps improve the readability of our code, since reading a word is easier than interpreting a number. Some examples are below:

isPowerOn = True
userLoadedCorrectly = False

OR

isPowerOn = 1
userLoadedCorrectly = 0
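The sketch below compares the two styles in Python. The names powerFlag and status are hypothetical, introduced only for this illustration.

```python
isPowerOn = True
powerFlag = 1

# The first variable is a Boolean, the second is an integer.
print(type(isPowerOn))  # <class 'bool'>
print(type(powerFlag))  # <class 'int'>

# The two values still compare as equal.
print(isPowerOn == powerFlag)  # True

# bool() converts a number to a real Boolean.
print(bool(0), bool(1))  # False True

# In a condition, 1 is treated as True and 0 as False.
if powerFlag:
    status = "on"
else:
    status = "off"
print(status)  # on
```

Both styles work in a condition, but True and False make it obvious that the variable is meant to be a yes/no value.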

Arrays of variables

Sometimes we will need to store more than one value inside a variable. We can do this using arrays, which Python calls lists. A list is an ordered collection of values, and the values within it can be changed after it has been created.

Using arrays, we are able to store lots of information inside a single variable. I’ll start off with an example and we’ll work through each part of it:

listOfNames = ["Daniel", "Will", "Brian"]

Above we have a variable named listOfNames which contains three strings - Daniel, Will and Brian.

We name our array like any other variable, making sure it is descriptive of what it is going to store.

We then surround our values with square brackets ([ ]) and separate each one with a comma (,).
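As a quick sketch of what we can do once the list exists, the example below checks its type, counts its values with len(), and adds one more with append(). The extra name added here is just sample data.

```python
listOfNames = ["Daniel", "Will", "Brian"]

# In Python, square brackets create a list.
print(type(listOfNames))  # <class 'list'>

# len() counts how many values are stored.
print(len(listOfNames))  # 3

# append() adds a new value to the end of the list.
listOfNames.append("Angelique")
print(listOfNames)       # ['Daniel', 'Will', 'Brian', 'Angelique']
print(len(listOfNames))  # 4
```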

It’s as simple as that. But one question arises: if we can retrieve a single value from a variable by calling just its name, how do we retrieve just one of the values from our array?

Indexing

We can use a technique called indexing to retrieve a specific value from the list. If we wanted to retrieve the first name in the list that we created above, we’d have to use the index operator ([]) on the variable like so:

listOfNames = ["Daniel", "Will", "Brian","Angelique","Laura"]
print(listOfNames[0])
Daniel

Arrays are “zero-indexed”, meaning that when retrieving values from the array, we start counting at 0 (first value [0], second value [1], ninth value [8] and so on). We can also use a negative number as our index to count back from the end of the list ([-1] retrieves the last value, [-2] the second-to-last, and so on).

The table below shows what we would retrieve from the list with certain index numbers.

Index (from the start)   0          1        2         3             4
Value                    "Daniel"   "Will"   "Brian"   "Angelique"   "Laura"
Index (from the end)     -5         -4       -3        -2            -1
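The rows of the table can be checked with a short sketch. It also shows that indexing lets us replace a value in the list; the replacement name "William" is made up for this example.

```python
listOfNames = ["Daniel", "Will", "Brian", "Angelique", "Laura"]

# Index 0 is the first value, index -1 is the last.
firstName = listOfNames[0]
lastName = listOfNames[-1]
print(firstName, lastName)  # Daniel Laura

# Index 3 and index -2 point at the same value.
print(listOfNames[3] == listOfNames[-2])  # True

# Assigning to an index replaces the value stored there.
listOfNames[1] = "William"
print(listOfNames)  # ['Daniel', 'William', 'Brian', 'Angelique', 'Laura']
```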

However, we must be careful when indexing a list. If we try to retrieve a value that doesn’t exist, Python raises an IndexError and, unless we handle it, our program stops. For example, suppose we tried to retrieve a sixth value from our list above. Since there isn’t actually a sixth value, our program will crash, as shown in the image below:

Index Out Of Range: screenshot of code showing a Python error generated when trying to reference the sixth name from a list of five names - IndexError: list index is out of range
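Below is a minimal sketch of two ways to guard against this error, assuming the same list of five names. The names sixthName, index and safeName are hypothetical, and neither approach is the only way to do it.

```python
listOfNames = ["Daniel", "Will", "Brian", "Angelique", "Laura"]

# Option 1: try the lookup and catch the error if it fails.
try:
    sixthName = listOfNames[5]
except IndexError as error:
    sixthName = None
    print("IndexError:", error)  # IndexError: list index out of range

# Option 2: check the index against the list length before using it.
index = 5
if 0 <= index < len(listOfNames):
    safeName = listOfNames[index]
else:
    safeName = None
print(safeName)  # None
```

With five names, the valid indexes from the start are 0 to 4, so index 5 is one past the end of the list.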

Your task

Time to put this into practice.

We’ve prepared an exercise in Jupyter Notebook on creating your own arrays and testing your knowledge of indexing.

Complete the exercises contained in the Jupyter Notebook zip file, available in the downloads area.

Hint: the answers are hidden at the bottom of the notebook if you get stuck.
