Want to keep learning?

This content is taken from the University of Dundee's online course, Data Science in the Games Industry. Join the course to learn more.

Skip to 0 minutes and 4 seconds SPEAKER: Relational databases are fantastic at storing data, in particular keeping it consistent and accurate. However, because of the technical design of this class of storage, it can be a problem to store the same data in multiple places at the same time and continue to keep the data consistent and accurate. Let’s think about two databases, one in New York and one in London. These are connected together and interrogated by applications in America and the UK. However, what happens if the same data with different values is written to the databases on both sides of the Atlantic? What will be the correct value to store? For instance, what if two people try and register the same username in both locations.

Skip to 0 minutes and 51 seconds Who gets the username? This will be made worse if the network between the two systems breaks. For a short time, both users are registered by both systems. When the network is fixed, which is the correct answer? This is called consistency. Both systems are inconsistent when the network is broken. As a designer, we need to decide what to do in this situation. Should we allow both systems to give different answers to their applications, or should our systems refuse to give an answer at all? We’re going to look at why this is such a problem, and how it can occur. Later in the week, you look at how modern systems can overcome this problem.

Skip to 1 minute and 36 seconds Although they can never fix it completely, they can be pretty good at making it look like there’s no longer a problem.

The CAP triangle

In the diagram below we can see that there are three relational databases in three locations.

We would like each of these databases to serve the same information to the client computers attached to each of them.

However, each of the clients may be adding and updating information to the data stores, each of the relational databases will be consistent (they will obey the rules of referential integrity internally) but how can we make sure that each database is updated with information from the other databases.

Before we can do that, we need to understand how each of the databases ensure integrity. How can each database ensure that it is internally consistent?

To do that we must understand transactions.

Share this video:

This video is from the free online course:

Data Science in the Games Industry

University of Dundee

Get a taste of this course

Find out what this course is like by previewing some of the course steps before you join: