This content is taken from the Kogod School of Business at American University's online course, Business Analytics: The Data Explosion. Join the course to learn more.

Skip to 0 minutes and 2 seconds In the past few sections we’ve been focused on the science of how people perceive the world around them. And in this section we’re going to leverage that science and leverage those insights, and apply it to actually building some great graphs and visuals. And specifically, we’re going to be looking at a line graph and a column graph, which are two of the more common types of graphics used in business presentations. So let’s go ahead and put the line graph up on the screen.

Skip to 0 minutes and 27 seconds And before we dive into taking this default graph and turning it into something that I believe is much more visually compelling, let’s actually try and understand the story of what’s going on here – what’s trying to be told, but may be hidden behind some unnecessary things. What we see here is a line graph comparing the units sold over time for two different stores, one in Los Angeles and one in San Francisco. And when I look at this, the story that I think is trying to be told is that in the first few periods in this reporting window, we see both stores see a decline in sales.

Skip to 1 minute and 4 seconds But the San Francisco store levels out, whereas the Los Angeles store continues on that downward trend throughout the reporting window that we see here. So if that’s the story that’s trying to be told, there’s a lot of unnecessary things. There’s a lot of unneeded cognitive load that sits within this graph that we should try and strip away, to make it much easier for the audience to understand that story. Maybe you can look at this graph and see a few things that don’t need to be there that are just clutter, that are getting in the way of the audience understanding. So take a quick look and see if you can find anything.

Skip to 1 minute and 41 seconds The first thing I might go after is this shaded background. A lot of times when you import a graphic into one of these packages, it will put this shaded background as a way to delineate the graph from the other parts of the slide. Well, the human eye doesn’t need that. The human eye understands where the edges of the graph are. So, it really is just wasted ink, if you will, that is even subconsciously getting in the way of the brain understanding what’s on this graph. So let’s go ahead and remove that background.

Skip to 2 minutes and 18 seconds And the next thing I want to do, there are two lines on this graph that I care about: the units sold over time for Los Angeles and San Francisco, the red and the blue line. That’s it. But if I look at this graph, there’s a lot of other lines on here that don’t need to be there; that the brain, whether you realise it or not, is trying to focus on and trying to put in some perspective. Let’s get rid of some of that. You don’t need your audience members’ brains focused on some of these lines. And the first ones I would go after are the borders. The border around the broad chart, the border around the plot. Let’s get rid of those. Those lines don’t need to be there.

Skip to 2 minutes and 52 seconds And then the next thing I would go after, that most default graphing packages put on in every graph, are these grid lines. Again, there are two lines that I care about, yet, there are nine other lines on this graph that are competing for brain power. So let’s get rid of those. And now you can see how much more cleanly the two lines that I care about pop. Right? I can really begin to focus on those two lines. The next thing I tend to try and remove are the data markers, right? Like, the squares on the red line and the circles on the blue line.

Skip to 3 minutes and 29 seconds Unless your goal is to have your audience be able to very specifically look at each data point along the line and know where it is, you really don’t need these. And that is almost never your intent. With a line graph like this, almost always you want your audience to focus on the trends in these lines over time and the relative position of one line to another, not necessarily the individual data points. So why am I putting all of these data points that, whether you realize it or not, your audience members’ brains are trying to process? You don’t need them to process that. You need them to understand that there’s something going on in the Los Angeles store.

Skip to 4 minutes and 7 seconds So, let’s get rid of those. Much, much cleaner. Addition by subtraction. Okay, let’s keep going, because there’s a few other things on this graph that I think we can still clean up. The next thing I tend to focus on is the legend. Most of the time, when you only have a few series worth of data, you really don’t need a legend. You’re forcing your audience’s eyes to dart from the line to the legend, from the line to the legend. What’s blue? What’s red? You don’t need to do that. Just label the lines like this. So, I am putting the label next to the line, again, using some of these Gestalt principles that we talked about.
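The clean-up steps so far (shaded background, borders, grid lines, data markers) could be sketched in code roughly as follows. This is only an illustration under stated assumptions: the video works in spreadsheet-style packages and names no code library, so Python with matplotlib is an assumption, the sales figures are invented, and `CLEAN_STYLE` and `draw_clean_lines` are hypothetical names.

```python
# Hypothetical units sold per period; the video does not give the real data.
LOS_ANGELES = [9, 8, 7, 6, 5, 4, 3, 2]      # keeps declining
SAN_FRANCISCO = [8, 7, 6, 6, 6, 6, 6, 6]    # declines, then levels out

# matplotlib rcParams that switch off the clutter discussed above.
CLEAN_STYLE = {
    "figure.facecolor": "white",   # no shaded background behind the chart
    "axes.facecolor": "white",     # no shaded plot area
    "axes.spines.top": False,      # no border around the plot
    "axes.spines.right": False,
    "axes.spines.left": False,
    "axes.spines.bottom": False,
    "axes.grid": False,            # no grid lines
    "lines.marker": "None",        # no marker on each data point
}


def draw_clean_lines(path="clean_lines.png"):
    """Plot both stores with the clutter removed and save the chart to `path`."""
    # matplotlib is imported here so the settings above can be read without it.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    with plt.rc_context(CLEAN_STYLE):
        fig, ax = plt.subplots()
        periods = range(1, len(LOS_ANGELES) + 1)
        ax.plot(periods, LOS_ANGELES, color="red")
        ax.plot(periods, SAN_FRANCISCO, color="blue")
        ax.tick_params(length=0)  # hide the small tick lines on both axes
        fig.savefig(path)
        plt.close(fig)
```

Calling `draw_clean_lines()` writes the image; the settings dictionary on its own shows which defaults are being turned off.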

Skip to 4 minutes and 47 seconds I’m using the principle of similarity, right? By not only putting it next to the line, but coding it in the same color. So very quickly the brain can say, oh, blue is San Francisco, red is Los Angeles, and there’s no darting around. Much cleaner. So now that we’ve tackled the inside of the graph, let’s actually move around the axes. And let’s start on the y-axis. And another thing that Excel and other packages tend to do that drives me a little crazy, is it puts these zeros to the right of the decimal place.
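As a rough sketch of labelling the lines directly instead of using a legend, the label for each series can be placed at the line's last point and drawn in the line's own colour. The data, the `SERIES` structure, and the function names are assumptions for illustration, as is the choice of matplotlib.

```python
# Hypothetical series; colours follow the video (red and blue lines).
SERIES = {
    "Los Angeles": {"color": "red", "units": [9, 8, 7, 6, 5, 4, 3, 2]},
    "San Francisco": {"color": "blue", "units": [8, 7, 6, 6, 6, 6, 6, 6]},
}


def end_labels(series):
    """Return one (x, y, text, color) label per line, placed at its last point."""
    labels = []
    for name, info in series.items():
        last_x = len(info["units"])   # periods are numbered from 1
        last_y = info["units"][-1]
        labels.append((last_x, last_y, name, info["color"]))
    return labels


def draw_labelled_lines(path="labelled_lines.png"):
    """Plot each line and write its name next to it, with no legend."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for info in SERIES.values():
        ax.plot(range(1, len(info["units"]) + 1), info["units"], color=info["color"])
    for x, y, text, color in end_labels(SERIES):
        # Nudge the text 5 points to the right of the line's end.
        ax.annotate(text, xy=(x, y), xytext=(5, 0), textcoords="offset points",
                    color=color, va="center")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
```

Because the label sits beside the line and shares its colour, proximity and similarity do the work the legend used to do.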

Skip to 5 minutes and 17 seconds And it does that because if your underlying data has data present to the right of the decimal place, it thinks that’s important to you and, therefore, puts it on your axes. But the problem is, the vast majority of the time you’re putting whole numbers on that axis. So, therefore, you just have a bunch of zeros on there for no reason. Look at all those zeros that are on this page that carry no information whatsoever. Let’s get rid of those. And while we’re at it, we really don’t need to be counting by ones up this axis. Let’s get some of those numbers off the page and make it look a lot cleaner.
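The two y-axis fixes described here, dropping the zeros after the decimal point and counting by a larger step, come down to choosing tick positions and formatting their labels. A minimal sketch, with hypothetical helper names:

```python
def whole_number_ticks(top, step=2):
    """Tick positions from 0 up to `top`, counting by `step`."""
    return list(range(0, top + 1, step))


def tick_label(value):
    """Format a tick without meaningless decimal zeros: 4.0 -> '4', 2.5 -> '2.5'."""
    if float(value).is_integer():
        return str(int(value))
    return str(value)
```

In matplotlib, for example, the results could be passed to `ax.set_yticks` and `ax.set_yticklabels`; other packages expose the same two choices under different names.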

Skip to 5 minutes and 51 seconds So now we’re counting by twos up to ten, we’ve removed any of the decimal places, or the data to the right of the decimal place, and that y-axis looks a lot better. So now let’s go after the x-axis. And you can see these date formats on a slant. You never want your audience to have to tilt their head to understand what’s going on in your graph. You should always try and figure out how to encode the data for the labels in that axis in a way that doesn’t have to be slanted. The other thing that’s happening here is there’s a lot of duplicated information. 2013 is printed on here three times.

Skip to 6 minutes and 29 seconds 2014 is printed on here four times, and so on. There is a way we can clean up all of that. So why don’t we do something like this instead? That just lists the quarters, and then the years once underneath it, and a little bolder, so the eye can focus on that bold. Again, we’re using some of those principles that we talked about before. And it just makes everything look a lot nicer. The last thing I would do, going back to – remember, the story we’re trying to tell is to focus our audience on what’s going on in Los Angeles.
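The two-level x-axis (quarters on top, each year printed once underneath) can be derived from the dates themselves. The sketch below assumes quarter-end dates starting in the second quarter of 2013, chosen only so that 2013 covers three periods and 2014 four, as in the video; the function names are hypothetical.

```python
from datetime import date


def quarter_of(d):
    """Calendar quarter (1 to 4) that a date falls in."""
    return (d.month - 1) // 3 + 1


def two_level_labels(dates):
    """Return (quarter_labels, year_labels); each year is printed once,
    under the first period in which it appears."""
    quarters, years = [], []
    previous_year = None
    for d in dates:
        quarters.append(f"Q{quarter_of(d)}")
        years.append(str(d.year) if d.year != previous_year else "")
        previous_year = d.year
    return quarters, years


# Hypothetical reporting periods.
PERIODS = [
    date(2013, 6, 30), date(2013, 9, 30), date(2013, 12, 31),
    date(2014, 3, 31), date(2014, 6, 30), date(2014, 9, 30), date(2014, 12, 31),
]
```

Short labels such as `Q2` fit horizontally, so nothing needs to be slanted, and the year row can be set in bold to anchor the eye.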

Skip to 7 minutes and 2 seconds And if you really want your audience to focus on that Los Angeles line, maybe you have it jumped forward a little more, or have the San Francisco line, in this case, go to the back a little more. And you can use color, again, to do that. So in this case, I take San Francisco from blue to grey, and it sort of falls back a little bit. So now, my audience can immediately see what I want them to see – what’s going on in Los Angeles. All of that wasted stuff has been removed. And we’ve transitioned to a graph that once looked like this, and now looks like this.
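Pushing one series to the background is a colour assignment: the series the story is about keeps a strong colour and everything else goes to grey. A minimal sketch, with a hypothetical helper name and default colours taken from the video:

```python
def emphasis_colors(names, focus, highlight="red", muted="grey"):
    """Give the focus series the highlight colour and mute all the others."""
    return {name: (highlight if name == focus else muted) for name in names}


colors = emphasis_colors(["Los Angeles", "San Francisco"], focus="Los Angeles")
```

The resulting mapping can then be used when drawing each line, so changing the story only means changing `focus`.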

Skip to 7 minutes and 37 seconds Not only is it much, much better aesthetically, it actually is leveraging the science that we talked about to enable your audience to very quickly get at the thing you want them to get at, without you even saying a word. So hopefully that’s helpful in the way you think about line graphs. Let’s transition into looking at another type of graph that’s pretty common, the column graph. So, just like we did before, let’s put our graph up here and orient ourselves to the story that’s trying to come out of this graph, even if it’s being clouded by a bunch of unnecessary things. And here we see the average sale per customer by a number of stores, A through J.

Skip to 8 minutes and 25 seconds And you can see that there’s a pretty wide variation between stores that have a very high average sale per customer, and a low average sale per customer. So, I think the story is trying to understand what’s driving those high store sales and low store sales by customer, but there’s a lot of things getting in the way of being able to understand that here. First thing I would focus on here is the fact that these columns are depicted in three dimensions. I can almost never think of a scenario where you need to show a graph in three dimensions. It doesn’t add any value. And in fact, it actually is distracting.

Skip to 9 minutes and 1 second If I’m an audience member and I’m looking at these three-dimensional columns, should I focus on the front? Do I focus on the back? What’s going on there? Just get rid of it. And I’m not going to belabor a bunch of the things that I talked through on the line graph, right? Like, we don’t need these borders, we don’t need the axis lines or the grid lines. We should get rid of the zeros to the right of the decimal place. Let’s just make all of those changes in one fell swoop. And now we can focus on some things that are more intrinsic issues I find in column graphs.

Skip to 9 minutes and 33 seconds The first one that I’m going to focus on is the fact that the columns along the x-axis are actually ordered alphabetically by store. Given the story we’re trying to tell, the alphabetical order of the stores isn’t actually relevant. What I care about is the different levels of sales, right? Like, I care about the value on the y-axis for each of the stores on the x-axis, not the alphabetical order of the stores. Excel does that by default. A lot of times when you pull data out of a data warehouse, maybe it’s organized alphabetically by store, and you just paste it in, and then draw your graph, and this gets built.

Skip to 10 minutes and 13 seconds So you need to rearrange the underlying data. But we should arrange it in a way that bangs home the point we’re trying to make, which is there’s a lot of variability between the highest average sale per customer in store B and the lowest in store H, right? And this allows you to see that much more cleanly. The next thing I’m going to go after is probably the first thing everybody noticed, which is these atrocious colors. A lot of times when you have different data points, like stores, in your series on a column chart, some packages will color them differently, so you can distinguish between one and another.
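Rearranging the underlying data so the columns run from highest to lowest is a single sort on the values rather than the store names. The figures below are invented for illustration (the video gives no numbers), chosen so that store B is highest and store H lowest as described; `by_value_descending` is a hypothetical helper name.

```python
# Hypothetical average sale per customer, keyed by store, in alphabetical order
# as it might arrive from a data warehouse.
AVERAGE_SALE = {
    "A": 52, "B": 95, "C": 48, "D": 22, "E": 55,
    "F": 50, "G": 88, "H": 18, "I": 47, "J": 53,
}


def by_value_descending(values):
    """Return (store, value) pairs ordered from highest value to lowest."""
    return sorted(values.items(), key=lambda item: item[1], reverse=True)


ordered = by_value_descending(AVERAGE_SALE)
```

Plotting `ordered` instead of the alphabetical data puts the highest and lowest stores at opposite ends of the chart, which is where the eye looks for extremes.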

Skip to 10 minutes and 52 seconds It is really jarring, and that is not a place where you need to use colors. So let’s start by pulling all of those back into just one color. So now we can focus on the trend, and not be distracted by the colors that are there. But in my mind, we can go a step further than that. Maybe a couple of steps. What we’re trying to show is: look, we’ve got a couple of stores, B and G, that are doing really well and have some very high average sales per customer, and a couple of stores that aren’t doing so well in D and H. And then we have a bunch of stores in the middle.

Skip to 11 minutes and 23 seconds We can use color to our advantage here, right? We can use those pre-attentive attributes that we talked about. We can use similarity, right? We’re grouping blue, and we’re grouping grey, and we’re grouping red. And we’re saying these are stores that are high, these are stores in the middle, these are stores that are low. And now very quickly our audience can understand what’s going on here. And I’m going to go against the trend a little bit here in this last piece, and I’m actually going to add some data that I think can be useful in situations like this.

Skip to 11 minutes and 56 seconds If you’re trying to show that you’ve got some that are around the middle, and some that are higher, and some that are lower, sometimes it can be useful to actually add an average line. So, now very easily, the audience can see, oh yes, I see that those columns are about the average and these are higher and these are lower. Notice that I’ve also encoded that line and the label of that line in the same color as the bars around the middle. So, the eye immediately sees them as all going together, and therefore those columns are right at the average.
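The colour grouping and the average line can be sketched together. The video does not say how the high, middle, and low groups were chosen, so the rule here (more than 25% above or below the mean) is an assumption, as are the invented figures, the helper names, and the use of matplotlib.

```python
from statistics import mean

# Hypothetical average sale per customer, keyed by store.
AVERAGE_SALE = {
    "A": 52, "B": 95, "C": 48, "D": 22, "E": 55,
    "F": 50, "G": 88, "H": 18, "I": 47, "J": 53,
}


def group_color(value, average, band=0.25):
    """Blue if well above the average, red if well below, grey otherwise.
    The 25% band is an assumed cut-off, not one taken from the video."""
    if value > average * (1 + band):
        return "blue"
    if value < average * (1 - band):
        return "red"
    return "grey"


def draw_columns(path="columns.png"):
    """Draw the columns from highest to lowest, coloured by group, with an average line."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    average = mean(AVERAGE_SALE.values())
    ordered = sorted(AVERAGE_SALE.items(), key=lambda item: item[1], reverse=True)
    stores = [store for store, _ in ordered]
    values = [value for _, value in ordered]
    positions = list(range(len(stores)))

    fig, ax = plt.subplots()
    ax.bar(positions, values, color=[group_color(v, average) for v in values])
    ax.set_xticks(positions)
    ax.set_xticklabels(stores)
    # The line and its label share the grey of the middle group.
    ax.axhline(average, color="grey", linewidth=1)
    ax.annotate(f"Average: {average:.0f}", xy=(positions[-1] + 0.4, average),
                xytext=(0, 3), textcoords="offset points", color="grey", ha="right")
    fig.savefig(path)
    plt.close(fig)
```

With these invented figures, stores B and G come out blue, D and H red, and the rest grey, matching the three groups described in the video.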

Skip to 12 minutes and 29 seconds So we’ve gone from a graph that is annoyingly complex, to something that makes it very easy to understand what’s going on.

Building great graphs: line and bar charts

In this video, we look at ways to simplify and improve two different types of visual: line graphs and bar charts.

Professor Rinehart will show how we can reduce cognitive load by removing unnecessary information from these charts, allowing us to tell a much more compelling story with our data. Guidance on how to make these changes is provided in the next step.

We cover a lot of information in this video, so you might want to watch it in two parts. After watching the first section (where we discuss line graphs), pause for a moment to reflect on what we’ve discussed, then watch the second section. If you have any thoughts or questions about the video, share these with other learners by adding a comment.
