Identifying Properties

This video demonstrates how to count how many observations have a given property.
All right, let’s dive a little bit deeper into those Indy flights. We saw that among the first six flights, the third, fourth, fifth, and sixth of them started in Indy. Okay, let’s see how many flights started in Indy altogether. The neat thing about R as compared to lots of other computing platforms is you don’t have to write a loop to look through every single element in some kind of vector or some kind of data structure that you wanna study. R is vectorized, meaning R is naturally really good at running functions that operate on every single element of some kind of vector or some other kind of data structure. R has very powerful functions.
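To make the contrast concrete, here is a minimal sketch comparing an explicit loop with the vectorized comparison. The vector `origins` and its six airport codes are made up for illustration (they mirror the first six flights described in the video). They are not read from the course data file.

```r
# Hypothetical toy vector standing in for the origin column
origins <- c("IAD", "IAD", "IND", "IND", "IND", "IND")

# Loop version: check each element one at a time
from_indy_loop <- logical(length(origins))
for (i in seq_along(origins)) {
  from_indy_loop[i] <- origins[i] == "IND"
}

# Vectorized version: one comparison applied to every element at once
from_indy_vec <- origins == "IND"

# Both approaches produce the same logical vector
identical(from_indy_loop, from_indy_vec)
```

The vectorized line does the same work as the whole loop, which is why the video never needs to write one.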
So for instance, let’s look at the origin cities and see which ones are equal to Indy. I don’t wanna see all of the resulting values. I just wanna see the first six things that come from this command. So what am I gonna do? I’m gonna take the origin column out of the data frame, that whole column of origins, and see which ones are equal to Indy, to Indianapolis. I use a == to check and see if they’re equal or not. And I just go through: is the first flight coming from Indy? Is the second flight coming from Indy? Is the third flight coming from Indy?
And so on, and I go through all 7 million flights and check. And you notice R ran it almost immediately, with almost no delay. Indeed, the first two are not coming from Indy, right? They were coming from Dulles. And the third, fourth, fifth, and sixth ones were coming from Indy. Now if I try and sum up those values, you say to yourself, well, how could you sum trues and falses? Very commonly, when we’re computing in many different environments, not just in R, falses get converted to 0s and trues get converted to 1s if we try and add something up. Let’s take this previous command, and instead of doing the head, we’ll do a sum.
And I’m gonna make myself a note here before I run that: false values are converted to 0s, and true values are converted to 1s. So sum just adds up the total, which yields the number of flights departing from Indy in 2008. So now I go back up to this line, Cmd+Return to run it. 42,000 of our 7 million flights departed from Indianapolis in 2008. That is neat, and that took hardly any time. I didn’t have to write a loop, I just asked R, and R let me know. You would think that there would be a similar number of destinations that would be Indy as well.
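The two commands from this part of the video might look roughly like the sketch below. The data frame name `myDF` and the column name `Origin` are assumptions, since the transcript does not spell them out, and the six-row data frame is a made-up stand-in for the 7 million row file.

```r
# Hypothetical stand-in for the 2008 flights data (six rows only)
myDF <- data.frame(
  Origin = c("IAD", "IAD", "IND", "IND", "IND", "IND"),
  stringsAsFactors = FALSE
)

# Is each flight's origin equal to "IND"? Show only the first six answers.
head(myDF$Origin == "IND")

# FALSE values are converted to 0s and TRUE values to 1s,
# so sum() counts how many flights departed from IND.
sum(myDF$Origin == "IND")
```

On this toy data the sum is 4, because four of the six rows have `"IND"` as their origin. On the full data set the same line would count every Indianapolis departure.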
So, we could go check that very easily too. You’ll notice I’m just copying and pasting my code. When I highlight a line of code, I’m holding down Shift and hitting the down arrow to highlight. And then I’m using Cmd+C to copy and Cmd+V to paste. You can see that under the Edit menu here, copy is Cmd+C and paste is Cmd+V. I tend to use shortcuts when I’m typing, so I don’t have to use my mouse too much. Let’s go look at how many of the destination cities are Indy. So I just changed the origin to the destination there. I go ask R, and it’s almost exactly the same number.
I mean, we fully expect that the number is slightly different, because a few airplanes may or may not still be sitting in Indianapolis, a few more of them at the beginning of the year versus at the end of the year. And sometimes airplanes are taken in and out of service. But a really similar number of them landed in Indianapolis, as compared to the number of them that departed from Indianapolis. That’s pretty neat. So we’re already getting our minds wrapped around how R works, and our hands kind of used to diving into the data and answering some questions about the data set that we’ve got.
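The origin and destination counts can be compared side by side, as in this minimal sketch. The names `myDF`, `Origin`, and `Dest` are assumptions, and the eight flights are invented so that the two counts come out close but not equal, as they do in the video.

```r
# Hypothetical toy data: eight invented flights
myDF <- data.frame(
  Origin = c("IAD", "IND", "IND", "ORD", "IND", "ATL", "DEN", "IND"),
  Dest   = c("IND", "BWI", "ORD", "IND", "ATL", "IND", "SFO", "DEN"),
  stringsAsFactors = FALSE
)

departures <- sum(myDF$Origin == "IND")  # flights that started in Indy
arrivals   <- sum(myDF$Dest == "IND")    # flights that ended in Indy

# How far apart are the two counts?
departures - arrivals
```

Only the column name changes between the two counts, which is why copying and pasting the previous command works so well here.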
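Note: R is case-sensitive. If you capitalize a command that is written in lowercase (for example, typing Sum instead of sum), you will receive an error. Avoid starting a new line of code with a capital letter unless the name really is capitalized.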
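A short sketch of what the note describes: R treats `sum` and `Sum` as different names, and base R does not define a function called `Sum`. The vector `x` is made up for illustration, and `tryCatch` is used here only so the error message can be captured instead of stopping the script.

```r
x <- c(TRUE, FALSE, TRUE)

# Lowercase name: works, and counts the TRUE values
sum(x)

# Capitalized name: R looks for a function called Sum and cannot find one.
# tryCatch() captures the error message as text instead of halting.
result <- tryCatch(Sum(x), error = function(e) conditionMessage(e))
result
```

If your editor capitalizes the first letter of a new line automatically, this is the kind of error you would see.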
Notice the number of flights that departed from Indianapolis (IND) in 2008. Does the number surprise you? Search for the number of departing flights from another US city. First, try to guess how many there will be. Then, see what the actual number was. What was the difference between your guess and the actual total? Refer to http://www.airportcodes.org/ for a list of airport codes.
This article is from the free online

Introduction to R for Data Science

Created by
FutureLearn - Learning For Life