Incorporating Auxiliary Data about Airports

Watch Dr. Mark Ward discuss auxiliary data about airports.
Another thing we commonly wanna do when we’re doing data analysis is bring in another source of data. So I’m gonna go back to the Data Expo 2009 website. You’ll see the same link we used before to download the data. But there are also supplemental data sources, and it’s often a great idea when you’re doing data analysis to see what other kinds of data you can bring into the picture to give you a little more insight into your data. So there’s an airports.csv file. I’m gonna hold Ctrl, since I’m on the Mac, and click on it. And then it’s gonna let me choose “Download Linked File As”.
Now if you are on Windows, you probably wanna right-click on airports.csv and then you can decide where to save it. I’m gonna save it in my Downloads folder because that’s where I’ve got everything else. Now if I come back into R, I should be able to make an airportsDF and do a read.csv and reference my /users/mark/Downloads/airports.csv file. And I’ve brought it into memory. You’ll see there are 3,376 rows and 7 columns, and you can see that here too. So here we’re importing the data about the airports themselves: locations, et cetera.
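The import step can be sketched as follows. This is a minimal stand-in, not the actual file: the four rows are typed by hand with approximate coordinates, and the data frame name `airportsDF` and the column names (`iata`, `airport`, `city`, `state`, `country`, `lat`, `long`) are assumptions based on what the video describes.

```r
# Stand-in for airports.csv: four hand-typed rows (coordinates approximate).
# Column names are assumed to match the layout described in the video.
csv_text <- "iata,airport,city,state,country,lat,long
CMH,Port Columbus International,Columbus,OH,USA,39.998,-82.892
IND,Indianapolis International,Indianapolis,IN,USA,39.717,-86.294
MDW,Chicago Midway,Chicago,IL,USA,41.786,-87.752
ORD,Chicago O'Hare International,Chicago,IL,USA,41.979,-87.904"

# With the real download you would pass the file path as the first argument
# instead of using text = ...
airportsDF <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Number of rows, then number of columns (4 and 7 for this stand-in).
dim(airportsDF)
```

On the full file, `dim()` should report the 3,376 rows and 7 columns mentioned in the video.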
Let’s go take a look at the head of that data frame. So I’ve got the airport code, the name of the airport, the city and state, the country, the latitude and longitude. And if these IATA codes don’t look familiar here, let’s go look at, say, the first 100 of them. Okay, maybe you wanna dive further into the list to find the ones that you’re familiar with. There are more than 3,000 of them altogether. Let’s go look in the airports data frame and see if we can find, for instance, where the IATA code is equal to IND. Okay, I’ll put that in there for the row and then I’ll leave the column blank, right?
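A rough sketch of those inspection steps, using a small hand-built data frame in place of the real one. The name `airportsDF` and the column names are assumptions, and only four columns are included to keep the example short.

```r
# Hand-built stand-in for the airports data frame (assumed column names).
airportsDF <- data.frame(
  iata    = c("CMH", "IND", "MDW", "ORD"),
  airport = c("Port Columbus International", "Indianapolis International",
              "Chicago Midway", "Chicago O'Hare International"),
  city    = c("Columbus", "Indianapolis", "Chicago", "Chicago"),
  state   = c("OH", "IN", "IL", "IL"),
  stringsAsFactors = FALSE
)

# First few rows of the data frame.
head(airportsDF)

# Up to the first 100 airport codes (only 4 exist in this stand-in).
head(airportsDF$iata, 100)

# Row index: a logical test on the code. Column index: left blank,
# which keeps every column.
ind <- airportsDF[airportsDF$iata == "IND", ]
ind
```

The comma inside the square brackets matters: the part before it selects rows and the empty part after it means "all columns".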
So I have to have two indices to go into the airports data frame. Okay, there’s the information about Indianapolis. Or if I wanna see IND and O’Hare and Midway, which I always like because it’s got my initials as the airport code, there you go. There’s the information for those three airports. Okay, that gives you a sense of things. So, what I’m gonna do now is I’m gonna take the airport names and the city and the state and I’m gonna paste them together into a new vector here, okay? I’ll just call it w because it’s just a temporary vector; I don’t need it in the long run.
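The three-airport lookup could be written in more than one way, and the transcript does not say which form was typed. One option, shown here as an assumption, uses `%in%` on a hand-built stand-in data frame (`airportsDF` and its column names are also assumed).

```r
# Hand-built stand-in for the airports data frame (assumed column names).
airportsDF <- data.frame(
  iata    = c("CMH", "IND", "MDW", "ORD"),
  airport = c("Port Columbus International", "Indianapolis International",
              "Chicago Midway", "Chicago O'Hare International"),
  city    = c("Columbus", "Indianapolis", "Chicago", "Chicago"),
  state   = c("OH", "IN", "IL", "IL"),
  stringsAsFactors = FALSE
)

# Keep the rows whose code is any one of the three; all columns are kept.
three <- airportsDF[airportsDF$iata %in% c("IND", "ORD", "MDW"), ]
three
```

Rows come back in the order they appear in the data frame, not in the order the codes were listed.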
I’m gonna go look at the airport name and at the city and the state, and I’m gonna set the separator here. Now, instead of just the default space, I’m gonna put a comma and a space in there for the separator. Okay, so how does the head of w look? It looks like that, that’s kind of nice. Okay, so we made a vector to store the airport name, city and state. And I’m gonna make the names of w be those airport codes, like IND, MDW and so on. So I’m gonna make that airportsDF$iata. So now if I go look at the head of w, each entry has its airport code as the name.
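The paste-and-name steps can be sketched like this, again on a hand-built stand-in. The names `airportsDF`, `airport`, `city`, `state` and `iata` are assumptions about the data frame in the video; `w` is the temporary vector the video describes.

```r
# Hand-built stand-in for the airports data frame (assumed column names).
airportsDF <- data.frame(
  iata    = c("CMH", "IND", "MDW", "ORD"),
  airport = c("Port Columbus International", "Indianapolis International",
              "Chicago Midway", "Chicago O'Hare International"),
  city    = c("Columbus", "Indianapolis", "Chicago", "Chicago"),
  state   = c("OH", "IN", "IL", "IL"),
  stringsAsFactors = FALSE
)

# Paste name, city and state together element by element,
# separated by a comma and a space instead of the default single space.
w <- paste(airportsDF$airport, airportsDF$city, airportsDF$state, sep = ", ")

# Label each entry of w with its airport code.
names(w) <- airportsDF$iata

head(w)
```

`paste()` works element by element, so entry i of `w` combines row i of each column, and assigning `names(w)` keeps that same row order.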
So I can now go look at w and ask it for IND, and ORD and MDW, and any others that I want. And I’ll get the same information, I’ll get the name and the city and the state, but I’ll also have the code there and I can go use the codes as indices into that vector. Okay, because the names of the entries that are stored there are now the airport codes. That’s the neat thing that I did here: we made the name of each entry in the vector be the three-letter airport code itself, okay?
So, if I wanna know, for instance, just the airport that’s got the abbreviation CMH, I can just go put that in there and find out that that’s Port Columbus International in Columbus, Ohio. It’s kind of a neat trick. That’s one way that indexing in R is very powerful. I encourage you to try it, and see if you can understand what’s going on there and how we’ve used indices to our advantage in R.
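One way to try the name-based lookup yourself is sketched below. The vector `w` is typed in by hand here as a stand-in for the one built from the airports data, so its contents are illustrative only.

```r
# Hand-typed stand-in for w: values are "name, city, state",
# and each entry is named by its airport code.
w <- c(
  CMH = "Port Columbus International, Columbus, OH",
  IND = "Indianapolis International, Indianapolis, IN",
  MDW = "Chicago Midway, Chicago, IL",
  ORD = "Chicago O'Hare International, Chicago, IL"
)

# Index by name: pass a character vector of codes instead of positions.
some <- w[c("IND", "ORD", "MDW")]
some

# A single code works the same way.
w["CMH"]
```

Unlike the logical test on the data frame, name-based indexing returns entries in the order the codes were requested.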
This article is from the free online course Introduction to R for Data Science.
