This content is taken from the Purdue University & The Center for Science of Information's online course, Introduction to R for Data Science.
4.12

Purdue University

Skip to 0 minutes and 12 seconds Almost nothing that I did in the last video was specific to Indiana. So now let’s build a function that, for a state given by the user, will identify all of the airports in the 2006 and 2008 data set that have commercial flights from that state. Okay, and we’ll start with IN as the example state, because we know how that should come out. Okay, so I’m gonna first of all just make my state be IN. I’m gonna go back through the essential parts of what I did up above and tease out what I really need here. So one thing I needed was to look in my information about the airports.

Skip to 1 minute and 0 seconds And find just the airports that were in my state, in the state of Indiana. So I’m gonna copy that line and paste it down here, and I’m gonna change this from Indy airports to my airports, because I want to build something generic that can work for any state. Instead of specifying that the state was IN, I’m gonna call that mystate. Okay, so now I’ve got those airports. If I wanna go take a look at them, I can. We’ve done this before; we see these are all the airports in Indiana, many of which don’t have commercial flights, as we saw.
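The step just described might look roughly like the sketch below. The transcript does not show the code, so the data frame `airports`, its columns `iata`, `city`, and `state`, and the spelling `myairports` are assumptions made for illustration, and the rows are a tiny toy stand-in for the real airport data (`ZZ1` is a made-up code for an airport with no commercial flights).

```r
# Toy stand-in for the airports data frame (hypothetical rows and column names).
airports <- data.frame(
  iata  = c("IND", "FWA", "ZZ1", "ORD", "CMH"),
  city  = c("Indianapolis", "Fort Wayne", "Smalltown", "Chicago", "Columbus"),
  state = c("IN", "IN", "IN", "IL", "OH"),
  stringsAsFactors = FALSE
)

# Store the state in a variable instead of hard-coding "IN",
# so the same line works for any state.
mystate <- "IN"

# Keep only the rows whose state matches mystate.
myairports <- subset(airports, state == mystate)
myairports
```

Changing `mystate` to another abbreviation and rerunning the `subset` line is all it takes to look at a different state.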

Skip to 1 minute and 32 seconds Now let’s go back and look at the table that we built from this. We want to do something similar to that. We want to take all of those airport codes, the three-letter codes, their IATA codes, from this data frame, make all of those into characters, and then use that as an index back into another table that we’re gonna build. And what table are we going to build? We’re going to go into our data set and build a table of all of the commercial flights for each airport. For each airport, how many flights were there? And then we can go take a look. For instance, before we do that, let’s just test and make sure this is gonna work, okay?

Skip to 2 minutes and 11 seconds Let’s say, for instance, that we wanna be looking at Illinois and we wanna find O’Hare. Or we want to be looking at Ohio and we want to find Columbus. Okay, now we want to put in all of the airports for our state and go see what comes out. For instance, since right now our state is Indiana, when we look at V, we’re just going to find those four airports we already identified earlier. And then the last thing that we did was take a subset of information from our airport data frame, and find which airports those codes corresponded to. In our case, it’s just going to be those four airports from Indiana.
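A minimal sketch of the table lookup described above, again on toy data. The vector `origins` stands in for the origin-airport column of the flight data, and all names here are assumptions rather than the code used in the video. Instead of indexing the table with every code from the state, this variant first keeps only the codes that actually appear in the table, so it does not depend on how missing names are handled.

```r
# Toy stand-ins (hypothetical data): a few airports and a few flight origins.
airports <- data.frame(
  iata  = c("IND", "FWA", "ZZ1", "ORD", "CMH"),
  state = c("IN", "IN", "IN", "IL", "OH"),
  stringsAsFactors = FALSE
)
origins <- c("IND", "ORD", "ORD", "CMH", "IND", "FWA", "ORD")

# One count per airport code: how many flights left from each airport.
flightcounts <- table(origins)

# Quick checks by name, as in the O'Hare and Columbus examples.
flightcounts["ORD"]
flightcounts["CMH"]

# All airport codes for the chosen state, as characters.
mystate <- "IN"
codes <- as.character(subset(airports, state == mystate)$iata)

# Keep only the codes that appear in the flight table, then look them up.
active <- codes[codes %in% names(flightcounts)]
v <- flightcounts[active]
v

# Go back to the airports data frame for the full rows of those airports.
subset(airports, iata %in% active)
```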

Skip to 2 minutes and 50 seconds So now, here, let’s just say it succinctly: this summarizes what we did in the previous video.

Skip to 3 minutes and 2 seconds Now we’re going to wrap that into a function that lets us do that for any state. And the only thing we’ve got to change here is just what my state is. Okay, so I am going to copy these. This is the first time we’ll have built a function in R, and it’s just something good to know how to do. So let’s make our function be called active airports. Okay, this is gonna find all the active airports in our state. We send it whatever state we want, we put open braces, and in here we put the code. Okay, so let’s look at what is going on.

Skip to 3 minutes and 35 seconds This is the name of my function. The function I am creating is called active airports, because that is the information it finds. It only needs one input, namely the state to look in. And what does it do? It extracts the information about the airports in that state, finds the airport codes, and translates those codes into characters. It then looks in our data set for the airports with those codes. And once it finds those airports in our data set where there are commercial flights, it comes back to the airports data frame again and gets all the information we know about those airports.
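Put together, a function along these lines could look like the sketch below. It is not the code from the video: the spelling `activeairports`, the data frames `airports` and `flights`, and the columns `iata`, `state`, and `Origin` are all assumptions, and the data is a toy stand-in. The `%in%` test is used here in place of indexing the table directly.

```r
# Toy stand-ins for the course data (hypothetical rows and column names).
airports <- data.frame(
  iata  = c("IND", "FWA", "ZZ1", "ORD", "CMH"),
  state = c("IN", "IN", "IN", "IL", "OH"),
  stringsAsFactors = FALSE
)
flights <- data.frame(
  Origin = c("IND", "ORD", "ORD", "CMH", "IND", "FWA", "ORD"),
  stringsAsFactors = FALSE
)

# One input: the state to look in.
activeairports <- function(mystate) {
  # Airports located in the requested state.
  myairports <- subset(airports, state == mystate)
  # Their three-letter codes, as characters.
  codes <- as.character(myairports$iata)
  # Number of flights per origin airport in the flight data.
  flightcounts <- table(flights$Origin)
  # Codes from this state that actually appear in the flight data.
  active <- codes[codes %in% names(flightcounts)]
  # Full airport information for just those airports.
  subset(airports, iata %in% active)
}

activeairports("IN")
activeairports("IL")
```

Because the last expression in the body is the value of the function, no explicit `return()` is needed. A state with no matching flights simply gives back a data frame with zero rows.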

Skip to 4 minutes and 21 seconds And ignores all the rest of them. Okay, so if I want to define a function in R, I have to highlight all of it and then run it all at once. Okay, let’s try active airports, first of all just putting in IN, because we think that should work for the Indiana airports. Seems to work well for the Indiana airports. Now let’s try it, for instance, for Illinois. How about Illinois? Does it work? It sure does. Look, for instance, there is Chicago O’Hare, which we know. Okay, and Chicago Midway is there. And there are several other airports in the state of Illinois that do have commercial flights, and we find them all. What about California?

Skip to 5 minutes and 13 seconds There are a lot in California that actually have active flights. You notice there are not 200 there, just like in Indiana there weren’t 60-some airports with commercial flights. Many of the airports in California don’t have commercial flights, but when we look at the list, the ones we recognize are ones that have commercial flights. And now we’ve got a function that can automate this for us. Pretty neat.