Download Airline Data on Windows

This screencast demonstrates how to download the data that will be used in this course.
So before we go discussing the data that we’re gonna work with directly, you’re gonna notice the first time you download the data that you don’t have a program that can uncompress it once it’s downloaded. So let’s go get an uncompression program first. You might already have one but chances are you don’t. For instance, let’s go to www.7-zip.org. This is one place you can get an uncompression program that’ll be able to uncompress the file we’re gonna download. So I’m gonna click on the 64-bit downloader here. I’m gonna save that file.
And I’m gonna install this app in my program files. Again, you may have a different one that you prefer, but you need something that can open compressed files, including the .bz2 format that this data comes in.
So once you’ve got that downloaded and installed, we can go take a look at the data expo file that we’re gonna work with. You can just search for data expo 2009 in Google if you like, or if you want to go directly to the website you can put in stat-computing.org/dataexpo/2009. Now the data expo is sponsored by the American Statistical Association, and they have data expos every few years. They’re not offered every single year, but every two or three years, a data expo is offered. And the data expo from 2009 talks about airline on-time performance. It’s got about 29 variables from many, many, many flights. Gigabytes and gigabytes of data here. We’re just gonna download one year’s worth today, so that you can see how to open the data inside R.
So once you’ve gotten to the data site for the data expo 2009, you can click on the link to the data here and you can download the data for a particular year. You’re gonna notice that the data isn’t in plain CSV format, it’s in a compressed bz2 file. So the file we’re gonna download will be called 2008.csv.bz2, and that uncompression program is going to let us uncompress the file so we just have a regular CSV file. But let’s click on the 2008 file and save it.
You’ll notice this is large data. It’s 100 megabytes even when it’s compressed. So it might take just a little time to download to your computer.
I’ve already downloaded it once earlier, and that’s why mine says 2008 with a one in parentheses afterwards, because I’ve already got a 2008.csv.bz2 file in that same directory.
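If you’d like to check what’s inside the download before extracting anything, here is a minimal Python sketch that reads just the header and the first few rows straight from the compressed file. It only uses Python’s standard library. The function name peek_bz2_csv is made up for this illustration, and the UTF-8 encoding is an assumption about the file.

```python
import bz2
import csv
from itertools import islice


def peek_bz2_csv(path, n_rows=3):
    """Return (header, first n_rows data rows) of a bz2-compressed CSV.

    Hypothetical helper for illustration; not part of the course material.
    """
    # Text mode ("rt") decompresses on the fly, so only the first few
    # lines are read instead of the whole file.
    # Assumption: the file is UTF-8 (or plain ASCII) text.
    with bz2.open(path, mode="rt", encoding="utf-8",
                  errors="replace", newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)               # first line: column names
        rows = list(islice(reader, n_rows))  # next n_rows lines of data
    return header, rows
```

For example, calling peek_bz2_csv("2008.csv.bz2") from the folder that holds the download returns the list of column names and three sample rows, and len() of the header tells you how many variables the file has.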
Okay, our file has been downloading, so I’m gonna go to that folder. Gonna click on the 2008.csv.bz2 file. And I’m right-clicking now, I’m gonna click on the right mouse button and I’m gonna open with 7-Zip. Okay, I’m gonna open that archive. There’s the CSV file that I want. So I’m gonna single click on it and extract it.
It says Copy to my Downloads folder, that’s great. It’ll show you the progress as it’s uncompressing. See, if I just downloaded the raw CSV file over the web, it would take a lot of time, and so that’s why it comes in a compressed format.
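The same extraction step can also be done without a separate program. This is a rough sketch using Python’s standard bz2 module, not the method shown in the video. The function name decompress_bz2 and the file names in the usage note are assumptions for illustration.

```python
import bz2
import shutil


def decompress_bz2(src, dest, chunk_size=1024 * 1024):
    """Stream-decompress the .bz2 file at src into a plain file at dest.

    Hypothetical helper for illustration. Data is copied in chunks, so
    the whole file never has to fit in memory at once.
    """
    with bz2.open(src, "rb") as compressed, open(dest, "wb") as plain:
        shutil.copyfileobj(compressed, plain, chunk_size)
    return dest
```

Assuming the download is in the current folder, decompress_bz2("2008.csv.bz2", "2008.csv") would write the regular CSV file next to it.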
Okay, so the file is uncompressed now. Let’s go compare here. If I single-click on the compressed file that I had and leave my mouse over it, it’ll show me that that file is 108 MB. If I click on the file I’ve just uncompressed there and mouse over it, it’ll show me that’s 657 MB, more than half a gigabyte of data, just for the 2008 airline flights that we’re going to work with. It’s pretty impressive that compressing the file shrank the download by a factor of more than six.
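The factor of more than six comes straight from the two sizes quoted above. As a small worked example in Python (the function name compression_ratio is made up for this sketch):

```python
def compression_ratio(compressed_size, uncompressed_size):
    """Uncompressed size divided by compressed size (same units for both)."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return uncompressed_size / compressed_size


# Sizes quoted in the video, in megabytes.
ratio = compression_ratio(108, 657)
print(f"{ratio:.2f}x larger once uncompressed")   # prints "6.08x larger once uncompressed"

# Share of the download avoided by fetching the compressed file.
saved = 1 - 108 / 657
print(f"{saved:.0%} fewer bytes to download")      # prints "84% fewer bytes to download"
```

To check this against your own files rather than the quoted figures, os.path.getsize from Python’s standard library returns each file’s size in bytes, which can be passed to the same function.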
Download the data set here: http://stat-computing.org/dataexpo/2009/
In the comments below, describe what type of data you are most interested in learning about in terms of airline flights (e.g. how many flights landed in Chicago over the course of a year).
This article is from the free online course Introduction to R for Data Science, created by FutureLearn.
