
How to structure data in CSV files

CSV is a common and simple data structure that works well and offers the ability to store complex data in a way that is simple to understand.

Most of us have used a spreadsheet to record data. It is one of the most basic tools used for analysing financial and scientific data. But have you ever used comma-separated values (CSV) files?

CSV files store data in a structure that can be used (read and written to) by a range of different spreadsheet software packages. Python programmes can also interact with CSV files; you can use a Python programme to save data in a CSV file, and later open that same CSV file in a spreadsheet. You can also use the data from a CSV file directly in Python to visualise data.
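As a rough illustration of that round trip, the sketch below uses Python's built-in csv module to save a few rows and read them back. The helper names save_rows and load_rows, the file name example.csv, and the temporary folder are assumptions made for this example, not part of the course material.

```python
import csv
import tempfile
from pathlib import Path


def save_rows(path, rows):
    """Write each row (a list of values) as one comma-separated line."""
    # newline="" lets the csv module control the line endings itself
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        csv.writer(csv_file).writerows(rows)


def load_rows(path):
    """Read the file back as a list of rows, each a list of strings."""
    with open(path, newline="", encoding="utf-8") as csv_file:
        return list(csv.reader(csv_file))


# Example data: a header row followed by two records
rows = [["Name", "Food"], ["Sarah", "Cheese"], ["Dexter", "Chicken"]]

# A temporary folder keeps the example from leaving files behind
with tempfile.TemporaryDirectory() as folder:
    csv_path = Path(folder) / "example.csv"  # hypothetical file name
    save_rows(csv_path, rows)
    raw_text = csv_path.read_text(encoding="utf-8")  # what a text editor would show
    loaded = load_rows(csv_path)

print(raw_text)
print(loaded)
```

A file saved this way could then be opened in a spreadsheet application, since it is plain text with one record per line.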

What is a CSV file?

Here’s a CSV file opened in a text editor:

Image of a text editor, Notepad.exe, showing the contents of a CSV file. Each line shows two values, a name and a type of food, separated by a comma, e.g. Sarah,Cheese

CSV files contain data structured so that a comma separates individual items in the file, and each record is on a new line of the file.

In the image above there are two columns, Name and Food, and on each row there is the name of a person and their favourite food.
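To make the structure concrete, here is a minimal sketch that splits CSV text into records and items by hand. The text in csv_text is assumed to match the file in the image, based on the names and foods used in this article. Splitting on commas like this only works for simple data; values that themselves contain commas need a proper CSV parser such as Python's csv module.

```python
# CSV text assumed to match the file shown in the image
csv_text = """Name,Food
Sarah,Cheese
Dexter,Chicken
Li,Burgers
Amy,Ice cream
Les,Lasagne"""

# Each line is one record; each comma separates two items in that record
records = [line.split(",") for line in csv_text.splitlines()]

# The first record holds the column names, the rest hold the data
header, rows = records[0], records[1:]

for name, food in rows:
    print(f"{name} likes {food}")
```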

In a spreadsheet application (such as Excel or Google Sheets), use this data to create a spreadsheet.

Name    Food
Sarah   Cheese
Dexter  Chicken
Li      Burgers
Amy     Ice cream
Les     Lasagne

How to export a spreadsheet as CSV

Saving a spreadsheet as a CSV file is straightforward, and most spreadsheet applications have this feature. You can usually find the option in the File menu. Save the file as foods.csv.

Image of a spreadsheet application, Google Sheets, with the option Download > Comma Separated Values selected from the File menu

Open foods.csv in a text editor and take a look at the contents.

You can see that the structure of the raw CSV file is the same as the data in the spreadsheet. The first row holds the field names, which identify what each column refers to. On every row, a single comma sits between the items. The comma is a delimiter: a character that splits the data into columns, here separating each person's name from their favourite food. Each new line marks the start of a new record, so the next person's data begins on the next line.
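The sketch below shows one way Python can use the field names and the delimiter when reading this kind of data. It assumes foods.csv contains the names and foods from the spreadsheet above; the text is held in a string (foods_text, a name chosen for this example) so the code runs without the file being present.

```python
import csv
import io

# Assumed contents of foods.csv, based on the spreadsheet in this article
foods_text = (
    "Name,Food\n"
    "Sarah,Cheese\n"
    "Dexter,Chicken\n"
    "Li,Burgers\n"
    "Amy,Ice cream\n"
    "Les,Lasagne\n"
)

# DictReader treats the first row as field names and splits each
# following row on the delimiter (a comma is also the default)
reader = csv.DictReader(io.StringIO(foods_text), delimiter=",")

# Look up each person's favourite food by column name
favourites = {row["Name"]: row["Food"] for row in reader}
field_names = reader.fieldnames

print(field_names)
print(favourites["Amy"])
```

To read the exported file itself, io.StringIO(foods_text) could be swapped for a file opened with open("foods.csv", newline="").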

CSV is a common and simple data structure that works extremely well and offers the ability to store complex data in a way that is simple to understand.

This article is from the free online

Programming 103: Saving and Structuring Data

Created by
FutureLearn - Learning For Life
