
The history of Data Science

This article looks at the history of data science, its rise within educational institutions, and the key contributors to the field.

Applied mathematics and statistics are not new concepts.

Governments, councils and organisations have been using them for centuries. The Romans used censuses to gather information about their populations, and statisticians have long used sampling to derive information about an overall population from a small set of data. Since the 17th century, insurance companies have used actuarial science to determine the premium for a policy.

Forecasting and estimating methods have also long been used by governments and organisations alike. Exit polls, for example, give the wider population an early prediction of election results. However, in the last decade, a new science, data science, has rapidly emerged in organisations and academia.

What is data science?

Data science can be defined as the combination of applied mathematics and statistics, computer science and domain expertise to derive useful information and insights from data.

According to Hal Varian, the Chief Economist at Google and a UC Berkeley professor, ‘the ability to take data—to be able to understand it, to process it, to extract value from it, to visualise it, to communicate it—that’s going to be a hugely important skill in the next decades.’

The evolution of data science

In 50 Years of Data Science, Donoho explores the evolution of data science and lists the key contributors to the field, including John Tukey, a renowned mathematician who contributed to the early development of applied mathematics and statistics in academia and industry. In the late 1960s, Tukey called for reform in academic statistics.

In The Future of Data Analysis (1962), Tukey recognised the existence of a new science that deals with data analysis and learns from data.


Over the past 20 years, mathematicians and statisticians such as John Chambers, Bill Cleveland, Leo Breiman and Jeff Wu have emphasised and contributed to the development of data science, pushing its boundaries beyond statistics.

Notably, John Chambers led the development of the S programming language, the precursor to R, an open-source tool that revolutionised the practice of data science. Chambers also emphasised data preparation and presentation.

Breiman (University of California, Berkeley) contributed to the development of classification and regression trees, which paved the way for various machine learning techniques.

He also focused on generating predictions from data rather than merely using data to draw inferences. Cleveland and Wu helped conceptualise data science as a discipline; DJ Patil of LinkedIn and Jeff Hammerbacher of Facebook are credited with coining the term ‘data scientist’ and popularising it in the wider industry.

Universities and data science

Data science has developed over the last 50 years; however, the last decade has seen a phenomenal rise in the field. As recently as 2012, formal university programmes in data science were rare.

Today, most universities have dedicated courses on data science. Further, its concepts and applications are taught in fields as diverse as the humanities, agriculture and environmental studies.

Specialised data science departments

Today, most organisations have specialised departments dedicated to data, and new roles, such as Chief Data Officer (CDO), Chief Data Scientist, Head of Data and Insights and Data Scientist, are extremely popular.

In 2012, a Harvard Business Review article entitled ‘Data Scientist: The Sexiest Job of the 21st Century’ declared data science the most desirable career of the era.

According to Glassdoor’s ‘Best Jobs in America’ list (2020), data scientist ranks third among the 50 best jobs in America today. So, what led to this exponential growth?



Data science is the new water

A decade ago, data was described as the new oil; today, it is referred to as the new water. Advancements in technology in the digital age have contributed to the exponential growth of data science.

The miniaturisation and commoditisation of computers have further fuelled this growth. Every digital system today produces terabytes, even petabytes, of data. IDC research predicted that the collective sum of the world’s data will grow from 33 zettabytes (ZB) in 2018 to 175 ZB by 2025, a compound annual growth rate of 61 per cent (Patrizio, 2020).
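Growth figures like IDC’s are usually expressed as a compound annual growth rate (CAGR): the constant yearly rate that carries a starting value to an ending value over a given number of years. A minimal sketch of the standard formula, with an illustrative helper name:

```python
def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate: the constant yearly rate that
    takes `start` to `end` over `years` years."""
    return (end / start) ** (1 / years) - 1

# For example, quadrupling over two years implies 100 per cent growth per year:
print(f"{cagr(100, 400, 2):.0%}")  # prints "100%"
```

Note that reported CAGR figures depend on the baseline year and data set the analyst uses, so published rates may differ from a naive two-point calculation.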

Gordon Moore, a co-founder of Intel, observed that computational power doubles roughly every two years, an observation now known as Moore’s law.

This is achieved by doubling the number of transistors in a microprocessor circuit, a process enabled by miniaturisation. To put things in perspective, our smartphones today have more computational power and storage than a personal computer from 20 years ago.
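The doubling described above can be sketched in a few lines. This is an illustration, not a precise model: the 2,300-transistor starting point (the Intel 4004 of 1971) and the exact two-year period are illustrative assumptions.

```python
def transistor_count(years_elapsed: float, initial: int = 2300,
                     doubling_period: float = 2) -> float:
    """Project a transistor count, assuming it doubles every
    `doubling_period` years from an `initial` count."""
    return initial * 2 ** (years_elapsed / doubling_period)

# Ten doublings fit into 20 years, so the count grows by a factor of 2**10 = 1024:
print(transistor_count(20) / transistor_count(0))  # prints 1024.0
```

The factor-of-1,024 jump over just 20 years is why a modern smartphone outperforms a decades-old personal computer.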

© Torrens University
This article is from the free online course Introduction to Digital Transformation: Understand and Manage Digital Transformation in the Workplace.
