Plan the data analytics phases

Video on planning the data analytics phases
(gentle music) In this video, we will look at the phases analysts typically follow when conducting a detailed data analytics procedure. The first step is to set your objectives. These are the questions you want answered, or the outcomes you want from the analysis, and they are based on some initial questions. Some examples across different domains might be: “Which of our customers are most profitable? Which of our webpages draw the most traffic? What is the cause of changing weather conditions?” The related objectives we set would be: “Determine if the number of customers is related to the brand they buy. Determine if the number of visits per page is related to customer interests.”
The data analytics work itself begins by gathering any data that might be relevant. You can make use of this tool to convert online tables, such as those from Wikipedia, into a CSV. In an actual analytics project, getting the data straight from the official source or sources is much better.
Next up is preparation. In this phase we prepare the data, that is, transform it into the right format or formats. For example, for our scenario in this topic, some information about land area was in hectares and some in square kilometres. This has been normalised so that all area data in the CSVs is in square kilometres. In Python, you can do this while loading the data, by dividing the area in hectares by 100 to get the area in square kilometres. You’ll also see during exploration that we interpolate or extrapolate some of the data for years that we don’t have records for.
Another important step is modelling. This encompasses exploring the data and discovering relationships. During this phase we can also interpolate, clean, or aggregate the data to get it into the right shape. For example, we will calculate the mean number of fires per year for each state as part of the modelling phase. We can then use this data to look for patterns and relationships.
The last phase is communication. After we’ve modelled our data and discovered relationships, or the lack thereof, we should communicate our findings to stakeholders. This will involve writing about our methodology, listing any assumptions, creating visualisations, and giving our conclusions and further recommendations. Sometimes we can’t draw any conclusions from our data sources, and this is okay. It’s better to say that we couldn’t find a conclusive link between parameters than to massage the data to fit a preconceived idea.
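
The preparation and modelling steps described in the video can be sketched using only the Python standard library. This is a minimal illustration, not the course's own code: the column names (`state`, `year`, `area`, `area_unit`, `fires`), the sample values, and the helper function names are all assumptions invented for the example. It converts hectares to square kilometres while loading, linearly interpolates a missing yearly fire count, and then computes the mean number of fires per year for each state.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample data: the columns and values are invented for
# illustration and do not come from the course dataset.
SAMPLE_CSV = """state,year,area,area_unit,fires
A,2001,50000,ha,10
A,2002,600,km2,
A,2003,700,km2,30
B,2001,1200,km2,4
B,2002,130000,ha,6
"""

HECTARES_PER_KM2 = 100  # 1 square kilometre = 100 hectares


def load_rows(text):
    """Preparation: parse the CSV and normalise every area to square kilometres."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        area = float(raw["area"])
        if raw["area_unit"] == "ha":
            area /= HECTARES_PER_KM2
        # An empty field means there is no record for that year.
        fires = float(raw["fires"]) if raw["fires"] else None
        rows.append({"state": raw["state"], "year": int(raw["year"]),
                     "area_km2": area, "fires": fires})
    return rows


def interpolate_fires(rows):
    """Fill a missing fire count linearly from the nearest known years in the same state."""
    by_state = defaultdict(list)
    for row in rows:
        by_state[row["state"]].append(row)
    for state_rows in by_state.values():
        state_rows.sort(key=lambda r: r["year"])
        known = [r for r in state_rows if r["fires"] is not None]
        for row in state_rows:
            if row["fires"] is not None:
                continue
            before = [k for k in known if k["year"] < row["year"]]
            after = [k for k in known if k["year"] > row["year"]]
            if before and after:  # interpolate only; gaps at the edges stay empty
                lo, hi = before[-1], after[0]
                t = (row["year"] - lo["year"]) / (hi["year"] - lo["year"])
                row["fires"] = lo["fires"] + t * (hi["fires"] - lo["fires"])
    return rows


def mean_fires_per_state(rows):
    """Modelling: mean number of fires per year for each state."""
    counts = defaultdict(list)
    for row in rows:
        if row["fires"] is not None:
            counts[row["state"]].append(row["fires"])
    return {state: mean(values) for state, values in counts.items()}


rows = interpolate_fires(load_rows(SAMPLE_CSV))
means = mean_fires_per_state(rows)
print(means)  # {'A': 20.0, 'B': 5.0}
```

In this sketch, interpolated values are kept in the same rows as recorded values. In a real project it would be worth flagging them, so that the assumptions can be listed when communicating the findings.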

Watch the video above to revisit and explore the phases of data analytics.

If you already know the CRISP-DM method, from the previous part of this course or elsewhere, the phases explained in the video will serve as a refresher. If not, we will look at some of the stages later this week as well.

Revisiting the phases of data analytics will give you an overview of the data exploration we will learn more about this week.

This article is from the free online

Data Visualisation with Python: Bokeh and Advanced Layouts

Created by
FutureLearn - Learning For Life
