It’s a great pleasure to be here today with Michael Batty, the Bartlett Professor of Planning at UCL, and you’re also the Chair of the Centre for Advanced Spatial Analysis. Thank you very much for taking the time. Can you explain to us what CASA, the Centre for Advanced Spatial Analysis, is? Well, we were set up about 20 years ago to develop geographic information systems technologies. GIS is software that relates to mapping, and UCL really didn’t have a research presence in this area, so a group of academics in different departments, mainly departments such as transport, built environment, geomatics, and so on, decided to set up this centre. So our mandate really was to develop research in this area.
And since then, we’ve developed very strongly. We have a very strong visual emphasis, so everything we do is communicated through maps, virtual realities, augmented realities, and so on. That’s grown ever bigger in the last 20 years, as computers and the technologies in this area have generally improved. We also do a lot of simulation and modelling, so it’s very cities-focused. We’re by no means just GIS, but we’re very cities-focused, and you could say that we deal with computer techniques and technologies in relation to cities. This sounds really fascinating, Mike. I’ve had the pleasure of visiting you several times, and I even spent some time at CASA in the past.
So can you explain to the audience what kinds of data sources you are collecting and how these are actually turned into applications? Yes. Geographic information systems technologies really depend on map data, so the biggest data sets that historically we’ve used tend to be automated map layers, for example, the geometry of the environment. The Ordnance Survey in Britain, for example, has for roughly the last 25 years coded the streets, the land parcels, digital terrain models, and things of that sort. So to some extent, we take those data for granted. That’s the kind of physical context.
The other big data source historically has been the gold standard of social and economic data sets, the population census. Most western countries have a population census that’s carried out every 10 years. It’s essentially an exhaustive, full sample of the population and its characteristics. In the last 5 or perhaps 10 years, lots of new data sets, collected largely by the public sector but some by the private sector, have actually come on-stream, such as house price data, employment data, and things of that sort. Now, that traditionally is the data we’ve dealt with.
Much more recently, within the last 5 years, one of the main ways in which our world has begun to change is the embedding of sensors into the environment, which are actually sensing what’s happening. These sensors can either be physical sensors, feeding into computers what’s actually happening in terms of weather, pollution, and things of that sort, or they can be sensors activated by the population, such as smart cards of various sorts linked to transport systems. Now, the thing about this data is that it’s largely streamed in real time. It’s captured in real time and streamed in real time, so the temporal dimension has come to complement and even outweigh the spatial dimension.
And so consequently the data is very big. The data is big because it lasts as long as the sensor is delivering it. In that particular context, this is why we tend to refer to that kind of data as big data. The precise definition of big data is a little blurred, but we tend to think of big data as streamed data in some sense, either from devices embedded in the environment streaming it or from devices activated by the population. This sounds really, really exciting, and I know you’re doing really fantastic work at CASA.
Can you give us one example you are most excited about, in terms of how this is changing the scientific understanding of cities? Yeah. One of the big areas where big data is streamed relates to transport, and in particular, the project that we’re most excited about, I think, is our work with the public transport data in London. As you probably know, most people, probably 85% of people, using public transport in Greater London, that is, the GLA area, are using some kind of smart card to activate their travel.
So if you take the Tube, for example, the so-called Oyster card is a stored-value card: you put money onto it, and you tap in when you enter the system and tap out when you leave. In other words, your Oyster card provides a time stamp of where and when you actually start and end your travel. So this is delivering lots of interesting data about the demand for travel. Now, that pertains to the Tube system, which sees something like seven million tap-ins and tap-outs per day, and the bus system, with about nine million tap-ins. Buses are perhaps bigger than the Tube in that sense, and then there is overground rail as well.
You can use these cards on the overground rail. So these are very big data sets where every single transaction, every time a person travels, normally generates two records, the tap-in and the tap-out, for the journey, and this data is simply streamed. It’s not streamed in real time; it’s captured in real time, but we get it as a data set from Transport for London. They basically deliver it to us. Now, the great thing about that data is that it enables us to do things we’ve never been able to do before. If there’s a disruption on the transport system, you can actually see that in the data.
We can calculate how many people are affected and how they’re affected. For example, we have three months’ data for last year, which is round about a billion tap-ins and tap-outs, and those three months span the Olympics. And in July, we looked at one event where the Circle line went down for four hours, and we could calculate that 1.23 million Oyster card users were actually affected by this closure.
And they’re affected either by spending more time travelling on the system and having to divert around the disruption, or in fact by switching to buses, so that’s the kind of thing we can actually measure from this data.
Interview with Michael Batty
How do you analyse a billion records of Oyster card tap-ins and tap-outs?
Tobias talks to Michael Batty, Chair of the Centre for Advanced Spatial Analysis at UCL, about how work at CASA is beginning to focus on large data sets streamed from huge numbers of sensors in real time, describing everything from the weather and pollution in a city environment, to how its inhabitants are moving from one location to another.
Michael Batty is Bartlett Professor at University College London, where he is Chair of the Centre for Advanced Spatial Analysis (CASA), and a Fellow of the Royal Society. He was awarded a CBE in the Queen’s Birthday Honours in 2004 for services to geography. In 2013, he received the Lauréat Prix International de Géographie Vautrin Lud, often described as the Nobel Prize for Geography.
An ‘Oyster’ card is a smartcard which can hold pay as you go credit, Travelcards and Bus & Tram season tickets in and around London.
You can watch the whole of Tobias’ interview with Michael on YouTube (24:01).
© Warwick Business School, The University of Warwick