I’m David Wallom. I’m an Associate Professor here in the University of Oxford’s e-Research Centre, and lead two different research groups: Energy and Environmental Informatics, and Advanced e-Infrastructure and Cloud Computing. Open data itself is actually more a philosophy. It’s this idea that data should be freely, easily accessible without any restrictions on its reuse, on its sharing, or basically anything that you want to do to it. Open data should be findable, accessible, interoperable, and reusable. And in some ways, the need to adhere to those principles is one of the things that many people say will actually mean that open data will have true value. One person’s value is always the foundation of another person’s knowledge.
So what you can do in many cases is get the primary value out of a piece of information, so energy consumption, metering, billing. But what can you then do with that data as the next step? Can you actually think about how you connect together energy consumption with behaviour? And from that point of view, how do different people actually class energy consumption? What do they count as important in their own behaviour within either the home, the business, or elsewhere? Whenever you have a piece of open data, there is undoubtedly a transformation that you’re going to need to be able to do on it.
As part of that, ensuring that you have a good understanding of the tools, the processes, that you’re going to use on that data is incredibly important. And you should be utilising software management best practices to ensure that those tools are captured, be that putting them in a repository, be that making sure that they’re actually coded in a manner that means that somebody else is going to be able to use them. Discovery is still a great problem. There are a number of well-known places: the Open Data Institute, CEDA Archive for Natural Environment Research Council-funded information. But actually, when starting to work on a broader scale, it becomes more difficult.
There is no standard around where data repositories are on an international scale. So from that point of view, in an earlier project of ours that was looking actually at the impact of deforestation and whether it could be traced to end consumer products, there we actually had to dig in to try and find open data on an international scale. A great problem around the idea of data more generally, and in particular open data in research, is if you spend or invest an awful lot of time in creating an experiment, do you really want to actually open it out and let other people who haven’t invested in it actually then have easy access?
There we’ve got now the idea that actually data can be published, can be attributed, in exactly the same way that a scientific publication is. And that comes down in many ways to this idea of licensing data. I mentioned earlier about what makes open data. And from that point of view, the most important part is not, well, I shoved it up on a website and then someone could go and look at it. It’s I put it somewhere where someone could find it. It had the metadata with it, but it also had licensing information and attribution requirements. But the most important thing is actually when people go to start using that information, do they have the provenance?
Are they able to say, actually, that this data is exactly the data that was delivered, that it hasn’t been altered in any way?
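One common way to check that a downloaded file is exactly the file that was published is to compare a cryptographic checksum against a value recorded in the dataset's metadata. The sketch below is only an illustration of that idea, not a method described in the video: the metadata field names (`licence`, `attribution`, `sha256`) and the function names are assumptions made up for this example, and a matching checksum covers only the "has not been altered" part of provenance, not where the data came from.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, metadata):
    """Return True if the local file's digest equals the one in the metadata.

    `metadata` is a hypothetical record published alongside the data; it is
    assumed to carry the expected digest under the key "sha256".
    """
    return sha256_of_file(path) == metadata["sha256"].lower()


# Hypothetical metadata record: every field name and value is a placeholder.
example_metadata = {
    "title": "Example dataset (placeholder)",
    "licence": "CC-BY-4.0",
    "attribution": "Name of the publishing organisation (placeholder)",
    "sha256": "digest published by the data provider goes here",
}
```

A user would call `matches_published_checksum("downloaded_file.csv", metadata)` with the real record from the repository, and treat a `False` result as a sign that the file differs from what was published.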
A scientist's perspective: open data
Watch David Wallom, Associate Professor at Oxford e-Research Centre, explain the philosophy behind open data and how it can be applied to research.
Are you aware of any projects which use open data? Share these in the discussion below.
© University of Reading and Institute for Environmental Analytics