So Justin, say you’ve got a government organisation or another institution and you’ve decided that they might have some useful data sets. What do you look for when you’re evaluating that data to see if it’s of any use to you?

Yeah. So there are a few steps that we go through in order to figure out whether we should onboard a source of open data. The first is looking at the characteristics of the data itself.
And there are really five criteria that we pay attention to when we evaluate a source of data. There’s the accuracy of the information, or how accurate we judge it to be. There’s how complete it is, so the fill rates for the various data fields that are being reported on. There’s the temporality of the data, and by that I mean its timeliness, frequency and latency: how fresh the data is and how often it’s updated. Then there’s the uniqueness of the data. And largely, open data is not going to be that unique. That kind of comes on the tin, in the sense that it’s open.
So, you know, it’s available to anybody. But oftentimes we’re able to combine a source of open data with another data set we have to create something unique. So you also look at the way that the data might be combined with other data assets that you have to deliver something of value. And the final thing is just the consistency of the data. So that could be the format that the data is presented in, or consistency across the data set. So once you’ve evaluated it on that basis, you then look to see what’s involved in actually onboarding and linking that data. So at Duedil, we’re creating the world’s largest source of private company information.
And we do this by gathering together and linking authoritative sources from not just open data, but all types of data. And so when we’re looking to bring on open data, we want to see how it will link up with these other sources of data that we have and how much of a challenge that will be. So there’s definitely a risk around that, I think. That’s a risk that we try to mitigate as much as possible. And I think part of it is about being open with your users about what the sources of data are, and notifying them if there are issues or delays around the delivery of a particular data set.
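As a minimal sketch of two of the criteria described above, completeness (fill rates) and temporality (how fresh the data is), the Python below computes a fill rate per field and the number of days since a data set was last updated. The sample records, field names, dates and function names are all hypothetical, chosen for illustration; they are not taken from the process described in the interview.

```python
from datetime import date

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"company_id": "001", "name": "Acme Ltd", "postcode": "AB1 2CD"},
    {"company_id": "002", "name": "Widget plc", "postcode": ""},
    {"company_id": "003", "name": None, "postcode": "EF3 4GH"},
    {"company_id": "004", "name": "Gadget Ltd", "postcode": None},
]


def fill_rates(rows, fields):
    """Return, for each field, the share of rows with a non-empty value."""
    rates = {}
    for field in fields:
        # Treat a missing key, None and the empty string as "not filled".
        filled = sum(1 for row in rows if row.get(field) not in (None, ""))
        rates[field] = filled / len(rows) if rows else 0.0
    return rates


def days_since_update(last_updated, today):
    """Return how many days old the latest update is (a simple latency measure)."""
    return (today - last_updated).days


rates = fill_rates(records, ["company_id", "name", "postcode"])
# Assumed dates: the publisher last updated on 1 Jan, and we check on 31 Jan.
staleness = days_since_update(date(2024, 1, 1), date(2024, 1, 31))

print(rates)
print(staleness)
```

A low fill rate on a field you need for linking, or a staleness well beyond the publisher's stated update frequency, would be a reason to look more closely before onboarding the source.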
So is open data quality, especially from government departments, improving over time, or is it sometimes a little bit sketchy still?

Well, it’s definitely getting better and better. Often when people first get involved in open data, they just publish PDFs, which we like to call paper behind glass. And they’re just saying, we published it. There’s data in that. Obviously, the value of open data increases if the data quality is good, if it’s easily available, and if it’s in useful open formats. And we’re certainly seeing that getting better. The UK is a great example, with gov.uk and the Government Digital Service.
Yeah, it’s been transformational over recent years, where it initially made an impact. And I think another factor in this is that younger people (I’m a little cautious about saying more digitally literate, but people who understand the value of this kind of thing) are starting to have roles in government, both local government and national government. Even at the Cabinet Office level, some of this is understood. So I think in the UK we’re definitely seeing that over the years there’s been a rapid improvement in the quality of the data, which is a shout-out for the ONS.
The Office for National Statistics have done a great job on that as well, understanding that the quality of the data needs to be good for this to be worth doing. And the quality doesn’t just mean the truthiness of the data or the timeliness of it, but also the usefulness of the formats and the ability to even find what you’re looking for. All these things are improving, definitely.
Checking Open Data sources
As we’ve seen already, Open Data can be highly useful, but only if it’s of good quality, sustainable and reliable. How do you check that Open Data is good enough to use in your products and services? What issues do you need to look for? Let’s hear what some of our experts do to check Open Data before they use it…
© Royal Holloway, University of London ¦ Attribution CC BY