James Davenport

Professor of Information Technology at the University of Bath, in both Mathematical Sciences and Computer Science. Serves on ISO/IEC JTC 1/SC 42/WG 3: Trustworthiness of Artificial Intelligence.

Location Bath, UK


  • Certainly both good comments. But I think these aren't the only problems: keep looking!

  • By and large, yes. But a bad experiment poorly reported can have pretty negative consequences: look at the negative publicity for MMR because of a seriously flawed experiment. And that's far from the only case.

  • Nice find. But at least the headline is 'may': we often see much more definitive headlines :-(

  • That's an interesting observation - any supporting data?

  • Thanks for pointing that out. Proofreading is rarely perfect!

  • Certainly we can't make a 100% determination. These data are often used by social scientists, in my view somewhat dubiously. See https://www.nature.com/articles/s41586-022-04997-3 for an example. In my view, if anything this article actually shows that people whose friends have expensive mobile 'phones tend to have expensive mobile 'phones.

  • Sorry about that: Loughborough seem to have put this in. Here's an alternative link to the same material: https://nucinkis-lab.cc.ic.ac.uk/HELM/HELM_Workbooks_31-35/WB35-all.pdf

  • If in doubt, PUT brackets in!

  • Good that some of you have tried this - I hope the rest of you also tried it. Note that there ISN'T a winner. This is perfectly possible in real life as well, and is why there is no perfect method of voting, a result generally known as the Condorcet-Dodgson paradox: see https://en.wikipedia.org/wiki/Condorcet_paradox .
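A minimal sketch of how a vote can end with no winner, as in the comment above. The three ballots and the helper names (`prefers`, `condorcet_winner`) are hypothetical, chosen only to illustrate the paradox; they are not the data from the course exercise.

```python
# Hypothetical ballots: each list ranks candidates from most to least preferred.
ballots = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]


def prefers(ballot, x, y):
    """Return True if this ballot ranks x above y."""
    return ballot.index(x) < ballot.index(y)


def condorcet_winner(ballots):
    """Return the candidate who wins every head-to-head contest, or None."""
    candidates = sorted(ballots[0])
    for c in candidates:
        beats_everyone = all(
            sum(prefers(b, c, other) for b in ballots) > len(ballots) / 2
            for other in candidates
            if other != c
        )
        if beats_everyone:
            return c
    return None


print(condorcet_winner(ballots))  # None: A beats B, B beats C, C beats A
```

Each candidate wins one pairwise contest 2-1 and loses another 2-1, so the majority preferences form a cycle and no one beats all rivals.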

  • Indeed, and because they have the intersection of two sets, they could well inherit the biases of both. A cautionary tale!

  • Glad you enjoyed it. Yes, Python is relatively easy as computer languages go (said with feeling!)

  • 5 is 'True', because "False" is a string, and False is a boolean value.
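The quiz item being discussed is not reproduced here, so the snippet below is only a guess at the kind of expression involved. It is a small sketch of the distinction drawn in the comment above: in Python, "False" in quotes is a non-empty string and therefore counts as true, whereas False without quotes is the boolean value. The variable names are illustrative.

```python
text_value = "False"  # a str that happens to spell the word False
bool_value = False    # the boolean constant

# Any non-empty string is truthy, whatever characters it contains.
print(type(text_value).__name__, bool(text_value))  # str True
print(type(bool_value).__name__, bool(bool_value))  # bool False

# The two values are different objects of different types.
print(text_value == bool_value)  # False
```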

  • Really good idea to include units: see https://en.wikipedia.org/wiki/Mars_Climate_Orbiter .

  • Indeed. English is particularly ambiguous, as here, because 'bite' can be both a noun and a verb.

  • That's a question for the OU (or any other university you'd apply to) rather than us, I'm afraid.

  • Patrizia is certainly right to raise the concerns over privacy. And just saying 'blockchain' certainly isn't the answer.

  • Good comment about "statistically significant". But there isn't a simple 'magic bullet' here: the American Statistical Association has a good note here: http://www.amstat.org/asa/files/pdfs/P-ValueStatement.pdf. A (far too) common test is "p<0.05" meaning "has a less than 5% probability of occurring by chance". But Data Science and computer power make it...

  • Indeed, Data Science requires a range of skills, so good luck all!

  • As the old joke goes "He uses statistics the way a drunken man uses a lamppost: for support rather than for illumination"

  • Thanks Tom.

  • I think Laura has an important point here. The old phrase in computing was "garbage in, garbage out" (abbreviated to GIGO) and that's probably appropriate here.

  • Good, ambitious, questions. There will be several legal/ethical questions around collecting and analysing such data, but that shouldn't stop you from trying.

  • Nice motivation. The sport that's made the most use of big data/analytics is probably baseball: see for example https://hbr.org/2019/07/what-baseball-can-teach-you-about-using-data-to-improve-yourself .

  • The most important thing about a convention is that it should be applied consistently. The bigger the project the more important this is.

  • @NiallBuswell : this doesn't quite work. What happens if I say N/N/N/Y/Y - I get no drink but with both milks.

  • Also, interdisciplinary teamwork - it requires a broader skills background than one person normally has.

  • To all, but in response to this remark - that's one reason we built this course, to help people realise their gaps.

  • A good set of motivations so far - broad, but that's to be expected, as AI and DS are very widely applicable.

  • All good comments here.

  • Days of the week, certainly. Hence the original design has a problem, as several have mentioned. But also holidays (which can differ by country, in the case of an internationally-oriented website).

  • Good point - it could be argued that Facebook has hijacked the word "friend".

  • Good point. There are pros and cons to suppressing a very large item. I'd have been tempted to use the 'broken y axis' technique in this case. But hindsight is always better!

  • Appreciating that you have a lot to learn is part of the journey, and much better than not appreciating it.

  • Sticking with the coffee theme, there's a popular article on the "does coffee stunt children's growth" myth at https://www.livescience.com/coffee-does-not-stunt-growth.html : there's a correlation between coffee and osteoporosis, but that's because coffee drinkers tend to drink less milk.

  • Indeed, and without a great deal of care, AI can easily replicate, and quite possibly exaggerate, existing biases.

  • @MichaelMorehouse Do you want to drop me a mail (masjhd@bath.ac.uk) to take this one further?

  • @MichaelMorehouse Indeed - I was just starting to re-read this and had the same thought. But BMI is still in use, e.g. for Covid prioritisation: https://www.liverpoolecho.co.uk/news/liverpool-news/invited-covid-vaccine-because-nhs-19857990?utm_source=nsday&utm_medium=email&utm_campaign=NSDAY_180221

  • And a more general lesson is that there needn't be a clear winner. This shows up in PR voting as the Condorcet-Dodgson paradox: https://en.wikipedia.org/wiki/Condorcet_paradox

  • Absolutely right that there are a lot of assumptions, many of which are driven by availability of (quality) data. The technical phrase would be that we are using insurance data as a proxy for accident data.

  • And I'll be looking at today's comments

  • The fragmentation issue is an interesting one. A lot of health research (e.g. on alcoholism) comes out of the U.S. Veterans Administration because they have essentially perfect tracking of their patients across multiple hospitals etc.

  • Sally B: good points. There is a lot of work in "medical ontologies" (read 'structured vocabularies') to ensure that the same terms are used, but it seems to me, as one who follows ontologies but isn't a doctor, that these are of limited scope. "Cause of death" for example, is one where the principal cause is well-structured, but secondary conditions tend to...