Skip to 0 minutes and 4 seconds
Hi, Tobias. So how did you enjoy this week's discussions about big data and the stock market?

Hi, Chanuki. Excellent, it has been really great. It was wonderful to see all the fantastic comments on the platform, and it was really exciting.

Great. So like last week, I've collected a few questions which learners were bringing up. So shall we get started?

That sounds great. Go ahead.

OK. So a few learners were wondering, why is it necessary to investigate phenomena that seem to be obvious? I mean, is there any value in such analyses?

I think that's a really, really good question. The question is, what is really obvious, right?

Skip to 0 minutes and 40 seconds
I mean that's something which very, very soon becomes a very, very complicated question if you think about scientific advance in general, right? I mean, in the particular context of the stock market, I've seen some questions on FutureLearn where basically people were arguing, oh, so basically, searches which are financially related might be related to subsequent market moves. And that's actually obvious, isn't it? And actually I think it's not really obvious, because we need to think about the question of how all these searches are being generated, so why are people actually searching for this in the first place, and also the kind of system we are dealing with.

Skip to 1 minute and 21 seconds
I mean, we are presenting the learners throughout all these weeks with examples about human behaviour, and human behaviour in general is a very complicated thing, because people are not necessarily behaving in the same way tomorrow as they did today. But also in particular the stock market, that's a system where it's particularly difficult to say something about subsequent behaviour, because there's a clear incentive to basically exploit any kind of findings which might give you a clue about subsequent market moves by any kind of professional organisations, traders and banks. And so that's very, very, very complicated. And any kind of pattern that you might find also might not last very long because people are adapting and they're actually incorporating this phenomenon.

Skip to 2 minutes and 14 seconds
And so to basically come back to the origin of your question, so basically what is really obvious? In the area of stock markets, there's not really anything obvious. And even in the general area of studying human behaviour, there might be things which are at first glance obvious, or at least they appear to be obvious, but then to really find quantitative evidence for a phenomenon which you have hypothesised or which you have basically thought about by experiencing or observing the world around us, then there is still an important step to go, and basically to carry out a rigorous analysis and actually find evidence which really suggests that this pattern exists.

Skip to 3 minutes and 7 seconds
And the question of how to deal with it in the future is something in addition to that which comes on top of all of this. But it's really, really not obvious that a pattern between what we are doing online or what we are doing with social media networks, in particular, is actually related to any kind of phenomenon in the world. That's the reason why we need to study it in the first place, and why it is important, because it can give us very, very important insights.

Skip to 3 minutes and 35 seconds
And as we have seen before, it might really help us to increase efficiencies, so we might be able to say something a bit quicker about what might be going on in the real world right now, and it might give us some clues about subsequent behaviour, but only if we actually have carried out an analysis confirming that this particular type or subtype of phenomenon online relates to a particular real-world behaviour.

Great. So many of the learners really enjoyed learning about trading strategies. But isn't it dangerous that, once someone knows of a successful trading strategy, companies can easily manipulate it?

That's another very, very good question, Chanuki.

Skip to 4 minutes and 17 seconds
Basically, as I just said, financial markets in particular are incorporating new information extremely quickly because there are people who want to make profit based on this information. And obviously here, the possibility of manipulation is something which comes to mind. If there is a very stable pattern over time and more and more people are becoming aware of the possibility to use, for example, whether people are searching on Google for more financially related keywords in one week compared to another, then this might be actually something which is picked up by a number of institutions in the financial world. And basically people start to act on this, as we know happened to some extent.

Skip to 5 minutes and 6 seconds
And then the question of manipulation comes very strikingly, because basically, if lots of people know that lots of other people, or at least a significant number of other people, are using this information in order to make financial decisions, then there's a temptation to think about the possibility of manipulation, in particular in terms of Google searches. I mean, you could think of maybe setting up server farms around the globe and just generating random searches for financially related content and then basically trying to manipulate this signal. So from that point of view, there's always this risk, and there's also the risk for many, many other platforms.

Skip to 5 minutes and 47 seconds
It's not only the risk of targeted manipulation, but also the question of whether the platform still exists at all, or whether there's a power outage or any kind of change in the way we record online information, and also in how individuals are actually interacting with these platforms in order to generate material. So as I have outlined before, there is always the need for actually finding adaptive algorithms which are able to capture changes over time, and maybe also are able to spot possibilities or risks for possible manipulation. But beyond that aspect of manipulation, we also can clearly say that there are clear changes over time going on.

Skip to 6 minutes and 34 seconds
If you think about the simple examples, or this simple trading strategy, in fact, where we have used keywords like debt or crisis or others, we have clearly seen that these strategies have been successful during a certain period, so in particular during the financial market crisis, in the two years 2008 and 2009. But then after that, actually there was a certain flattening in terms of the return curve, if someone were to go back to these slides. And this is highlighting that also over time there are changes in society-- not only the risk of manipulation but also just the general risk of actually society moving on in terms of topics discussed.

Skip to 7 minutes and 17 seconds
And so this highlights again that a very adaptive approach is needed, and careful consideration is required in order to really incorporate this into trading strategies. In particular, if learners out there are thinking of using this in their daily investment over time, then they should be really, really careful and test their strategies over a long time period, not only in a historical analysis, but also out of sample: what they would generate if they actually made decisions today, and then over a certain time period record how this strategy would have performed based on that.

So you touched on this already-- human behaviour just seems so difficult to predict.

Skip to 7 minutes and 59 seconds
So people can sometimes act irrationally, and as you mentioned before, people can easily change their minds. Other than stock markets, in general, how do you go about predicting human behaviour when it just seems so difficult?

Yeah, that's a general challenge at any rate. I mean, we are not dealing with a natural system, so to say, where basically this is governed or ruled by some laws of physics, how matter behaves or how atoms are clustering together.

Skip to 8 minutes and 33 seconds
And so we are generally facing the problems of dealing with systems which are, on the one hand, aware of any kind of analyses which are carried out about them, and on the other hand, the second issue is that they might also be able to take on board predictions you make about the system. So it's not like I'm predicting the weather is going to be better tomorrow and the weather doesn't care and basically it's raining anyway, and the sun will come out one week later. So basically, as soon as we predict something about human systems, then this might affect the way these humans as a group, not necessarily individuals, are behaving in the near future.

Skip to 9 minutes and 22 seconds
But also there, we need to differentiate. Basically there are differences between systems we just talked about, stock markets, where there is a high degree of motivation to incorporate any kind of predictions, and others where we would maybe wish there were an effect of incorporation. Basically if you think of traffic jams, particularly the one outside of the office which we try to avoid in the evening and in the morning when we come to work, and basically if I'm going to predict there's a traffic jam tomorrow morning outside of the office, then this is unfortunately not really going to work.

Skip to 10 minutes and 0 seconds
As we said before, basically, given that there are certain restrictions in daily life which govern our everyday routines and our interests, there are many, many phenomena out there where basically these predictions would also have a huge impact in terms of removing the pattern. So the bottom line of all of this is we actually need to be very aware and take into account in our analysis either that behaviour might be changing, using, as I mentioned before in response to the other question, some adaptive algorithms, and further on in this course we will see a few examples.

Skip to 10 minutes and 36 seconds
Or we need to actually very clearly identify examples and see in these examples in which way your behaviour might actually not change, because basically either it's a question of convenience or it's a question of daily routine, which is really, really hard to change, or it becomes just subject to human nature, basically. We know in many, many examples that we are driven by habits. And basically these habits also to some extent reveal a lot about ourselves.

Skip to 11 minutes and 5 seconds
Think for example about credit card purchases, and there are very nice examples out there, where you basically anonymise these data sets and you try to re-identify an individual based on what this person has purchased over time and where, basically, so the locations, too. You take this anonymised data set and you just take three or four observations in time and in space, so meaning that you maybe went to a coffee place and bought a coffee for a certain amount, and maybe somewhere else you had lunch. And basically three or four observations in time and space are enough to re-identify you with a very, very high probability.
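The re-identification idea described here can be illustrated with a small sketch. It is written in Python for illustration (the course exercises themselves use R), and everything in it is an assumption made for the example: the function names `matching_users` and `unicity`, the representation of a purchase as a `(place, day)` pair, and the made-up `toy_traces` data. It is not the method used in the studies mentioned; it only shows the basic counting logic of asking how many people in a data set are consistent with a handful of known observations.

```python
from itertools import combinations


def matching_users(traces, known_points):
    """Return the ids of users whose trace contains every known point."""
    known = set(known_points)
    return [user for user, points in traces.items() if known <= set(points)]


def unicity(traces, k):
    """Fraction of k-point combinations that single out exactly one user.

    For every user, every combination of k of their own points is treated
    as the "known observations" and checked against the whole data set.
    """
    unique = 0
    total = 0
    for points in traces.values():
        for combo in combinations(sorted(set(points)), k):
            total += 1
            if len(matching_users(traces, combo)) == 1:
                unique += 1
    return unique / total if total else 0.0


# Made-up, "anonymised" purchase traces: each point is (place, day).
toy_traces = {
    "A": [("cafe", "Mon"), ("deli", "Mon"), ("gym", "Tue")],
    "B": [("cafe", "Mon"), ("deli", "Mon"), ("cinema", "Wed")],
    "C": [("cafe", "Mon"), ("bakery", "Tue"), ("gym", "Tue")],
}

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, unicity(toy_traces, k))
```

In this toy data the fraction of uniquely identifying combinations grows as more observations are known, which is the qualitative point being made in the interview. Real data sets are far larger, so the numbers here say nothing about actual re-identification rates.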

Skip to 11 minutes and 45 seconds
And just these examples highlight that there are many, many aspects about daily life where it is really, really difficult for us to change, because we are driven by habits. And so there's an entire spectrum of human phenomena which we try to analyse and predict. Some of them might change, and then we need to take it into account, and some others are actually quite impressively stable patterns throughout time.

Great. So finally, we've all seen that the learners are making excellent progress in R. Some people are having some errors that come up, and I'm just wondering, can you give them some advice?

Oh, right, right, right.

Skip to 12 minutes and 21 seconds
It's really great to see that there's so much progress and that people are also enjoying it, to learn how to get basically their hands dirty in analysing data and doing their first steps in analysing data about human behaviour online. So possibly, it's an extremely natural phenomenon, I can say, and you can hopefully confirm, that you get errors while you are actually trying to put analyses together. So it's not something to be scared of. Basically it's just part of the process. You are writing a line of code, and there's maybe a typo or there's maybe something missing or something else, and this produces an error. So maybe have a quick look at what is actually going on in this line.

Skip to 13 minutes and 8 seconds
Does it make sense to you? If not, and you get this error message, one very, very helpful and useful thing is actually to Google this error message, because there are so many people out there, a huge community of programmers who are asking questions, but also who are answering questions on many different platforms on the internet. And you will be surprised, even if you get a very, very cryptic error message, there are many, many hits which Google, for example, gives back to you and suggests pages to look at.

Skip to 13 minutes and 39 seconds
And there are even discussions amongst programmers and new people who want to learn to programme about what possible solutions might be, or in the first instance, what actually the problem is, to understand a little bit better, because sometimes these error messages don't necessarily mean so much to you in natural language, so to say. So basically the advice is not to be scared, basically to Google the error message, and basically take any feedback on board. And then it's also easier to describe what the error in your code might be. And then you can also express yourself better.

Skip to 14 minutes and 14 seconds
For example, if you write to others on the FutureLearn platform in the comments section, they might then be able to help you a little better.

Great, thank you. So it's been another great week. And great chatting with the learners again. So please keep those questions coming in. So I'll see you next week.

Absolutely. Next week will be exciting. So we will dig into the issue of crime, and what actually human behaviour centred around crime can actually-- or in which way we can analyse it, to be precise, and how this is actually feeding into some new ways in which the police might be able to fight against crime. And there are lots of new packages and algorithms out there.

Skip to 15 minutes and 0 seconds
So that's something we will explore next week. I hope you will all enjoy this. And we are looking forward to all the questions which are coming back and all the activity on the platform. See you next week.

Week 3 round-up

In Week 3, we began to explore how big data might help us understand and even predict behaviour in the stock markets. Here’s a brief summary to help you prepare for Week 4.

Gene Stanley talked to us about how he and his colleagues use big data in combination with approaches from physics to help us understand infrequent but catastrophic stock market crises. You also learnt how data on information flow via the Financial Times, Wikipedia and Google can be linked to trading patterns in financial markets. Finally, we described our own findings which suggest that changes in searches for financial and political information on Google and Wikipedia may have contained early warning signals of stock market moves. Please do heed our warning to be very careful if you’re considering trading yourselves, however!
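To make the warning about testing a little more concrete, here is a minimal sketch of the difference between a historical (in-sample) result and an out-of-sample check. It is written in Python for illustration rather than the R used in the course exercises. The trading rule is an assumed simplification of the idea of comparing search volume in one week with earlier weeks: go short for the following week if this week's search volume is above the average of the previous `lookback` weeks, otherwise go long. The function names, the candidate `lookbacks` and the numbers in `toy_volumes` and `toy_prices` are all hypothetical and made up for the example; they are not the course's data or results.

```python
from statistics import mean


def strategy_returns(volumes, prices, lookback, start, stop):
    """Weekly returns of the toy rule for decision weeks start..stop-1.

    A decision taken in week t uses search volumes up to week t only and
    is realised by the price change from week t to week t + 1.
    """
    returns = []
    for t in range(max(start, lookback), min(stop, len(prices) - 1)):
        baseline = mean(volumes[t - lookback:t])
        position = -1 if volumes[t] > baseline else 1  # short or long
        returns.append(position * (prices[t + 1] - prices[t]) / prices[t])
    return returns


def cumulative_return(returns):
    """Compound a list of weekly returns into one overall return."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


def choose_then_test(volumes, prices, split, lookbacks=(2, 3, 4)):
    """Pick the lookback that looks best before `split`, then test it after.

    Only prices from weeks before `split` are used to choose the lookback;
    the out-of-sample figure uses prices the choice never saw.
    """
    first = max(lookbacks)  # same in-sample decision weeks for every candidate

    def in_sample(k):
        return cumulative_return(
            strategy_returns(volumes, prices, k, first, split - 1))

    best = max(lookbacks, key=in_sample)
    out_of_sample = cumulative_return(
        strategy_returns(volumes, prices, best, split - 1, len(prices)))
    return best, in_sample(best), out_of_sample


# Made-up weekly search volumes and index prices, for illustration only.
toy_volumes = [10, 12, 11, 15, 9, 14, 13, 8, 16, 12, 11, 17]
toy_prices = [100, 101, 99, 102, 98, 101, 97, 99, 103, 100, 102, 104]

if __name__ == "__main__":
    print(choose_then_test(toy_volumes, toy_prices, split=8))
```

A lookback that looks good on the historical weeks can perform quite differently on the later weeks, which is why the in-sample and out-of-sample figures are reported separately. The sketch also ignores transaction costs and many other practical details.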

You’ve come up with some excellent suggestions of other data sources that might offer insights into stock market movements – well done. We’ve also seen some great discussions of what might make certain kinds of collective behaviour easier to predict than others. We’re really delighted to see you all getting such a great grasp on this material.

Keep up the fantastic progress with your own analyses of Wikipedia data in R too! We know how useful these skills are in a wide range of areas, and so it’s particularly exciting for us to see you all picking this up.

You’ve been doing a brilliant job of helping each other with any error messages you’ve received while working with R. To make it easy for others to help you find the problem in your code, it’s always a good idea to detail all of the commands you typed in for this exercise before the error occurred, as well as the exact error message R has given you. You might also be surprised at how good Google is at decoding error messages in R and offering useful advice, if you try just copying the error message and pasting it in as a search query – it’s quite possible that many others have seen your error message before.

We very much hope you’ll find that you can build on the practical skills you’ve learned here following the course. Enjoy this week!

This video is from the free online course:

Big Data: Measuring and Predicting Human Behaviour

The University of Warwick
