So I think the main roles, if you’re setting up a data team, will roughly fall into three columns: analyst, engineer and scientist. And then you have your varying levels of seniority in there. If you’re first starting off building this team, one thing that I’ve seen is that companies can get very excited, but they’re also a little nervous and don’t really know what to expect. So they’ll just hire a junior or an intern. And that makes things extra difficult, because that person might not necessarily have the experience to say what the problems are, or even know the best way of doing something to get value faster.
So what I would suggest, if you do want to start the journey, is to actually make a bit of an investment and a commitment, maybe not necessarily in data science, but in a senior data analyst. Those kinds of people are the ones that understand the business questions. They might not necessarily be modelling anything, they might not be predicting anything, but they’re very good at understanding what the data is telling them: looking at that historical view, understanding how the data can connect and what you can actually do with it. That can really help bring out a lot of, how do people say, the low-hanging fruit, right? But that kind of stuff isn’t necessarily replicable all the time.
And it’s a bit of really manual work. And that’s when you can start thinking, okay, well, this has been really valuable. Do I need to do this all the time? What other work can I do? And that’s where the engineering, or data structure, or data architecture side of it comes in, where you want to get the data in a way that people can easily access, that it’s clean, that it’s reliable. And then you have your data scientists, who are more about: okay, once this data is there, what kind of models can we create to not only help us understand what happened before, but then make decisions for the future, and make that replicable, constantly checking itself, validating itself and changing as we learn more and more, and taking away a lot of that manual work from the initial analyst whose job it was to investigate what’s actually possible.
In terms of what I, or let’s say any company, can offer a data team: I’ve learned that data scientists are a unique kind of personality. And I think it’s been really interesting how data science as a field and as a career has been talked about. And we had that question earlier about hype.
And I think a lot of it is that data scientists are working on really fun, cutting-edge technologies, you know, neural nets, support vector machines, what have you. But the reality is that in most businesses and in most organisations, something very basic, something very simple that can be explained and run very quickly, is the most beneficial thing. And I think you’ll see a lot of data scientists come in and start being a little disillusioned, or a little disappointed, or a little bit bored with what they’re doing. And because the hype is so high and all companies want them, it’s very easy for them to jump around and try something new.
So I think it’s really important to understand what motivates data scientists. From what I’ve seen, at the core they’re people who really want to learn, and they constantly want to be learning and challenging themselves. So whenever I try to hire someone, I’m very clear about the kind of work that we do, so that there’s no miscommunication or lack of expectation management. But along with that, what we’ve introduced internally as a company is what we call learning time: giving everybody half a day a week to work on anything that they want, in any way that they want.
So if during the week they’re working on something and they have to use something pretty basic, because we need something quick, that time gives them the ability to use that data in a more fun model, let’s say, so they can learn how to use it, what the problems are, where the pitfalls are. And I think it helps both sides. They’re learning, and then they’re sharing it with the rest of their team, like once a month at a lunch and learn. And for the company, it’s a low-risk way of testing out new technologies, increasing accuracy and improving the results that we’re giving to clients.
One thing I would say as well, about the interview process: one of the biggest highlights of the company, of TV squared, I think, and it is for me at least, is that the people we work with are all very, very smart and very hardworking, and love helping and sharing their knowledge.
So during that interview process, I would suggest that the candidate actually just talks to a bunch of different people in the interviews. Make sure that it’s going to be at least two different people each time, so the candidate can ask what it’s like to work here and just get a feel for the kind of environment that we have and the kind of people that we work with. Because I really think one of the strongest values that a company can have is having great, smart people who want to help and share and collaborate. And I think that’s a really important thing for data scientists.
Typically, when interviewing someone for a data science position, I would suggest having someone who’s maybe a little bit more oriented on the career path and the actual motivations of the person, and then having someone who can ask a lot more technical questions. The technical questions are a bit difficult, because people can be very specialised in certain modelling or certain tool sets. And it’s good to get an idea of what that is, but that kind of stuff can be learned and adapted to quite quickly. In general, what I want to know is how this person thinks and what it is that they want to do.
For instance, I’ve had roles out for, let’s say, a data engineer, and I’ll ask them: okay, great, what do you see yourself doing on a day-to-day basis? What actually makes you really happy? And then it comes out that actually what they want to be is a data scientist, because they really care about the analysis and working with the models and understanding which one is best. And so at the end of the day, what they’re applying for isn’t right.
And you already have a mismatched expectation, which can be really, really unfortunate, like six months down the line. When it comes to asking them about the way that they think, it’s really about giving them a problem to solve. It can be completely conceptual: how would you tell us what the next best thing to do is with this particular campaign, or something? And if they immediately start going, well, I can use a random forest model, or I can use a naive Bayes to figure this out, you immediately kind of think, okay, that’s not really my question. My question is how would you go about solving this, not what particular model would you use.
It’s about getting them to think through: all right, in order to solve this problem, I need to understand why. What is the impact of it? What data do I have? How can I access that data? How clean is it? And how can I work with it to get to that final answer? And one thing I really like asking about, which I find doesn’t happen that often when I see other interviews, is how they would validate their work. I think the validation of the numbers tends to be completely forgotten about. You get so excited. You’re like, yeah, I’ve got this really great model, let’s just put it into production.
But it’s important to know if it’s of any value whatsoever, and how you would go about checking that. And you have your basic answers, like having a test set or a holdout set. But for certain problems, it usually goes beyond that, right? It’s about balancing how many things you get right versus how many things you get wrong, and whether that’s the right balance for the kind of risk that a company should be willing to take. So it’s those kinds of questions: less about the technical, this function does this, and more about the way that they think.
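To make the validation point a bit more concrete, here is a minimal Python sketch. It is not the speaker’s actual method. It assumes a simple yes/no prediction problem and that the business can put rough costs on the two kinds of mistake. The function names (`holdout_split`, `confusion_counts`, `error_cost`), the example labels and the cost figures are all hypothetical, chosen only for illustration.

```python
import random


def holdout_split(rows, holdout_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and set a fraction aside as a holdout set."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    # Returns (training rows, holdout rows); the holdout is never used to fit.
    return shuffled[n_holdout:], shuffled[:n_holdout]


def confusion_counts(actual, predicted):
    """Count how many predictions were right and wrong, by type of outcome."""
    pairs = list(zip(actual, predicted))
    return {
        "tp": sum(1 for a, p in pairs if a and p),          # correctly flagged
        "fp": sum(1 for a, p in pairs if not a and p),      # flagged by mistake
        "fn": sum(1 for a, p in pairs if a and not p),      # missed
        "tn": sum(1 for a, p in pairs if not a and not p),  # correctly ignored
    }


def error_cost(counts, cost_fp, cost_fn):
    """Weigh the two kinds of error by what each one costs (assumed figures)."""
    return counts["fp"] * cost_fp + counts["fn"] * cost_fn


# Hypothetical holdout labels and model predictions (1 = yes, 0 = no).
actual = [1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0]

counts = confusion_counts(actual, predicted)
# Assumption for illustration: a miss costs five times as much as a false alarm.
cost = error_cost(counts, cost_fp=1.0, cost_fn=5.0)
print(counts, cost)
```

The holdout split is the basic answer; the cost weighting is one simple way to ask whether the balance of right and wrong predictions is a risk the company should be willing to take.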