The tips that I’d give myself when I started my data journey? It’s a really difficult question, because I would say don’t be scared of digging into it, but I don’t know if being scared might have actually helped me overcome my fear. I know, I’m sorry, I’m getting a bit philosophical, but I genuinely wouldn’t know, because I think sometimes when you’re scared of something you become so adamant to overcome this fear that you actually start learning more about the subject and becoming more knowledgeable, and I think that’s a normal part of the evolution. But what I’d say is, definitely don’t be scared like me, ‘cause I was terrified. The first instance when I had to manage a whole data team, it was like, “They’re gonna find me out in the next two minutes, and then what do I do?”
I’m like, “That’s it, I’m branded an imposter for life”.
I would give myself so many tips if I had to go back in time. I think the main one would be, and this is such a constant thing that people say all the time, so it shouldn’t really count, but it’s so powerful, it’s ‘perfect is the enemy of good’.
And I think in data science it’s a particular problem, because you have people coming in who actually understand how to quantify what good is and what perfect is, and the fact that you can always do something slightly better if you just did something differently, worked a little harder, spent a little bit more time. But at the end of the day, if you’re trying to solve problems for a business, you’re often just trying to give them, even directionally, some information that they didn’t have before, which would be better than nothing. You wanna be able to do that quickly, add value quickly, and not spend ages getting something completely right that ultimately will be too late to actually put into action.
And I think almost every single data scientist will fall into that trap at some point, usually early on in their career, but it’s almost unavoidable. So always keep that in mind. Also, data science can look very, very different in different organizations, so the definition of it, even just in terms of the role, what the company expects, and the output, can be very, very different. Like in a consultancy, you have the room to actually explore different questions and try to find something that might help the company that you’re working with.
You’re not constrained to one specific question, ‘cause if you can’t answer it, and sometimes you can’t, sometimes it’s not possible, you can still try to find something valuable and useful. In a software-as-a-service product, you need to create something that can be replicated and delivered across hundreds or thousands of companies or people. So you are a lot more constrained, and the questions that you’re answering and the way that you have to answer them are a little bit different. And then if you’re in a completely different industry, like health, accuracy is probably of the utmost importance.
So spending all the time answering this one thing, and just iterating and iterating and iterating in order to get the best answer, is what you need to do. So it’s funny, because you don’t think about it until you move from one company to another, and you’re set in your ways of… “This is what I need it to do.” Suddenly it’s, “Oh my God, I have to take all of these other things into consideration,” because this is a completely different environment, and the things that we need to solve are completely different, in a completely different way.
And I think being very much aware of that is important, especially for companies who want to start their data science arm. You have to be careful about where you look to get inspiration, and understand what the right processes, the right questions to ask, and the right expectations are, ‘cause they can vary quite dramatically.
For me, and maybe it’s ‘cause I have spent a lot of my career working in data quality and data governance, those things haven’t gone away, and in a way they’re more important now than they ever were. They often got parked in the ‘too difficult’ pile, with “we’ll handle that when we come to it”.
But actually, the benefit of looking at it up front, getting the data in a good state and being able to maintain it, is that it allows you to do so much more and gives you that certification, if you like, to see that the data is accurate. And that’s gonna be required with compliance things that are already in place, like GDPR. You have to be able to explain the answer that an AI model has come up with. Well, you can’t do that if you can’t guarantee the structure and the content of the data that went into that model in the first place.
I think there’s a couple of things when starting out on the journey. One is to remember that sometimes there can be a really simple solution to what can seem like a very difficult problem. And I’ve had that a couple of times in my career, where a very, very simple solution has actually fixed a very big problem. The other would be politics. Unfortunately, there’s a lot of politics in organizations, and you can choose to ignore them, but for a challenge like data, which involves everybody across the organization collaborating, you really need to be able to work within that and make that work for you within your organization.