Tools, Techniques and Technologies used in Data Science

Watch this video of Adam Amos on the tools and best practices in data science.
ADAM AMOS: So what tools should a company use to start getting into data science or even scaled data science? Well, to get started, there’s one thing that everybody’s got on their computer right now, and it’s good old-fashioned Microsoft Excel. Everywhere I go, every company has the ability to operate Microsoft Excel. And by its nature, then, it makes everybody in the company a data scientist. Excel is a great place to start, and it’s a great tool to build on. There’s a raft of other tools out there that are highly specialized, but to get going, particularly from your earliest student days all the way through to doing a lot of jobs, Microsoft Excel will be your great ally.
The tool I recommend for aspiring data science professionals is gonna be Python. And the reason I’m recommending a programming language rather than a specific prebuilt tool is that you can take Python and build the tool that you want at that moment, and there’s really no barrier to entry. You can go right now and download Python IDEs, or development environments, and get started right away. And you can build a tool that can be turned into a distribution that you can then send around, and so scale out the impact of your data science and your data analysis.
So the best practice for implementing data science in an organization is to very clearly outline the goal: what the data scientist is trying to achieve. Very often I encounter data scientists in large organizations that are swimming in the data lake, I like to call it, but it’s more like being adrift on the data lake, not really sure what they’re looking for and without a clear idea of what the complete dataset looks like. And as a result, they very rarely achieve a significant result in any meaningful amount of time.

We’re entering a new world in which data may be more important than software.
Tim O’Reilly
