Skip to 0 minutes and 0 seconds If the computing power explosion and this digital evolution that we talked about were the kindling, the firewood, the fuel for this data explosion, the cloud was really the spark or the match that lit the explosion off. And the two biggest areas that the cloud has impacted, as it relates to the Data Explosion, are the cost of storing data and the cost of processing that data, the cost of running that computer processing power that we talked about. And we’re going to talk about both of those. But let’s start with storage. And as a transition there, let’s reset and take another look at this graphic that showed the volume and the velocity of data in this data explosion.
Skip to 0 minutes and 45 seconds Remember, by 2020, it’s forecasted that we’ll have 45 zettabytes worth of data globally. And that data needs to live somewhere. And historically, the growth in data was bound by what we could store locally. We didn’t have this concept of a cloud; we had hard drives. And you were bound by what you could store on your hard drive. So let’s harken back a little bit to that world. And this, I think, is a fantastic advertisement from NorthStar Computing Company that was in Creative Computing Magazine in 1980. And this was the absolute cutting edge of hard drives and computing power. And you can see they’re hyping this hard drive over here with the circles around it.
Skip to 1 minute and 29 seconds It is a cutting-edge, 18-megabyte hard drive. In 1980, that 18-megabyte hard drive cost $5,000. $5,000, in 1980, was a lot of money. And that was as far as you could go, right, so the explosion of data was constrained by that. You couldn’t have an explosion more than that, because that’s all you could store. And it was really expensive to store it, at that. Well, that has begun to change, as we all know today, with things like Dropbox. You could go to Dropbox, give them your email address, sign up for a free account, and get two gigabytes for free in the cloud. That’s 110 of these hard drives for free.
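The comparison above can be checked with a little arithmetic. This is a minimal sketch using only the figures quoted in the transcript (an 18-megabyte drive at $5,000 and a 2-gigabyte free tier). The decimal conversion of 1 GB = 1000 MB is an assumption, and the variable names are illustrative.

```python
# Figures quoted in the transcript.
DRIVE_MB = 18             # capacity of the 1980 hard drive, in megabytes
DRIVE_PRICE_USD = 5_000   # its price in 1980 dollars
FREE_TIER_GB = 2          # free cloud storage allowance mentioned

# Assumption: decimal units (1 GB = 1000 MB).
MB_PER_GB = 1000

# How many 18 MB drives it takes to hold the free allowance.
drives_equivalent = FREE_TIER_GB * MB_PER_GB / DRIVE_MB

# Cost per gigabyte implied by the 1980 drive price.
cost_per_gb_1980 = DRIVE_PRICE_USD / DRIVE_MB * MB_PER_GB

# What that many drives would have cost in 1980 dollars.
hardware_value_1980 = drives_equivalent * DRIVE_PRICE_USD

print(f"Drives needed for {FREE_TIER_GB} GB: {drives_equivalent:.0f}")
print(f"Implied 1980 cost per GB: ${cost_per_gb_1980:,.0f}")
print(f"1980 price of that many drives: ${hardware_value_1980:,.0f}")
```

Under these assumptions the result is about 111 drives, which matches the "110 of these hard drives" figure in round numbers.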
Skip to 2 minutes and 6 seconds So data has basically become free; data storage has basically become free. This graphic does a nice job of showing that evolution from local storage to cloud storage. As you can see here, only a few short years ago, less than 5% of the data was stored in the cloud. Everything else was stored locally. The forecast is by 2020, 40% of the world’s data is going to be stored in the cloud. What’s interesting is during the time period where we saw that exponential rise in the amount of data being produced and stored, we saw a commensurate decrease in the cost to actually store it. And the cloud was playing a big role in that decline in the cost to store data.
Skip to 2 minutes and 47 seconds And in this graphic here, we actually show that decline. So from 1980 to the present, what’s actually going on with the cost of data storage? This graph shows the cost of storing a gigabyte of data. And again, it’s on a logarithmic scale. So this trend, which looks roughly linear, is actually an exponential decline. If you go back to 1980, the cost of storing a gigabyte of data was the equivalent of a million dollars. Whereas today, it’s basically free. If it were really expensive for people to store data, they would store less of it. People would be much more thoughtful: OK, I have a thousand things I could store.
Skip to 3 minutes and 27 seconds But I really think I’m only going to need these ten, so I’m just going to store these ten, because it’s really expensive for me. Instead, because it’s basically free, people store all the thousand; they store it forever. Because maybe it’s going to be useful for me. Remember that stat that less than 1%, in fact, 0.05% of the data had been analyzed?
Skip to 3 minutes and 43 seconds That’s why: people are just storing massive amounts of data, because it’s essentially free. So this democratization, if you will, of storage and storage costs, making it essentially free for anyone to store massive amounts of data, has really fueled this data explosion. There are a couple of other places where democratization has happened, if you will. It used to be that only big corporations could afford hard drives to store lots of data, or afford the processors to actually crunch that data, or even buy the statistical software that could draw insights from it. That has fundamentally changed over the past decade. So we just talked about storage. And storage is now democratized. Anybody can go sign up for Dropbox or Amazon Web Services or Hadoop.
Skip to 4 minutes and 32 seconds All of that is open source. And anybody can go do that and store a bunch of data. But also, they can analyze that data now, with tools like R and Python, which are all open source. Anybody can go get them, from the smallest company, to the individual, to the big company. All of that is now democratized. And anybody can go get cutting-edge, best-of-the-best statistical software for free. And finally, you can now rent processing power. So instead of having to buy these processors and big computers yourself, you can go to Amazon or Google or any number of places, and rent servers in the cloud.
Skip to 5 minutes and 10 seconds So now, from the comfort of my own home, I can store as much data as I want, basically for free. I can use the best statistical software on the planet, for free. And I can rent processing power, almost for free. You bring all that together, and now the smallest companies out there, the ones trying to build and compete with these big companies, have access to all the same tools. That is really new, and I think the impacts of that are just now starting to be felt.
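The point made earlier about the logarithmic cost graph, that a roughly straight line on a log scale means an exponential decline, can be illustrated numerically. This is a minimal sketch, not the data behind the graphic: the 1980 starting cost of a million dollars per gigabyte comes from the transcript, while the 2015 endpoint of $0.03 per gigabyte and the constant yearly rate of decline are hypothetical values chosen only for illustration.

```python
import math

# Starting point quoted in the transcript.
START_YEAR, START_COST = 1980, 1_000_000.0

# Hypothetical endpoint, assumed purely for illustration.
END_YEAR, END_COST = 2015, 0.03

# Constant yearly multiplier that connects the two points exponentially.
years = END_YEAR - START_YEAR
annual_factor = (END_COST / START_COST) ** (1 / years)


def cost(year: int) -> float:
    """Modelled cost per gigabyte in a given year (exponential decline)."""
    return START_COST * annual_factor ** (year - START_YEAR)


# Sample every five years and take log10, as a log-scale axis would.
sample_years = list(range(START_YEAR, END_YEAR + 1, 5))
log_costs = [math.log10(cost(y)) for y in sample_years]

# Differences between consecutive log values: equal steps mean a straight line.
steps = [later - earlier for earlier, later in zip(log_costs, log_costs[1:])]

for year, log_cost in zip(sample_years, log_costs):
    print(f"{year}: ${cost(year):,.2f} per GB (log10 = {log_cost:.2f})")
print("Step in log10 per five years:", [round(s, 4) for s in steps])
```

Every five-year step in the log10 values is the same size, which is why an exponential decline shows up as a straight line on a logarithmic axis even though the dollar amounts fall by many orders of magnitude.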
Democratization and a look forward
There’s a good chance that open-source software and online tools such as Dropbox are a part of your digital life.
In this video, we examine how the emergence of these tools has also played a role in the Data Explosion.
© Kogod School of Business, American University