Server Farms, the Internet’s Space, and Irretrievable Data Loss
I recently read two interesting and related articles that got me thinking about massive redundancy, power consumption, and the move toward less power-hungry CPUs. The move toward more efficient CPUs means less focus on clock rate, which is a fundamental shift in the way we’ve historically looked at R&D for CPUs.
Today Google rules a total database of hundreds of petabytes, swelled every 24 hours by terabytes
Source: Wired’s article on cloudware
And I’m sure this won’t seem like much in the near future. There’s a 1-terabyte hard drive coming out soon, or possibly it’s already out and I’m behind the times yet again.
That’s one exciting thing about this field of technology: constant, 24-hour, ceaseless innovation and evolution. Just 20 years ago I had a Commodore 64. It was called a 64 because it had 64K of RAM. For a refresher course, after kilobytes come megabytes, then gigabytes (my MacBook hard drive now is 80GB), then terabytes, then petabytes (Google houses approximately 200 petabytes), then exabytes. And I don’t even know (right now) what is after that, but I’m sure we’ll be there sooner than we think. So much information.
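For the curious, here is that ladder of units as a few lines of Python. Treat it as a rough sketch: it assumes decimal units (each step is a factor of 1,000, the way drive makers count), the names UNITS and human_readable are made up for illustration, and it rounds the Commodore's 64K down to an even 64,000 bytes. It also answers the question above: the SI prefixes after exa are zetta and yotta.

```python
# Decimal (SI) storage units, smallest to largest.
# Assumption: each step is a factor of 1,000, not 1,024.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def human_readable(num_bytes: int) -> str:
    """Format a byte count using the largest unit that keeps the value at 1 or more."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1,000,
        # or at the last unit in the list if the number is enormous.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(human_readable(64 * 1000))       # Commodore 64 RAM, simplified to 64,000 bytes
print(human_readable(80 * 1000**3))    # the 80GB MacBook drive
print(human_readable(200 * 1000**5))   # the roughly 200 petabytes attributed to Google
```

By this count, the 80GB drive holds 1,250,000 times what the Commodore's memory did.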
What delights me about so much information, so much data, is the chance to analyze the data for trends never before realized in the history of mankind. From the mundane (what time do most people watch videos of dogs skateboarding?) to the more business-oriented intelligence (how many times does a person look at a product before they buy?), this stuff is interesting. And, to get a bit more lofty, perhaps it reveals that as a global community, human beings aren’t all that different from one another. Same lumps of flesh with different nooks and crannies and variances, but basically the same computers in our skulls…
The other article is about the dangers of data corruption. What would happen if Google lost its massive redundancy and, with it, all of that information? Sure, the web could be scoured again, and sure, this information exists elsewhere. But what about a massive power failure combined with critical hard drive corruption, what the founder of CouchSurfing.com describes as “the perfect storm” of irretrievable data loss? Here is his letter of regret about how two negligent system administrators managed to bring down a business with 90,000 registered users overnight:
TechCrunch DeadPool: Couchsurfing Deletes Itself, Shuts Down
Imagine if the web went down, even for a day. How would you look up a phone number? Find driving directions? Buy a new book? Read the news? I guess you’d have to go take a walk and tend to your garden.
