Netflix outage a taste of crisis: when hosted applications go bad
When Netflix began encountering database errors earlier this week and its shipping algorithms started to malfunction, many customers were hit with delays in receiving their DVDs. Netflix is a great example of a successful business built on a hosted, or rented, model: customers borrow discs for a while, then return them in exchange for more. But what happens when the system goes down? Though less critical, it is similar to losing electricity.
And that’s where a lot of applications on the web are headed. Consider most of Google’s software: it’s hosted on the company’s servers and runs in the web browser. Google Gears promises to run some of those applications offline, but its use isn’t widespread yet.
That means everyone using Gmail, or Google’s mainstay, search, loses the ability to do anything without a live connection to the company’s servers. Likewise, Netflix customers quickly lose the ability to watch movies regularly if the shipping algorithm stops working normally.
Netflix is taking some positive steps to minimize the impact, though, including a credit to affected customers and regular updates on its blog. In fact, Mike Osier, head of IT Operations, just posted an explanation of the outage:
On Monday, 8/11, our monitors flagged a database corruption event in our shipping system. Over the course of the day, we began experiencing similar problems in peripheral databases until our shipping system went down.
. . .
With some great forensic help from our vendors, root cause was identified as a key faulty hardware component.
It’s great that Netflix is being so transparent about the issue, because that engenders trust with its customers. But it’s impossible to avoid the realization that with rent-instead-of-own and hosted services, the customer becomes completely dependent on the provider.
Of course, Netflix may be able to reduce that problem by accelerating its conversion of DVD video to online downloads. The impact of an outage would then be smaller, because transportation time is almost zero.