News Story: Retail Servers/Web Sites Crashing on Black Friday


bigjimslade

I know that for the past few years, servers for the major stores have crashed on Thanksgiving Day and on Black Friday. This is a cool story and I hope IT managers will take note.

 

 

http://www.networkworld.com/community/node/20187

Downtime - one reason why we want to "manage" IT Operations

Submitted by Kerrie Meyler on Wed, 10/03/2007 - 1:20pm.

 

 

If you've read my bio, you will notice I wrote a book (actually I was the lead author) - MOM 2005 Unleashed. We're in the final stages of writing the follow-up to that, System Center Operations Manager 2007 Unleashed. I mention this because it probably would be worthwhile to post some articles related to operations management - what that means, what it's all about; and potentially to get specific about the Microsoft product (without stealing too much thunder from the book of course!).

 

I'd like to talk a bit here about why unscheduled downtime is a bad thing. Obviously it's not a good thing if you can't get to your email or run your business applications when you were expecting to, but it is always interesting to try to quantify why it's a bad thing - i.e., put some hard numbers to it.

 

We can start with a simplified example of the impact of temporarily disrupting an e-commerce site normally available 7x24. The site generates an average of $4,000 per hour in revenue from customer orders for an annual value in sales revenue of $35,040,000 US. If the website were unavailable for six hours due to a security vulnerability, the directly attributable losses for the outage would be $24,000 US.
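The arithmetic in this example can be reproduced with a minimal Python sketch. The function names are illustrative and not from the article, and the flat hourly rate is the same simplifying assumption the example makes (a 365-day year, revenue spread evenly across every hour).

```python
# Sketch of the simplified e-commerce example: flat hourly revenue,
# site normally available 7x24, non-leap year assumed.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def annual_revenue(hourly_revenue):
    """Annual sales revenue implied by a constant hourly rate."""
    return hourly_revenue * HOURS_PER_YEAR


def direct_outage_loss(hourly_revenue, outage_hours):
    """Directly attributable loss: average hourly revenue times hours down."""
    return hourly_revenue * outage_hours


print(annual_revenue(4000))         # 35040000
print(direct_outage_loss(4000, 6))  # 24000
```

This only covers the direct loss at the average rate; it deliberately ignores peak-time effects and knock-on costs.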

 

This number is only an average cost; most e-commerce sites generate revenue at a wide range of rates based on time of day, day of week, time of year, marketing campaigns, and so on. Typically the outage occurs during peak times when the system is already stressed, greatly increasing the cost of a 6-hour outage.

 

There are other costs incurred from an outage. Some customers may decide to find alternative vendors, resulting in a permanent loss of users and making the revenue loss even higher than the direct loss of sales. The company may decide to spend additional money on advertising to counter the ill will created when customers could not reach the site. The costs from our example 6-hour outage can thus be far higher than its simple hourly proportion of time applied to an average revenue stream.

 

Another case in point would be a large credit-card processing company that estimates it would stand to lose nearly $400,000 in direct revenue if it experienced a one-hour operational outage affecting its ability to process credit-card transactions. This number assumes an estimated cost of just over $1.00 per missed transaction, and does not include the inevitable decline in revenues due to a loss of confidence from clients were such an outage to happen.

 

Does this actually happen? Let's look at a real case: Black Friday in 2006. Black Friday is the day after Thanksgiving in the United States and is the busiest day of the year in the retail sector. On that particular Black Friday, the websites for two very large U.S. retailers (Wal-Mart and Macy's) were unavailable starting around 4:00 a.m. for approximately 10 hours, presumably from overload. While it is possible that the potential customers tried the sites again at a later time, it is also possible that they took their business to competitors.

 

There are two types of downtime of course - scheduled (which usually is a very small window in the wee hours of the morning on a weekend), and unscheduled - the type you can't plan around, which is what happened on Black Friday last year. Managing IT Operations means we want to take actions to mitigate the possibility of the unscheduled variety.

 

You may have heard of something called "5 9's of availability". This means that systems are up 99.999% of the time they are scheduled to be up, which works out to about 5 minutes of unscheduled downtime in a year. That would be high availability - sounds like nirvana! Getting 5 9's takes work to attain and maintain. Most companies are happy to get 99.9% uptime. This doesn't mean you don't take systems down for maintenance - but that you manage your systems to reduce the unplanned outages.
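To see where the "about 5 minutes" figure comes from, here is a small Python sketch that converts an availability percentage into the unscheduled downtime it allows per year. The function name is illustrative, and the sketch assumes a 365-day year with the system scheduled to be up around the clock; scheduled maintenance windows would shrink the base slightly.

```python
# Sketch: yearly downtime budget for a given availability target,
# assuming 7x24 scheduled uptime and a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent):
    """Minutes of unscheduled downtime per year at this availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% -> {minutes:.1f} minutes per year")
```

At 99.999% the budget is roughly 5.3 minutes a year, while 99.9% allows roughly 525.6 minutes, or a little under 9 hours.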

 

And in a nutshell, that's what operations management is all about.


Interesting...thanks for sharing.

You are welcome. This happens every year, and I know a few people who come on here get caught in the crash and do not get what they want. It is always safe to camp out. I hope the IT community will take note and work on this issue that happens every year.
