Earlier this month, Microsoft’s much-hyped Azure cloud went down because somebody lit a cigarette or something and triggered a fire alarm that shut down the whole show for some hours. The official reason for the seven-hour downtime was: "During a routine periodic fire suppression system maintenance, an unexpected release of inert fire suppression agent occurred. When suppression was triggered, it initiated the automatic shutdown of Air Handler Units (AHU) as designed for containment and safety."
We’ve always mistrusted any tech company that says our data is safe in its hands, because it clearly is not. This is as trivial as the Chernobyl meltdown, as that incident started in the same way, with routine work on a safety system. According to Microsoft Azure's official status history, the service was down between 13:27 and 20:15 UTC on 29 Sep 2017. The company calls this the "summary of impact".
The term “cloud”, for instance, is a mistake, because the so-called cloud is misty at best and is really made up of giant server farms. So that’s hardware, and guess what, hardware fails, right?
Users affected by this minor glitch could not access their data for close to seven hours. That has a huge impact: think of a law office that cannot get at its files for seven hours. The server world, like the cloud, recognizes and relies on Service Level Agreements (SLAs). We know, as we have run servers for decades and are in the business of serving content to readers around the clock. When you get a 99.999 percent SLA, or five nines, you expect a reliable service that should not be down for more than 25.9 seconds in a 30-day month. Using an SLA calculator, we got a disturbing number: on that day, Microsoft Azure's availability was close to 71.7 percent, which is more than catastrophic.
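For readers who want to check the arithmetic, here is a minimal Python sketch of the kind of calculation an SLA calculator performs. It uses only the outage window from the Azure status history quoted above; the function names and the 30-day month are our own assumptions, not anything taken from Microsoft's SLA terms.

```python
from datetime import datetime, timedelta


def availability(downtime: timedelta, period: timedelta) -> float:
    """Percentage of the period during which the service was up."""
    return 100.0 * (1 - downtime / period)


def allowed_downtime(sla_percent: float, period: timedelta) -> timedelta:
    """Downtime budget that a given SLA percentage leaves over the period."""
    return period * (1 - sla_percent / 100.0)


# Outage window as reported in the status history (UTC).
outage_start = datetime(2017, 9, 29, 13, 27)
outage_end = datetime(2017, 9, 29, 20, 15)
outage = outage_end - outage_start  # 6 hours 48 minutes

# Assumption: a 30-day month, which is what gives the 25.9-second figure.
month = timedelta(days=30)

day_availability = availability(outage, timedelta(days=1))
month_availability = availability(outage, month)
five_nines_budget = allowed_downtime(99.999, month)

print(f"Outage length:           {outage}")
print(f"Availability that day:   {day_availability:.2f}%")
print(f"Availability that month: {month_availability:.2f}%")
print(f"Five-nines budget:       {five_nines_budget.total_seconds():.1f} s per month")
```

Measured over the single day, the outage works out to about 71.67 percent availability. Spread over a 30-day month it is about 99.06 percent, which still blows through a five-nines budget of roughly 26 seconds by several hours.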
In the publishing world, imagine the Wall Street Journal being down for seven hours: it could lose roughly one third of its revenue for the day. Of course, under its SLAs, Microsoft was probably paying some hefty penalties to its customers.
The worst thing is that this incident went mostly unnoticed unless you were one of the companies affected. Cloud computing is a big market in which Microsoft, Amazon, Google and IBM are among the biggest players, and they all compete on reliability.