Friday, December 7, 2012

Cloud SLAs: a technical point of view


When I started the first draft of this post, I had decided to write about Cloud SLAs (Service Level Agreements). I remembered what some friends of mine had said about their importance at the last ISACA conference I attended (although the subject was treated only lightly in the conference discussions, because other topics captured the attention), and in my view SLAs are almost nil in current Cloud contracts, as a recent Gartner study states. In fact, it is a subject I have addressed in previous posts: some in English (“Real” Cloud Computing Services vs. “Fake” Cloud Computing Services), and some in Spanish (El Cloud Computing y la ISO-20.000 e ITIL: conteniendo el sobre-aprovisionamiento – Capacity Management).

But, I don’t know how, a few lines later I was writing about Cloud risks in general, and a few lines after that I was thinking about security concerns, trying not to speak yet again only about privacy and regulatory compliance (the only things people think of in many discussions), but also about data loss, data integrity, and so on. Remember what CIA means in security: Confidentiality, Integrity and Availability, to which a second A for Authenticity is usually added, in both senses: to avoid repudiation and to grant access to information only to the right people.

Then I reminded myself that, although security is without doubt the most important Cloud risk, it is not the only one, and it is even losing relative importance compared with others. In fact, as time goes by, new risks are becoming more important for the CSOs (Chief Security Officers) of big companies that are already using Cloud services (75% of them “are confident in security of their data currently stored in the cloud”, according to a recent VMware report): portability, vendor lock-in, standardization, learning curve, and so on (see my Spanish post ¿”Nubarrones” en la Nube?, whose title means something like “Dark clouds over the Cloud?”). And we must not forget SLA non-compliance, which was the idea I was trying to write about in the first place.
 
And then, when I came back to the initial point, I decided to throw everything written so far (and the time spent) in the trash, to focus only on this subject without widening the scope (I’ll cover the other subjects in future posts), because otherwise this post would become confusing and difficult to understand, and to draw on what cleverer minds have written. So, here I go:

According to Gartner analyst Lydia Leong, Amazon Web Services (which Gartner recently named a market leader in infrastructure as a service cloud computing, and I think everybody agrees) has the dubious status of “worst SLA of any major cloud provider”.

However, HP’s newly available public cloud service could be even worse. By the way, HP announced the general availability of HP Cloud Compute on Wednesday, December 5th. HP Cloud Compute is a pay-as-you-go Cloud IaaS service that the company first announced earlier this year as a beta program, and it is based on OpenStack (my favorite open Cloud platform; I promise to explain my reasons in a future post and to compare it with other open platforms).
 
AWS has voluntarily refunded customers impacted by major downtime events even when the SLA did not require it (AWS has experienced three major outages in the past two years). But, as I have stressed, that was a voluntary decision, not a consequence of the signed SLA. In fact, Ms. Leong recommends investigating cyber risk insurance, which will protect cloud-based assets, because the SLA requirements basically render the agreements useless. “Customers should expect that the likelihood of a meaningful giveback is basically nil”, she said. The main reason for this statement is the strict service architecture requirements imposed by Amazon and HP:
  • Both AWS and HP impose strict guidelines on how users must architect their cloud systems for the SLAs to apply in the case of service disruptions, leading to increased costs for users.
  • AWS, for example, requires customers to run their applications across at least two availability zones (AZs), which are physically separate data centers that host the company’s cloud services. Both AZs must be unavailable for the SLA to kick in. Of course, that implies higher costs.
  • HP’s SLA applies only if customers cannot access any AZ, that is, if every AZ is unavailable. That means customers potentially have to architect their applications to span three or more AZs, each one imposing additional costs on the business.
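
To make the difference between the two eligibility rules concrete, here is a minimal Python sketch. It is only an illustration of the conditions described in the list above, not the legal text of either SLA: the names `SlaRule`, `sla_applies`, `AWS_LIKE`, `HP_LIKE` and the three-zone `REGION` are hypothetical, and the minimum of one deployed zone for the HP-like rule is my assumption, since the SLA condition as described only talks about all zones being inaccessible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaRule:
    """Hypothetical, simplified model of an SLA eligibility rule."""
    name: str
    min_zones_deployed: int               # zones the customer must run in
    requires_all_region_zones_down: bool  # True: every AZ in the region must be down


def sla_applies(rule, deployed, down, region_zones):
    """Return True if an outage qualifies for the SLA under this simplified rule."""
    deployed, down, region_zones = set(deployed), set(down), set(region_zones)
    # The customer must first meet the architecture requirement.
    if len(deployed) < rule.min_zones_deployed:
        return False
    # Which zones must be unavailable at the same time for the SLA to kick in.
    must_be_down = region_zones if rule.requires_all_region_zones_down else deployed
    return must_be_down <= down


# Assumed parameters, chosen only to mirror the description in the list above.
AWS_LIKE = SlaRule("aws_like", min_zones_deployed=2, requires_all_region_zones_down=False)
HP_LIKE = SlaRule("hp_like", min_zones_deployed=1, requires_all_region_zones_down=True)
REGION = {"az1", "az2", "az3"}  # hypothetical region with three zones

if __name__ == "__main__":
    # Single-zone deployment: never eligible under the AWS-like rule.
    print(sla_applies(AWS_LIKE, {"az1"}, {"az1"}, REGION))                # False
    # Two zones deployed, only one down: still not eligible.
    print(sla_applies(AWS_LIKE, {"az1", "az2"}, {"az1"}, REGION))         # False
    # Both deployed zones down: eligible.
    print(sla_applies(AWS_LIKE, {"az1", "az2"}, {"az1", "az2"}, REGION))  # True
    # HP-like rule: two of three zones down is not enough.
    print(sla_applies(HP_LIKE, {"az1", "az2"}, {"az1", "az2"}, REGION))   # False
    # HP-like rule: only a whole-region outage qualifies.
    print(sla_applies(HP_LIKE, REGION, REGION, REGION))                   # True
```

Under this toy model, the same two-zone outage qualifies under the AWS-like rule but not under the HP-like one, which is why the HP-like rule pushes customers towards spanning every zone in the region.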
However, this isn’t the only reason. Other aspects of the SLAs also undermine their effectiveness as a means of control and defense for the user: the SLAs are unnecessarily complex (“word salads”) and limited in scope. For example, both the AWS and HP SLAs cover virtual machine instances but not block storage services, which are popular features used by enterprise customers. AWS’s most recent outage specifically impacted its Elastic Block Store (EBS) service, which is not covered by the SLA: “If the storage isn’t available, it doesn’t matter if the virtual machine is happily up and running — it can’t do anything useful”.

In the next post I’ll return to this subject and analyze the contracts usually signed for Cloud SLAs: the terms they include, their impact, and how often they appear in contracts.
