Several years ago, Amazon introduced a low-cost EC2 usage option called Spot Instances. Spot Instances give users the ability to bid on spare EC2 computing capacity and significantly reduce the cost of running applications compared to On-Demand pricing. But despite the huge growth in Spot Instance usage, many AWS customers still fear Spot Instances and avoid the Spot Market altogether.
Spot Instances differ from On-Demand Instances in that they are acquired through a bidding process, in which customers specify the maximum price per hour they are willing to pay. A purchased Spot Instance remains available as long as the customer isn’t outbid. Once the Spot price rises past the customer’s maximum bid, the instance can be shut down at a moment’s notice, potentially taking a company’s cloud computing environment down with it, so corporate decision makers are legitimately concerned that AWS’ Spot Market can be unreliable. In addition, when spare Spot capacity is fully used, availability is not guaranteed, leaving cloud customers with service interruptions as they scramble to switch to another instance. Data consistency, data loss, active sessions and in-flight HTTP requests are also concerns; for example, it is often unclear what happens to data on attached network drives when a Spot Instance ends.
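To make the bidding rule concrete, here is a minimal sketch of the logic described above: an instance keeps running until the Spot price rises past the maximum bid. The function names and the price history are invented for illustration and are not part of any AWS API; real Spot price feeds and interruption handling are more involved.

```python
from typing import Optional, Sequence


def first_interruption(hourly_prices: Sequence[float], max_bid: float) -> Optional[int]:
    """Return the index of the first hour whose Spot price exceeds the bid.

    Returns None if the price never rises past the bid.
    """
    for hour, price in enumerate(hourly_prices):
        if price > max_bid:
            return hour
    return None


def hours_run(hourly_prices: Sequence[float], max_bid: float) -> int:
    """Count the hours the instance runs before it is first interrupted."""
    cut = first_interruption(hourly_prices, max_bid)
    return len(hourly_prices) if cut is None else cut


# Hypothetical price history in USD per hour (made-up numbers).
prices = [0.03, 0.04, 0.05, 0.12, 0.04]

print(first_interruption(prices, max_bid=0.10))  # hour index of the interruption
print(hours_run(prices, max_bid=0.10))           # hours completed before it
print(first_interruption(prices, max_bid=0.20))  # None: never outbid
```

In this toy model a higher bid lowers the chance of interruption but does nothing about capacity shortages, which is why the failover concerns above still apply.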
Spot Instance Fears, and Their Organizational Effect
So what are the implications of users’ fears of Spot instances? For one, many CIOs will not put even a test environment on Spot, for fear that their applications are not sufficiently fault tolerant or able to handle unexpected interruptions. A CIO would also need people to develop failover processes or, even worse, ask system administrators to manually rebuild or reconfigure all or part of a dev/test environment, without real transparency into how much money was saved and at what cost. The CIO’s intuition is that the IT staff time required, plus the time dev/test operations are down, outweighs the potential cost savings of Spot. Consider an enterprise application running on a cluster of tens of instances that suffers a partial hardware failure. This scenario translates into a huge headache not only locally, but across the entire R&D and IT organization and the business.
In addition, developers and testers trying to do their day-to-day jobs obviously don’t want to worry that their environment can fail at any time. This can cause great frustration among an important and talented part of the workforce.
Other concerns include Spot instances not being supported in all AWS services (for example, CodeDeploy, Elastic Beanstalk, OpsWorks, RDS, ElastiCache, Redshift, and others), and price changes: users have no real control over changes in Spot instance pricing and do not know exactly how much they’re saving.
There is a Different Way
While Spot instances have their challenges, the benefits of using them are significant. Spot Instances let users purchase machines individually as needed and take advantage of spare Amazon capacity that others aren’t using, at a large discount (up to 90% in Amazon’s case). For example, Lyft recently revealed that it uses AWS Spot instances and saved up to 75 percent a month just by changing a few lines of code. And Inneractive recently used Spotinst’s services to guarantee capacity and service levels for its instances and to stop its jobs from being interrupted when Spot instances terminate. This helped the company maintain performance and keep its EC2-related costs down.
Spot’s advantage is its ability to act as a ‘stock market’ for spare computing power, and in doing so put to use the large amount of EC2 computing power that would otherwise go to waste. Users should seriously consider these and the other long-term benefits and ROI potential of Spot. Even traditional enterprises such as Novartis have been using Spot instances for years.
So what is the right evaluation process for Spot, and how can you get started without fear? As Gall’s Law states, ‘a complex system that works is invariably found to have evolved from a simple system that worked’. For example, you could set up a dev environment on Spot, get comfortable with how the process works, then scale up slowly until Spot makes up a certain percentage of your production environment. From there you could shift more and more capacity from On-Demand to Spot (a process that can be automated through Spotinst).
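As a rough illustration of that gradual rollout, the sketch below splits a fleet between Spot and On-Demand by percentage and estimates the blended hourly cost at each step. The function names, fleet size and prices are all hypothetical, chosen only to show the arithmetic; this is not how Spotinst or AWS implements the shift.

```python
from typing import Tuple


def split_fleet(total_instances: int, spot_percent: int) -> Tuple[int, int]:
    """Split a fleet into (spot_count, on_demand_count) for a given Spot percentage."""
    if not 0 <= spot_percent <= 100:
        raise ValueError("spot_percent must be between 0 and 100")
    # Integer arithmetic rounds the Spot share down to a whole instance.
    spot_count = total_instances * spot_percent // 100
    return spot_count, total_instances - spot_count


def hourly_cost(total_instances: int, spot_percent: int,
                on_demand_price: float, spot_price: float) -> float:
    """Estimate the blended hourly cost of a mixed Spot/On-Demand fleet."""
    spot_count, on_demand_count = split_fleet(total_instances, spot_percent)
    return spot_count * spot_price + on_demand_count * on_demand_price


# Hypothetical rollout: 20 instances, made-up prices in USD per hour.
for percent in (0, 20, 50, 80):
    counts = split_fleet(20, percent)
    cost = hourly_cost(20, percent, on_demand_price=0.10, spot_price=0.03)
    print(f"{percent}% Spot -> {counts}, about ${cost:.2f}/hour")
```

Tracking a number like this at each step also addresses the transparency concern raised earlier: the savings from every increase in the Spot share are explicit rather than guessed.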
Why Spot is the Cloud Future
Interest in Spot instances continues to grow. On average, AWS customers now use more computing capacity each week on Amazon EC2 Spot instances than customers in 2012 were running across all of Amazon EC2. And Spot for private clouds is also quickly approaching, as solutions like ours mature to give cloud Ops and R&D teams a simple way to distribute their computing capacity seamlessly, automatically balancing required performance against cost. For example, Docker containers and platforms like Kubernetes, Mesos and Rancher can help with workload mobility across instances and even clouds, helping ensure the uptime required for a specific workload or a complete service.
Finally, as the offerings of public cloud leaders such as AWS, Azure and Google mature, more and more capacity will sit idle, and market-driven compute pricing will become inevitable. Instead of investing in new data center capacity, Spot users can draw on almost unlimited AWS capacity for their workloads, grow and scale their cloud computing deployments, and achieve greater efficiency at highly competitive prices.