Anatomy of a Lightbits Cluster on AWS

How much do you really know about cloud block storage? Have you tried to get a handle on your cloud storage spending? I am a cloud storage junkie at home, but I never really looked at how much it costs to keep the storage part of a workload in the cloud. S3? Great, sign me up! Cost-efficient, linear pricing, and easy. Amazon Elastic Block Store (EBS)? Now we are in a space I needed to get up to speed on, and fast. Fast forward a few months: I’ve taken some Amazon Web Services (AWS) training and learned about my options.

EBS is a scalable, high-performance block storage service designed for Amazon EC2 (Elastic Compute Cloud). EBS volumes give you a persistent way to store your data. Here’s a link to the pricing page: https://aws.amazon.com/ebs/pricing/. It’s not a simple table, unfortunately. There are base costs, then upcharges for more space, lower latency, and more IOPS. There is a free tier, which is pretty nice for a single user, and I plan to get to know EBS a lot better in the coming year using it.

Most businesses, however, can’t get away with only 30 GB of data in their cloud operation. They have relational databases, like Oracle and Postgres, or are running any of the NoSQL databases out there. These are all high-performance databases that require high-performance storage. As a result, EBS costs make up a large portion of the cloud bill. Knowing your options here is very important. This isn’t a decision to make lightly, as you could quite easily wind up spending double what you need to, or more.

The menu for EBS is quite large. You can create anything from a cold storage HDD tier (let’s just assume S3 is off the table here) to io2, their most performant flash. I’ll go into a little more detail below.

First, you have to tell AWS how much space you want and how many nines of reliability you need, and they take it from there. EBS volumes are divided into two main categories:

  • HDD-backed storage for throughput-intensive workloads
  • SSD-backed storage for transactional workloads

SSD volumes come in two families: Provisioned IOPS SSD for demanding, latency-sensitive workloads and General Purpose SSD for everyday transactional workloads. HDD volumes also come in two families: Throughput Optimized HDD for frequently accessed data and Cold HDD for less frequently accessed data.
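For quick reference, the four families can be written down as a small lookup table. This is only an illustrative sketch: the keys are the volume type names AWS uses for these families (gp2, gp3, io1, io2, st1, sc1), while the table layout and the `types_for_media` helper are made up for this post.

```python
# EBS volume types grouped by backing media and family.
# The dictionary layout and helper name are illustrative, not an AWS API.
EBS_VOLUME_TYPES = {
    "gp2": ("SSD", "General Purpose SSD"),
    "gp3": ("SSD", "General Purpose SSD"),
    "io1": ("SSD", "Provisioned IOPS SSD"),
    "io2": ("SSD", "Provisioned IOPS SSD"),
    "st1": ("HDD", "Throughput Optimized HDD"),
    "sc1": ("HDD", "Cold HDD"),
}


def types_for_media(media):
    """Return the volume type names backed by the given media ("SSD" or "HDD")."""
    return sorted(name for name, (m, _family) in EBS_VOLUME_TYPES.items() if m == media)


print(types_for_media("HDD"))  # ['sc1', 'st1']
```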

EBS volumes attach to EC2 instances and persist their data. You pay for the provisioned capacity for as long as the volume exists, so the costs can add up considerably.

There are many ways to manage your EBS costs. I’m going to skim over a couple below using the native EBS tools, then I am going to talk about a better way. Don’t get me wrong,  AWS is a business partner of ours, and they have done a great job. You have dozens of choices in terms of the compute and storage instances that you pair. Choosing the right one is very difficult, and choosing the wrong one can be very costly.

Select the Right EBS Types

Each EBS volume type comes with its own pricing and performance level. To pick the right volume for a workload, you must consider the following factors:

  • Capacity
  • Latency
  • Input/output operations per second (IOPS)

Consuming EBS is really, really easy. It attaches naturally to your AWS compute instance. And for many workloads, this is the right answer. It’s just there, it’s tightly coupled with compute, and easy to consume.  There are also enterprise storage features available to you such as compression, thin provisioning, cloning, and snapshotting, but at an additional cost.  With native EBS you either pay the fees or you don’t get the features.

Then, some complexity comes in when you take more demanding workloads and try to move them to the cloud. It forces you to ask more questions. For each of the instances and database workloads you’re moving over, how many IOPS are required? How latency-sensitive is this specific workload? If you’re moving these en masse, knowing that about each instance separately is a big ask. Usually, you don’t know.
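One way to see how big that ask is: write the questions down per workload and list what is still unanswered. The sketch below is purely illustrative. The `WorkloadProfile` record, its field names, and the sample workloads are all hypothetical, not part of any AWS or Lightbits tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkloadProfile:
    """Hypothetical per-workload sizing record; None means "we don't know yet"."""
    name: str
    capacity_gb: Optional[int] = None
    required_iops: Optional[int] = None
    latency_sensitive: Optional[bool] = None


def missing_answers(profile):
    """Return the sizing questions that are still unanswered for a workload."""
    questions = ("capacity_gb", "required_iops", "latency_sensitive")
    return [q for q in questions if getattr(profile, q) is None]


# Made-up inventory for illustration only.
inventory = [
    WorkloadProfile("orders-db", capacity_gb=500, required_iops=8000, latency_sensitive=True),
    WorkloadProfile("reporting-db", capacity_gb=2000),
]

for workload in inventory:
    print(workload.name, missing_answers(workload))
```

In a real migration the second kind of entry, where only the capacity is known, tends to be the common case.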

What are your choices here? I’m going to use cars as an analogy, since most people can relate to them on some level. You can take a couple of approaches. One, you go and get the most expensive car because it’s the fastest and you need speed. The thing is, you end up spending a lot more money than you would on-prem for that same “performance”. This isn’t the most fiscally responsible way to go.

If you try to lift and shift your data protection and replication practices, you’re going to pay extra for the frequency of snapshots, for replication, and for all the things you’ve taken for granted with enterprise storage because they’re included on-prem: the compression, thin provisioning, cloning, and snapshotting I mentioned earlier.

Similarly, as you increase IOPS, that’s another factor that you’re charged for by AWS.
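To make the shape of that bill concrete, here is a minimal sketch of a monthly cost model with the three components discussed so far: provisioned capacity, IOPS above an included baseline, and snapshot storage. Everything in it is an assumption for illustration. The `Rates` fields, the function name, and especially the numbers are placeholders in arbitrary cost units, not AWS prices. Take real rates from the pricing page.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Hypothetical monthly rates in arbitrary cost units (NOT real AWS prices)."""
    per_gb: float           # cost per provisioned GB-month
    per_iops: float         # cost per provisioned IOPS-month above the included amount
    included_iops: int      # IOPS covered by the base price
    per_snapshot_gb: float  # cost per GB-month of snapshot data


def monthly_cost(rates, size_gb, iops, snapshot_gb=0.0):
    """Sum the capacity, extra-IOPS, and snapshot components of the bill."""
    extra_iops = max(0, iops - rates.included_iops)
    return (size_gb * rates.per_gb
            + extra_iops * rates.per_iops
            + snapshot_gb * rates.per_snapshot_gb)


# Placeholder rates, chosen only to make the arithmetic easy to follow.
rates = Rates(per_gb=1.0, per_iops=0.5, included_iops=1000, per_snapshot_gb=0.25)

print(monthly_cost(rates, size_gb=100, iops=1000))                   # 100.0
print(monthly_cost(rates, size_gb=100, iops=1200))                   # 200.0
print(monthly_cost(rates, size_gb=100, iops=1200, snapshot_gb=400))  # 300.0
```

With these made-up rates, the same 100 GB volume costs twice as much after a modest IOPS bump and three times as much once snapshots are added. The point is how the components stack up, not the figures themselves.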

If you take a more conservative route and try to right-size, the performance profile will change over time, and it usually increases.

Instead of a dynamic, elastic resizing of the underlying storage, you usually have to take the volume offline, swap in a more performant EBS volume under the covers, and reconnect it. This obviously affects data availability. It’s also pretty labor-intensive and disruptive.

As I said, AWS has done a great job, but choosing the right pairing of compute and storage is difficult, and it can be very time-consuming and costly. What would you say if I told you there is a better way?

We have put a software-defined SAN in the cloud on AWS. It gives you a lot: a feature-rich, incredibly fast NVMe/TCP storage platform you can connect your EC2 instances to, with all the enterprise storage features you use and need included in the software license.

I’d also like to point out that on the compute side, you don’t have to change a thing. Everything you’ve constructed in your VPC, in terms of the compute instances, the security, and the networking, remains the same. You’re simply pointing at a different storage target, which in this case is, you guessed it, Lightbits. We provide resilient, fast block storage on AWS.

How does it work? Well, you choose the EC2 instance that matches your workload’s performance requirements. Are you running databases on AWS? Maybe they are high-performance databases, and you can’t get performant enough storage at a price your CFO will tolerate. To be honest, you probably already have that part chosen, right? Here’s some good news: with Lightbits, you don’t have to change a thing.

We are software-defined storage that runs on a variety of EC2 instance types. Our architecture is a minimum of a three-node cluster, but within that three-node cluster, you can have fractional consumption, meaning you can take a quarter of those AWS instances, half, or all of them, or you can scale out if you want to grow.

So that elasticity, that non-disruptive growth or contraction of the Lightbits platform is something you don’t get natively with the cloud. You also get all those enterprise-class features you require like compression, thin provisioning, snapshots, and more built-in at no additional cost. So all those extra snapshots you want won’t blow your budget.

Also, the upgrades are behind the scenes, as is the healing. Essentially when you have a failed node and it gets replaced, it happens automatically. If AWS takes a node offline, as they sometimes do, the replacement of that node is completely transparent to you and happens behind the scenes.

We really want Lightbits to be a low-touch, cloud-like experience, like what you would get with native cloud features. We also want it to be economically palatable, which means you get predictable performance, scale, and cost.

The better way is here.

  • Up to half the storage cost vs. Amazon EBS
  • Up to 75 million IOPS
  • Consistent ultra-low latency
  • Predictable pricing. More IOPS won’t cost you more money.

 

graph

Which line on this graph do you want your budget to pay for? Yeah, us too. If you want to learn more, reach out to info@lightbitslabs.com.

 


About the Writer: