Author: Mathijs Dubbe, Senior Solution Consultant, Lightbits Labs Ltd.
Complex modern cloud-scale architectures running data-hungry applications have resorted to installing high-performance NVMe® Flash storage directly into their Database compute nodes. This Direct Attached Storage (DAS) model introduces expensive trade-offs: performance is increased but compute and storage are locked together, data services are limited, redundancy and availability are reduced, and the storage capacity is underutilized.
Database servers are often the most important workloads running in a data center. They hold much of the critical information a company needs to operate, and that data must be accessible at all times. Data loss due to hardware failures, or even simple downtime, can be catastrophic and cost a company its competitive advantage.
This blog explains how to simplify data center architecture, increase performance, and reduce downtime by running highly available, high-performance databases on top of VMware and Lightbits storage. First, an explanation of what Lightbits is, what it does, and what makes it uniquely suited to run this workload.
Lightbits Solution
Installed on standard commodity servers, Lightbits Software-Defined Storage (SDS) runs on standard Linux distributions. Lightbits is optimized to deliver high performance and consistent low latency for I/O intensive compute clusters, such as Cassandra, MySQL, MongoDB, FDB, and time-series databases.
Lightbits is a highly available, clustered, multi-NVMe SSD data platform and NVMe over TCP (NVMe/TCP) storage target that requires no changes to networking infrastructure or application server software.
Lightbits offers an NVMe/TCP block volume over standard TCP/IP networking, while at the same time maintaining the characteristics of direct-attached NVMe storage: high performance and low latency. Lightbits is designed to disaggregate storage from compute, which provides:
- Increased Return on Investment: Utilize all of the purchased storage through thin provisioning, and only buy additional capacity for the storage servers when it is actually needed.
- Reduced Total Cost of Ownership: Design application nodes as compute-only—simpler, smaller and less expensive. Leave the storage management to Lightbits, running on industry-standard X86 servers.
- Separate Storage and Compute: Only upgrade the Storage Server or the Application Server when needed. No need to bundle and waste storage or CPU capacity.
- Increased Application Performance: When the application updates millions of records, the Lightbits Intelligent Flash Management operating on a pool of flash with inline data reduction dramatically reduces tail latency.
- Open Storage Platform: Bring your own storage server and install Lightbits.
- Simple, Safe Data Management: Data is protected from SSD failures by a highly optimized Erasure Coding (EC) algorithm that works in concert with the high-performance Lightbits Intelligent Flash Management, streamlining data management across the pool of SSDs, increasing performance, and ensuring data is safe. Lightbits uses clustering to protect against failures of individual nodes and, together with replication, makes sure there is no data loss or application downtime.
- Persistent Storage: Offer persistent volumes to your workloads, making it easier to increase availability and move your workload around in virtualized and containerized environments.
Lightbits supports servers (clients) with a Linux operating system running a compatible kernel with in-box NVMe/TCP drivers, as well as Kubernetes, OpenStack, and VMware (7.0u3 and above).
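For a bare-metal Linux client, connecting to a Lightbits volume is done with the standard nvme-cli tooling over TCP. The sketch below is purely illustrative: the IP address, ports, and subsystem NQN are placeholders, and in Kubernetes, OpenStack, or VMware environments the respective drivers handle these connections automatically.

```bash
# Load the in-box NVMe/TCP module (included in recent mainline kernels)
modprobe nvme_tcp

# Discover the subsystems exposed by the storage target
# (8009 is the standard NVMe/TCP discovery port; address is a placeholder)
nvme discover -t tcp -a 10.10.10.11 -s 8009

# Connect to a discovered subsystem by its NQN (placeholder NQN)
nvme connect -t tcp -a 10.10.10.11 -s 4420 -n nqn.2016-01.com.lightbitslabs:uuid:example

# The volume now shows up as a regular block device, e.g. /dev/nvme0n1
nvme list
```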
Running Database Servers
Database servers can generally run on bare-metal, virtual, or containerized servers. Although bare-metal instances combined with Lightbits storage would offer the best performance and utilization, virtualization offers additional benefits in terms of manageability and availability of the compute infrastructure. This blog explores virtualizing database servers using VMware, making them highly available at the database, compute, and storage levels.
Most database server technologies have replication or synchronization options in a primary/secondary or standby configuration. These are implemented to ensure the data exists on a second database infrastructure and offer some level of data protection and availability. Most of the time, this process is not synchronous: there is either a set interval or it happens near real-time (i.e., as quickly as possible). Even so, it remains very important to have persistent storage to protect against data loss and keep the primary database server available as much as possible. This is where VMware and Lightbits come into play.
Running Databases on Lightbits Compared to DAS
There are several advantages to running your database on Lightbits volumes instead of direct-attached NVMe. At Lightbits, we are proud to maintain NVMe-level performance and latency similar to that of local NVMe drives in the application server. At the same time, utilization of the drives is far higher with centralized, NVMe-based network storage, while similar low latencies are maintained.
Utilization
Running NVMe locally attached delivers the lowest latency possible, but it comes with some significant downsides.
Generally, only up to 20% of the capacity is used when NVMe is deployed locally; the drives are larger than what is actually in use. On the performance side, applications typically cannot drive NVMe devices to their full potential and only consume part of the performance they offer. This means NVMe drives in the database server are generally underutilized on both fronts. By centralizing your NVMe storage on Lightbits servers, both of these problems go away.
Data Services
When running on local storage, some data services, such as thin provisioning or snapshots and clones, are not possible. Others, like compression, drain the server's CPU resources and therefore reduce overall database performance. Lightbits applies compression inline and, by doing so, often even increases performance, because less data has to be written and latency during replication operations is therefore reduced.
Running Lightbits as Storage for VMware
Starting with version 7.0 update 3 of VMware vSphere®, Lightbits is a VMware-supported and certified storage option, offering an end-to-end NVMe experience over standard TCP/IP networks. The Lightbits block device shows up as a volume inside of vCenter®, which can then be formatted with VMFS and used as a datastore for all connected vSphere hosts. Most large-scale environments benefit from the CLI or API availability within the Lightbits solution.
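For reference, the following sketch shows roughly how an ESXi host can be attached to an NVMe/TCP target from the command line. The adapter name, addresses, and NQN are placeholders, and the exact esxcli syntax can differ between vSphere releases, so treat this as an outline rather than a copy-paste recipe; the same steps can also be performed from the vSphere Client.

```bash
# Enable NVMe/TCP on a physical NIC; this creates a software NVMe/TCP adapter (vmhbaXX)
esxcli nvme fabrics enable -p TCP -d vmnic2

# Discover the subsystems exposed by the Lightbits cluster (placeholder address/port)
esxcli nvme fabrics discover -a vmhba65 -i 10.10.10.11 -p 8009

# Connect the adapter to a discovered subsystem (placeholder NQN)
esxcli nvme fabrics connect -a vmhba65 -i 10.10.10.11 -p 4420 -s nqn.2016-01.com.lightbitslabs:uuid:example
```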
There are multiple different deployment scenarios for running Lightbits to support such an environment.
- Everything in one data center
- Lightbits cluster(s) across multiple data centers (if latency and networking permit)*
* "Separate data centers" can also mean different rooms within one data center, separate power zones, fire zones, etc., depending on customer requirements and the infrastructure set-up.
One Data Center
When everything runs in one data center, highly available and high-performance databases can be achieved, but availability obviously depends on the infrastructure of that single data center. Even though the chances are low, power failures, network issues, fires, or evacuations due to outside influences do occur, and failures inside the data center are often caused by human error.
Lightbits across Multiple Data Centers
Lightbits clusters can be deployed in different ways across multiple data centers. The deployment options to consider depend on the number of data centers, the network capacity, and the latency between them.
For the sake of simplicity, this blog will focus on running within the same data center.
NVMe/TCP Initiator
vSphere, from 7.0u3 onwards, has the NVMe/TCP driver as part of the kernel, so no additional software is required for clients to connect to the Lightbits cluster. The driver maintains connectivity to all of the Lightbits cluster nodes that hold replicas of its volumes.
By inspecting the NVMe subsystem of a specific NVMe device, you can observe that it is an NVMe/TCP connection to three storage servers. All of those connections are live, but only one is "optimized", meaning it is the active (primary) server that serves all the I/Os. If that server fails, the driver seamlessly fails over to one of the other available servers holding a replica.
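As an illustration from a Linux client, the same multipath state can be inspected with nvme-cli; the subsystem NQN, addresses, and exact output layout below are placeholders and vary by nvme-cli version.

```bash
# Show the subsystem behind a specific NVMe device (illustrative output)
nvme list-subsys /dev/nvme0n1
#
# nvme-subsys0 - NQN=nqn.2016-01.com.lightbitslabs:uuid:<volume-uuid>
# \
#  +- nvme0 tcp traddr=10.10.10.11 trsvcid=4420 live optimized
#  +- nvme1 tcp traddr=10.10.10.12 trsvcid=4420 live inaccessible
#  +- nvme2 tcp traddr=10.10.10.13 trsvcid=4420 live inaccessible
```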
Example Architecture
This example architecture consists of a single Lightbits cluster running a minimum of three nodes. Each node is its own failure domain, meaning the cluster ensures that replicas are never stored on the same node. On top of the cluster, we run volumes configured with a replication factor of two (out of a maximum of three): the primary copy plus one replica (N+1), as sketched below.
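A volume like this could be created through the Lightbits CLI roughly as follows. This is a hypothetical sketch: the flag names and values are assumptions based on typical lbcli usage and may differ between releases, so consult the Lightbits documentation for the exact syntax.

```bash
# Hypothetical sketch: create a thin-provisioned volume with 2 replicas (N+1);
# flag names and values are assumptions and may differ per lbcli release
lbcli create volume \
    --name=db-datastore-01 \
    --size="4 TiB" \
    --replica-count=2 \
    --acl="ALLOW_ANY"
```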
The network is made redundant by using redundant switches and network interface cards (NICs). The NICs are configured with LACP, MLAG, or are otherwise bonded together in Linux.
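As an example, a minimal LACP (802.3ad) bond on a Linux node could be set up with NetworkManager as shown below; the interface names are placeholders, and the bonding options should match what the switch pair (MLAG/LACP) is configured for.

```bash
# Create an LACP (802.3ad) bond and attach two physical NICs (interface names are placeholders)
nmcli connection add type bond con-name bond0 ifname bond0 \
    bond.options "mode=802.3ad,miimon=100,xmit_hash_policy=layer3+4"
nmcli connection add type ethernet con-name bond0-port1 ifname ens1f0 master bond0 slave-type bond
nmcli connection add type ethernet con-name bond0-port2 ifname ens1f1 master bond0 slave-type bond
nmcli connection up bond0
```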
The following paragraphs describe the components in this architecture top to bottom.
Database Server
The database runs inside a virtual machine and is replicated to a second instance on different hardware. Using the Affinity/Anti-Affinity functionality within VMware, we ensure these database VMs never run on the same vSphere hypervisor, as illustrated below.
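One way to express this is a DRS anti-affinity rule, for example via VMware PowerCLI; the cluster and VM names below are placeholders and the snippet is only a sketch.

```powershell
# Sketch: keep the two database VMs on different hypervisors (names are placeholders)
$cluster = Get-Cluster -Name "DB-Cluster"
$vms     = Get-VM -Name "db-primary", "db-replica"

# -KeepTogether $false creates a DRS anti-affinity ("separate virtual machines") rule
New-DrsRule -Cluster $cluster -Name "db-anti-affinity" -KeepTogether $false -VM $vms -Enabled $true
```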
By replicating to another database instance, the database can protect itself from various failure scenarios. In this case, the replication mechanism protects against software failure of the database server itself, its operating system, filesystem errors, or any other software-related issue; basically, anything that happens within the VM's guest operating system. The database failover mechanism needs to be configured so that it only fails over in specific situations. It should not fail over automatically in every case, because that could interfere with the proper working of some of the other protection layers discussed in this solution.
The Hypervisor
Running VMware on top of Lightbits provides benefits over running a database on bare metal with direct-attached storage. A virtual machine (VM) in the VMware cluster can be configured to be highly available: if the hypervisor it runs on goes offline due to hardware failure, the VM is restarted on a different hypervisor. Note: this only works if the VM resides on a persistent datastore, as is the case with Lightbits volumes connected to the vSphere hypervisors. Downtime due to hardware failure is reduced to just the time it takes for the VM to boot back up on another available hypervisor (usually within 5-15 minutes).
Lightbits Cloud Data Platform
The Lightbits solution uses clustering technology to ensure the integrity of the cluster. In addition, it uses an Elastic RAID mechanism to protect storage servers from individual drive failures. It is a self-healing mechanism that does not require user interaction or disk replacement in order to become healthy again. The cluster has a minimum size of three servers in order to achieve quorum and to allow a replication factor of up to three.
From an availability point of view, the cluster has two functions: providing persistent storage so that the VMware High Availability functionality can work, and keeping that persistent storage available at all times. If the primary node for a specific volume fails, the connection fails over to a node holding a replica within a few seconds, and the servers remain online and functioning.
Additional Advantages
Aside from higher availability, adding Lightbits storage to a highly available virtualized database solution has the added benefits of:
- Applying compression without utilizing the database server CPU, thus not impacting database performance.
- Applying thin provisioning to make sure your database never runs out of space: large volumes can be deployed while only the space actually consumed is used.
- Maintaining NVMe's high IOPS by offering an end-to-end NVMe solution over TCP/IP, while at the same time keeping latencies as low as 160 microseconds.
- The flexibility of volume-level snapshots and clones, allowing the user to quickly revert to an earlier version, take snapshots before running updates, or clone a second instance for testing.
Conclusion
The Lightbits SDS solution enables disaggregation of storage and compute for Database workloads. It can reach the same transactional performance as DAS, despite all I/Os going over a standard data center network, without hurting average latencies, and often improving tail latencies. It can supercharge your cloud database applications and together with VMware provide:
- Improved service levels and a better user experience with reduced tail latency
- No changes to the TCP/IP network and no changes to application servers
- Increased flexibility and simplicity when running DBs in a virtualized infrastructure
- Increased DB uptime and availability, with quick recovery and failover options at every level
- Reduced TCO through better utilization of capacity and performance compared to DAS environments, and the ability to apply data services that do not impact DB performance
- Reduced complexity with similar or better performance and latency, by replacing costly FC, InfiniBand, or RDMA-based SAN solutions
No more managing a growing number of database servers, each with its own underutilized silo of direct-attached storage inhibiting a simple and efficient cloud-scale architecture. Add VMware to the mix to gain additional availability through VMware's availability features, while at the same time enjoying the simplicity of running virtualized.
Finally, separate storage from compute without all the drama.
For more information about the Lightbits solution, contact info@lightbitslabs.com.