How LightBits and OpenShift boost MongoDB performance in AWS

Blog republished courtesy of Red Hat: How LightBits and OpenShift boost MongoDB performance in AWS, Author: Vivien Wang, Engineering Partner Manager, August 29, 2023.

MongoDB is designed and built for cloud-native applications. It has become one of the fastest growing database technologies in a fast-growing market boosted by AI, e-commerce, big data, and more. MongoDB, ranked fifth overall for databases worldwide, is used by over half of the Fortune 100 and 19 of the 20 largest banks.

MongoDB is the number one non-relational NoSQL database, using document-oriented technology that makes the developer experience simpler and more productive, especially for modern containerized applications. Instead of storing data in rows and columns, MongoDB stores data inside a collection of documents that are more flexible and extensible. That makes it perfect for e-commerce product catalogs and user profiles, multimedia content management, business intelligence, and more.

This article presents and compares the results of testing MongoDB transactional performance in Amazon Web Services (AWS) with two storage options—LightBits vs. EBS io2 Block Express.

A natural fit with OpenShift

MongoDB’s cloud-native focus makes it a natural fit for Red Hat OpenShift. MongoDB handles large volumes of data with high performance and scales horizontally much more easily than relational databases. Likewise, OpenShift provides exceptional scaling and efficiency for cloud-based applications, whether deployed on private, public, or hybrid clouds. As an enterprise-focused container platform, OpenShift delivers superior orchestration, making it easy to deploy and manage thousands of containers efficiently.

MongoDB provides an enterprise operator for OpenShift environments that makes it easy to deploy MongoDB databases into containers managed by OpenShift, and you’ll still be able to use their Cloud Manager and Ops Manager for automation, monitoring, alerting, and backup.

Why did LightBits Labs certify on Red Hat OpenShift?

LightBits Labs chose to certify their container images on Red Hat OpenShift to ensure optimal performance of their applications within Kubernetes and OpenShift environments. By obtaining Red Hat Container Certification, LightBits Labs demonstrates their commitment to providing users with a secure, supported, and reliable container solution. This certification guarantees that LightBits’ applications will run effectively and consistently across all supported OpenShift platforms, including bare metal and cloud deployments. The certification process includes seamless integration with OpenShift, full containerization, continuous vulnerability scans, and access to collaborative support, reinforcing the trustworthiness and high-quality performance of LightBits’ certified offerings.

Why storage performance makes a difference

Choosing the right storage for your MongoDB databases is critical for transactional performance. When your application’s working set of indexes and most frequently accessed data exceeds available memory, disk performance quickly becomes the limiting factor for throughput. At the same time, it can be challenging to ensure that performance scales with user demand and doesn’t outpace your cost projections.

That’s where LightBits comes in. LightBits provides a complete data platform that accelerates application performance on any cloud running on bare metal, containerized, or virtualized environments. LightBits excels in delivering extreme performance at scale, making it possible for organizations to move storage-intensive database, analytical, transactional, and streaming applications to the cloud with confidence.

LightBits Labs, a Technology Partner within the Red Hat OpenShift ecosystem, is officially certified by Red Hat and generally available on OpenShift 4.10 and 4.11. The LightBits operator functions as a Kubernetes application designed to deploy and oversee the management of the LightBits CSI driver.

Amazon EBS vs LightBits cloud data platform

We will present the results of testing MongoDB transactional performance in AWS with two storage options: Amazon Elastic Block Store with io2 Block Express versus the LightBits cloud data platform. The TL;DR: LightBits outperforms EBS by an average of 30% for various workloads even though EBS is about 2.5x more expensive than LightBits.

Test setup for Yahoo! Cloud Serving Benchmark (YCSB)

Kubernetes: Since Kubernetes is one of the most common platforms to deploy, manage, and run MongoDB, we chose the recently released Red Hat OpenShift Container Platform (OCP) version 4.13. It is the only OCP version that has an EBS CSI driver that supports EBS io2 Block Express.

MongoDB: The MongoDB containers were based on the “latest” tag (which at the time resolved to 6.0.4). Each container had a request/limit of 12 “cpu” and 2Gi memory (I explain more on the container resource settings later).

Benchmark: To stress all the MongoDB instances and the storage, I used YCSB (Yahoo! Cloud Serving Benchmark). While written a few years back, it is still one of the best benchmarking tools for creating collections in MongoDB and then measuring performance by controlling how much data we read vs. how much data we update.

Performance information gathering: To run, monitor, and collect all the required information, I used the sherlock database performance tool.

Storage: This blog compares and evaluates two storage options in AWS — EBS io2 Block Express and the Lightbits Cloud Block Storage offering via the AWS Marketplace.

Let’s delve into some important aspects of storage used for MongoDB:

  1. The EBS io2 Block Express is the fastest cloud-native block storage solution available in AWS. With this type of storage you not only pay for the capacity but also have to decide upfront on the (provisioned) IOPS for each volume and pay a monthly fee per IO. While io2.bx has a limit of 1000 IOPS per GiB when creating a volume from the AWS console or via the AWS CLI/API, the EBS CSI implementation has a limit of 500 IOPS per GiB. This means, for example, that to get to 125,000 IOPS using the EBS CSI you need a volume of at least 250GiB. A 300GiB volume was used for all testing.
  2. Lightbits has no limits on IOPS per volume – you can impose limits via QoS. This implies that if your application requires a further increase in IOPS over time, there is no need to make any adjustments from the storage perspective. Moreover, you won’t incur any additional costs for higher IOPS, unlike with EBS.
  3. For the testing, I used Lightbits version 3.1.1 from the AWS Marketplace. I deployed the minimal configuration of 3 x i4i.16xlarge instances as the building blocks for the Lightbits cluster. You can also create a Lightbits cluster from various i3en instances and different i4i instances, going from a single NVMe device per instance to 8 per instance.
  4. For the Kubernetes worker (compute) nodes I used the R6in.8x instance which supports io2.bx. Other EC2 instances are restricted to only use the slower io2 EBS volume type (as opposed to io2.bx). Lightbits imposes no such constraints and any EC2 instance can use Lightbits NVMe/TCP volumes with only the network bandwidth as the throughput limit.
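The IOPS-per-GiB arithmetic in item 1 can be checked with a short calculation. This is only a sketch: the function name `min_volume_gib` is made up for illustration, and the inputs are just the ratios and the IOPS target quoted above.

```python
import math


def min_volume_gib(target_iops: int, iops_per_gib: int) -> int:
    """Smallest whole-GiB volume size that allows provisioning target_iops."""
    return math.ceil(target_iops / iops_per_gib)


TARGET_IOPS = 125_000   # provisioned IOPS used in the tests
CSI_RATIO = 500         # IOPS-per-GiB limit quoted for the EBS CSI driver
CONSOLE_RATIO = 1000    # IOPS-per-GiB limit quoted for the console/CLI/API

print(min_volume_gib(TARGET_IOPS, CSI_RATIO))      # 250
print(min_volume_gib(TARGET_IOPS, CONSOLE_RATIO))  # 125
print(300 * CSI_RATIO)                             # 150000: ceiling for a 300GiB volume
```

Under these figures, the 300GiB volumes used in the tests sit above the 250GiB floor for 125,000 IOPS through the CSI driver.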

 

Figure 1: A chart illustrating the Lightbits storage architecture.

Note that the diagram in Figure 1 excludes the OpenShift master nodes as they are not directly involved in storage access operations.

Below you can see how the cluster interacts with the EBS io2 Block Express architecture.

Figure 2: An illustration of EBS io2 Block Express architecture.

(Note: The diagram excludes the OpenShift master nodes, as they are not directly involved in storage access operations.)

Running the benchmarks

As shown in Figure 2, I ran in parallel a total of 6 MongoDB pods, 2 pods per worker node. For every MongoDB pod, a corresponding YCSB pod was deployed on the same node where the MongoDB pod is running, eliminating any out-of-node network traffic between the “client”/YCSB and the database. The Sherlock framework also deploys a tiny pod on each worker to collect statistics.

Each MongoDB database was populated using the load function in YCSB with a record count of 105 million documents, resulting in a database size of approximately 250GB (on the file system). This size allocation ensures sufficient space on the persistent volume claim (PVC) to accommodate data updates during the testing phase. With a limit of 2Gi memory on each MongoDB pod, most of the IO operations bypass the cache layers and hit the storage. The distribution used was uniform to make the read probability of each row as equal as possible.
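To see why so little of the data can be served from memory, here is a rough back-of-the-envelope sketch using only the figures above. It assumes the 250GB figure is decimal gigabytes, and it treats the entire 2Gi pod limit as cache, which overstates what the database can actually use, so the result is an upper bound.

```python
GIB = 1024 ** 3   # bytes in one GiB
GB = 1000 ** 3    # bytes in one GB (assumption: the 250GB figure is decimal)

RECORD_COUNT = 105_000_000      # documents loaded per database
DATASET_BYTES = 250 * GB        # approximate on-disk size per database
MEMORY_LIMIT_BYTES = 2 * GIB    # memory limit of each MongoDB pod

# Average on-disk footprint of a single document
avg_doc_bytes = DATASET_BYTES / RECORD_COUNT

# Upper bound on the share of the data set that could be held in memory,
# pretending the whole pod limit is available as cache
max_cached_fraction = MEMORY_LIMIT_BYTES / DATASET_BYTES

print(f"average on-disk bytes per document: {avg_doc_bytes:.0f}")       # 2381
print(f"at most {max_cached_fraction:.2%} of the data fits in memory")  # 0.86%
```

With a uniform request distribution every document is equally likely to be read, so a cache this small relative to the data set can satisfy only a small share of requests.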

We used 90 threads per YCSB run when connecting to each MongoDB pod. Each run was 30 minutes.

We used 4 types of YCSB runs, alternating the values of the readproportion and updateproportion variables to achieve a read-only run, update-only run, 70% reads 30% updates, and 50% reads 50% updates.

Each run type was executed 4 times.
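The run matrix described above can be sketched as a small generator of YCSB property sets. The exact property files used in the tests are not shown in this article, so treat this as an illustration only: the labels and helper names are hypothetical, and the property keys are the standard YCSB core workload settings as I understand them.

```python
RUN_MINUTES = 30            # length of each run
THREADS = 90                # YCSB threads per MongoDB pod
RECORD_COUNT = 105_000_000  # documents loaded per database
REPEATS = 4                 # each run type was executed 4 times

# (label, readproportion, updateproportion); labels are illustrative
MIXES = [
    ("read-only", 1.0, 0.0),
    ("update-only", 0.0, 1.0),
    ("read70-update30", 0.7, 0.3),
    ("read50-update50", 0.5, 0.5),
]


def workload_properties(read: float, update: float) -> dict:
    """Build the YCSB properties for one read/update mix."""
    return {
        "recordcount": RECORD_COUNT,
        "readproportion": read,
        "updateproportion": update,
        "insertproportion": 0,
        "scanproportion": 0,
        "requestdistribution": "uniform",
        "threadcount": THREADS,
        "maxexecutiontime": RUN_MINUTES * 60,  # seconds
    }


def render(props: dict) -> str:
    """Render a property set in key=value form, one property per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


# One entry per (mix, repetition): 4 mixes x 4 repetitions
plan = [
    (label, attempt, workload_properties(read, update))
    for label, read, update in MIXES
    for attempt in range(1, REPEATS + 1)
]

print(len(plan))  # 16
print(render(plan[0][2]))
```

Keeping the mixes in one table makes it easy to confirm that only readproportion and updateproportion change between run types, while thread count, duration, and distribution stay fixed.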

Performance test results

The results in the graph in Figure 3 show the averages from all the pods running in parallel. As you can see, the pods using the LightBits storage were approximately 30% faster per run, demonstrating that LightBits storage is more efficient than io2.bx.

Figure 3: A graphical illustration of YCSB throughput (ops/sec).

YCSB number of operations

The following table shows the average of each run for each type of storage. Since the throughput results shown in Figure 3 are derived from these numbers, we see the same behavior: LightBits storage provides 30% more performance, on average, than EBS io2 Block Express volumes.

 

A couple of notes about the tests:

  1. MongoDB was limited in the performance it could provide, in the sense that increasing CPU per pod (or cores per database) and increasing threads didn’t provide more performance.
  2. We used 125,000 provisioned IOPS for the io2 Block Express volumes since increasing the provisioned IOPS did not result in a performance gain.
  3. During the load phase of YCSB, when the YCSB pod generates random data and inserts it into the MongoDB collection, we noticed inconsistent behavior when io2 Block Express volumes were used. The average load time (the load is also done in parallel on all 6 pods/databases) ranged from 45 to 120 minutes. When loading the data using Lightbits volumes, the load time was consistent on all pods and on all runs at around 30 minutes. This behavior is most likely due to the EBS io2 Block Express design, where reaching a volume’s maximum performance can sometimes take up to 48 hours. No such behavior exists in Lightbits.

Storage cost comparison

Total cost of ownership (TCO) is a major factor when making a decision on which storage to use for your MongoDB databases, so let’s go over some cost calculations.

The calculations are based on using six databases, each with a 300GiB volume, running 24 hours for 30 days (basically a monthly cost). I have not included the pricing for OpenShift support from Red Hat, since it will be the same price whether you use LightBits or EBS.

Using us-east-2 as a sample region, the list price for running a LightBits cluster that consists of 3 x i4i.16xlarge instances for one hour is roughly $18, or $13,160 for one month.

Using EBS io2 Block Express with 125,000 provisioned IOPS will cost you $33,250, so roughly a 2.5x higher cost for using EBS io2 Block Express.

Since the compute pricing for both options will be the same (same number of worker nodes, same number of master nodes, same type of instances for both), if we add the compute price and look at cost per database per month, one MongoDB database will cost around $3,200 using the LightBits storage vs. $5,550 using EBS io2 Block Express. That’s a savings of $2,350 per month or $28,200 annually per MongoDB database.
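The figures above can be cross-checked with a few lines of arithmetic. This sketch uses only the list prices and per-database costs quoted in this section; the variable names are illustrative.

```python
HOURS_PER_MONTH = 24 * 30   # the 30-day month used for the calculations

lightbits_storage_monthly = 13_160  # USD, 3 x i4i.16xlarge LightBits cluster
ebs_storage_monthly = 33_250        # USD, io2 Block Express with 125,000 IOPS
per_db_lightbits = 3_200            # USD per database per month, compute included
per_db_ebs = 5_550                  # USD per database per month, compute included

# Hourly cluster price implied by the monthly figure
implied_hourly = lightbits_storage_monthly / HOURS_PER_MONTH

# How much more the EBS storage costs relative to the LightBits cluster
storage_ratio = ebs_storage_monthly / lightbits_storage_monthly

monthly_saving = per_db_ebs - per_db_lightbits
annual_saving = monthly_saving * 12

print(f"implied hourly cluster price: ${implied_hourly:.2f}")  # $18.28
print(f"EBS / LightBits storage cost: {storage_ratio:.2f}x")   # 2.53x
print(f"saving per database per month: ${monthly_saving:,}")   # $2,350
print(f"saving per database per year: ${annual_saving:,}")     # $28,200
```

The implied hourly price of about $18.28 matches the "roughly $18" list price, and the 2.53x ratio matches the "roughly 2.5x" figure.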

It is important to note that the size of the LightBits cluster used in this example would still have plenty of capacity to grow and can also provide storage to other applications running on your OpenShift cluster or outside the OpenShift cluster.

Another factor to consider is that using EBS snapshots costs money. You pay for the action of taking a snapshot (copying the data to S3) and for the capacity used. In LightBits, snapshots and clones are part of the license and are also instant (as in, no data copy is required). LightBits allows you to create as many snapshots as you need without any extra fee.

In EBS, restoring from a snapshot can take a long time because EBS snapshots are stored in S3. In LightBits, the action of restoring or cloning a snapshot is handled at the same performance rate as a normal volume.

LightBits outperforms io2 Block Express

LightBits emerged as the clear winner, surpassing io2 Block Express with 30% better performance, faster restore times, and substantial cost savings. These advantages can enhance your application responsiveness, optimize your budgets, and help you manage your data more efficiently.

To make a fully informed decision, we recommend evaluating your specific workload requirements, considering factors such as scalability, throughput, IOPS, restore times, and cost. We expect you’ll find that LightBits is the ideal solution for high-performance MongoDB deployments on AWS.

With LightBits, you can ensure success for your MongoDB applications and save on your cloud costs today, and know it can scale as your business grows. The LightBits cloud data platform is also available on Azure and for building private clouds.
