AMD vs. Intel and more: What’s new in the 2022 Cloud Report

The 2022 Cloud Report is finally here, and we think it was worth the wait!

In this completely free 70+ page report, you’ll find the results of over 3,000 OLTP test runs analyzing 56 different instance types across AWS, GCP, and Azure for performance and price, as well as hundreds of additional runs benchmarking CPU performance, network latency and throughput, and storage.

Want a little taste of what’s inside? For the first time ever, we saw machines with AMD chips eclipse Intel: instance types with the latest-gen AMD Milan CPUs took the top spots for both performance and price-for-performance in our OLTP benchmarking.

What’s new – the big picture

If we had to sum up what’s changed in this year’s version of the Cloud Report in a single word, it’d be: more.

This year, we tested more instance types – 56 in total, with 107 different overall configurations (since we tested multiple storage options for most instance types).

We tested more node sizes. In previous years, we’ve focused on instances with the same CPU core count, but this year, we wanted to dig even deeper to see whether core count affected performance for OLTP workloads (spoiler alert: it did). So we tested both smaller (8 vCPU) and larger (~32 vCPU) nodes, and shifted our focus to per-vCPU metrics that allow us to directly compare performance across different-size instances.
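To make the per-vCPU idea concrete, here is a minimal sketch of that kind of normalization. It is an illustration rather than the report's tooling: the function name `tpm_per_vcpu` and the throughput numbers are made up for the example and are not results from the report.

```python
def tpm_per_vcpu(tpm: float, vcpus: int) -> float:
    """Normalize a throughput figure (transactions per minute) by vCPU count."""
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    return tpm / vcpus


# Hypothetical numbers, for illustration only (NOT results from the report).
small = tpm_per_vcpu(tpm=8000.0, vcpus=8)     # an 8 vCPU node
large = tpm_per_vcpu(tpm=30000.0, vcpus=32)   # a 32 vCPU node

print(f"small node: {small:.1f} tpm per vCPU")
print(f"large node: {large:.1f} tpm per vCPU")
```

Dividing by vCPU count puts an 8 vCPU node and a 32 vCPU node on the same scale, so a larger node only looks better if each of its vCPUs does at least as much work.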

We ran more tests. A lot more. In fact, our OLTP benchmarking alone included more than 3,000 runs, as we tested each instance type multiple times across a variety of workload complexities to establish not only performance, but also the extent to which performance varied as workload complexity increased. The increased testing – it’s roughly 3X as many runs as last year – also allowed us to achieve a margin of error below 1.5%.

We also ran some tests for more time. For example, we increased our throughput tests from 60 seconds to 12 minutes to see whether we’d find anything interesting that wasn’t revealed in the 60-second tests from previous years (spoiler alert: we did).

We added more realism. For example, since CockroachDB is a durable application that executes an fsync at the end of each write transaction, we added fsyncs to our storage benchmarking to get a better idea of the read and write IOPS that real-world users could expect to see.
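For readers unfamiliar with fsync, the sketch below shows the general write-then-fsync pattern using Python's standard library. It is only an illustration of the system call's role, not CockroachDB's storage code or the benchmark we ran; the `durable_write` helper and the file name are hypothetical.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Append data, then block until the OS reports it has reached the device."""
    with open(path, "ab") as f:
        f.write(data)
        f.flush()              # move Python's user-space buffer into the OS
        os.fsync(f.fileno())   # ask the OS to flush its cache to storage


# Each "transaction" pays for its own fsync, which is why a benchmark
# without fsyncs can overstate the write IOPS a durable workload will see.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "wal.log")   # hypothetical file name
    durable_write(path, b"txn-1\n")
    durable_write(path, b"txn-2\n")
    size = os.path.getsize(path)

print(size)
```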

Altogether, it adds up to more depth. The Cloud Report has always been a deep dive into the realities of running OLTP workloads with AWS, GCP, and Azure. The 2022 Cloud Report is the deepest yet – at 78 pages, it’s more than twice the length of last year’s!

Don’t worry, though – it’s still completely free!

What’s new – the technical details

The biggest change we made this year was to our most important benchmark, the OLTP benchmark. We dramatically scaled up the number and variety of our OLTP runs. We also redesigned our TPC-C-derivative benchmark so that we could scale it at a fine granularity using a fixed load multiplier per vCPU. We’re calling the new benchmark Cockroach Labs Derivative TPC-C nowait.

As the name suggests, our benchmark is based on TPC-C. TPC-C is a standards-based benchmark from the Transaction Processing Performance Council that attempts to simulate a real-world logistics system, creating and taking virtual customer orders from initial receipt through the manufacturing process and then out for delivery. It imitates real-life user interactions with the system by limiting how fast warehouses move orders through the fulfillment process.

As warehouses are added to the system, both query complexity and system load increase. However, this means that the core metric of TPC-C (transactions per minute, i.e. the number of new orders that can be processed per minute while the rest of the workload is running) is not directly comparable across runs with different warehouse counts. Also, because of the simulated wait times, you need to be relatively close to over-saturating the database before differences between different cloud configurations become apparent.

In an effort to compare across instance types and instance sizes fairly, we attempted to separate scaling the number of transactions processed from the complexity of the workload by removing wait times from this year’s testing. This allowed us to get a better comparative signal across instances all running the same database.

Our testing with the Cockroach Labs Derivative TPC-C nowait benchmark used the following configuration parameters:

Wait times disabled (wait=0 in our test harness).
“Warehouse” count scaled per vCPU (50, 75, 100, 125, or 150 warehouses per vCPU). Since query complexity scales with the warehouse count, we picked a single fixed warehouse count (1,200 warehouses) for our direct performance comparisons between large and small nodes.
“Active Workers” count set equal to the “Warehouse” count.
Active connections at the load generator set to 4x the number of vCPUs, as per CockroachDB’s production guidelines.
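The parameters above can be sketched as a small calculation. This is a hedged illustration, not our test harness: `RunConfig`, `run_configs`, and the `total_vcpus` argument are hypothetical names, and we assume here that "vCPUs" means the vCPU count of the system under test.

```python
from dataclasses import dataclass
from typing import List

WAREHOUSES_PER_VCPU = (50, 75, 100, 125, 150)  # multipliers listed above
CONNECTIONS_PER_VCPU = 4                       # 4x vCPUs, as listed above


@dataclass(frozen=True)
class RunConfig:
    warehouses: int
    active_workers: int
    connections: int
    wait: int = 0  # wait times disabled


def run_configs(total_vcpus: int) -> List[RunConfig]:
    """Derive one configuration per warehouses-per-vCPU multiplier."""
    if total_vcpus <= 0:
        raise ValueError("total_vcpus must be positive")
    configs = []
    for per_vcpu in WAREHOUSES_PER_VCPU:
        warehouses = per_vcpu * total_vcpus
        configs.append(
            RunConfig(
                warehouses=warehouses,
                active_workers=warehouses,  # workers equal warehouses
                connections=CONNECTIONS_PER_VCPU * total_vcpus,
            )
        )
    return configs


cfgs = run_configs(total_vcpus=8)  # assumed vCPU count, for illustration
for cfg in cfgs:
    print(cfg)
```

Because every value is derived from the vCPU count, the same five load levels can be applied to small and large instances alike.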

We also collected CPU information for each node (including the number of NUMA nodes our vCPUs were running across) to determine whether or not we got identical nodes and to identify the impact that the NUMA node count had on performance (with some interesting results – see the report for more details).
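One way to gather this kind of CPU information on a Linux host is to read the "NUMA node(s)" line that `lscpu` prints. The sketch below assumes that approach; this post does not say which tool was actually used, and the helper names and sample text are hypothetical.

```python
import re
import subprocess


def parse_numa_nodes(lscpu_output: str) -> int:
    """Extract the NUMA node count from lscpu-style text output."""
    match = re.search(
        r"^[ \t]*NUMA node\(s\):[ \t]*(\d+)[ \t]*$", lscpu_output, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'NUMA node(s)' line found")
    return int(match.group(1))


def numa_nodes_on_this_host() -> int:
    """Run lscpu locally (Linux only) and parse its output."""
    out = subprocess.run(
        ["lscpu"], capture_output=True, text=True, check=True
    ).stdout
    return parse_numa_nodes(out)


# Abbreviated, made-up sample of lscpu output, for illustration only.
SAMPLE = """\
Architecture:        x86_64
CPU(s):              32
NUMA node(s):        2
"""
sample_nodes = parse_numa_nodes(SAMPLE)
print(sample_nodes)
```

Recording this value alongside each run makes it possible to check afterwards whether two instances of the same type were actually laid out the same way.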

Full details and reproduction steps for this benchmark (and all of the other benchmarks we ran) are available in the report.