I first learned of SpacetimeDB through their recent talk in CMU Database Group's Postgres vs The World Seminar Series. I wanted to learn more, so I went to their website and found this comparison to Convex. The first topic is performance.
SpacetimeDB can achieve more than 100,000 transactions per second in scenarios where Convex can reach only around 100.
This claim alludes to "scenarios" without providing any context or further detail. Around this time, Convex released a blog post clarifying their general position toward this type of performance analysis, and SpacetimeDB responded directly with an unapologetic retort, reading in part:
the article suggests that benchmarks are often apples-to-oranges [...]
They're right, it's not apples-to-anvils. It's more like...
See, there's one key point of comparison that both seem reluctant to address:
The three-order-of-magnitude performance improvement you get from SpacetimeDB comes at a fivefold increase in monthly data storage costs. This is a very good trade-off for many and a very bad trade-off for most. You will pay a premium (monetary and environmental) to park your data in RAM rather than on disk. That means if your data doesn't change very often, the hangar fees you pay to store it are likely to produce no benefit.
I've written before about the real-world damaging effects of misleading marketing in the database industry. A third-party analysis might be welcome here: these two competitors are accountable to intersecting sets of venture capital investors, a potential conflict of interest that can diminish candor and align incentives against customers who just want an honest assessment of trade-offs so they can make good decisions.
Nobody wants to hire someone like me to clean up business problems created by choosing the wrong tool or misusing the right tool, because in my experience those problems are no fun for anyone.
So let's get ahead of it.
For several years I operated platforms at moderate scale that ran third-party JavaScript in sandboxed server-side V8 isolates, talking to Postgres as a key-value store through an SDK built on Optimistic Concurrency Control. Performance comparisons like this one highlight a familiar set of challenges I've solved before, both hands-on and as a tech lead and manager.
We're going to break down the technical problem highlighted in this benchmark to its root, find the smoking gun, and frame it in a more rational, unbiased context.
1. Optimistic Concurrency Control
When two stateless clients want to modify the same data and that data lives in a centralized data store, they must coordinate to ensure they don't overwrite each other's changes. Pessimistic Concurrency Control involves acquiring exclusive locks on database rows (e.g. SELECT FOR UPDATE), but that doesn't scale well under high concurrency.
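For illustration, here's a minimal sketch of a pessimistic transfer against Postgres using the node-postgres (pg) client. The accounts table and column names are hypothetical:

```typescript
import { Client } from "pg";

// A minimal sketch of a pessimistic transfer, assuming a hypothetical
// accounts(id, balance) table. FOR UPDATE takes exclusive row locks, so
// any concurrent transfer touching the same rows blocks until we finish.
async function transferPessimistic(
  client: Client,
  from: string,
  to: string,
  amount: number
): Promise<void> {
  await client.query("BEGIN");
  try {
    // Lock both rows up front. Locking in a consistent key order
    // (sorted ids here) helps avoid deadlocks between transfers.
    for (const id of [from, to].sort()) {
      await client.query(
        "SELECT balance FROM accounts WHERE id = $1 FOR UPDATE",
        [id]
      );
    }
    await client.query(
      "UPDATE accounts SET balance = balance - $1 WHERE id = $2",
      [amount, from]
    );
    await client.query(
      "UPDATE accounts SET balance = balance + $1 WHERE id = $2",
      [amount, to]
    );
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

Every concurrent transfer touching the same account queues behind those locks, which is exactly why this approach degrades as concurrency grows.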
With OCC, each client optimistically assumes it's the only one operating on a set of data (the "read set") and lets the database reject any transaction where that assumption proves incorrect. Upon rejection, the client waits before retrying, and with enough retries it usually succeeds.
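To make the retry loop concrete, here's a minimal OCC sketch. This is not Convex's SDK; the versioned store interface and its compareAndSwap method are hypothetical stand-ins:

```typescript
// A minimal OCC sketch, not any particular SDK. Assumes a hypothetical
// versioned key-value store: reads return a version, and a write only
// succeeds if the version is unchanged since the read.
interface Versioned<T> {
  value: T;
  version: number;
}

interface VersionedStore {
  get(key: string): Promise<Versioned<number>>;
  // Resolves false if `key` was modified after `expectedVersion` was read.
  compareAndSwap(
    key: string,
    expectedVersion: number,
    value: number
  ): Promise<boolean>;
}

async function incrementWithRetry(
  store: VersionedStore,
  key: string,
  delta: number,
  maxRetries = 10
): Promise<void> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const { value, version } = await store.get(key);
    // Optimistically assume nobody else touched `key` since our read.
    if (await store.compareAndSwap(key, version, value + delta)) return;
    // Collision: back off (exponentially, with jitter) and retry.
    const backoffMs = Math.random() * 2 ** attempt * 10;
    await new Promise((resolve) => setTimeout(resolve, backoffMs));
  }
  throw new Error(`OCC: gave up on ${key} after ${maxRetries} retries`);
}
```

Every failed compareAndSwap is wasted work plus a backoff delay, which is why collision rate matters so much in the numbers below.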
2. The Scenario
Money transfers between accounts. That's the basis for the benchmark. Alice wants to send $1 to Bob. Charlie wants to send $1 to Doris. Eve wants to send $1 to Frank. That can all happen in parallel.
OCC excels in workloads like this where there is little to no overlap between concurrent transactions. But what if there is overlap? What if Frank is running a lemonade stand and everyone gets thirsty at the same time?
Now Frank's account balance is experiencing data contention. Each of these transactions must be performed sequentially to ensure that money isn't accidentally created or destroyed.
The first case exhibits 0% data contention and the second 100%. These are two ends of a spectrum that the benchmark models as a Zipf distribution with a configurable --alpha.
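To get a feel for what --alpha does, here's a rough sketch of a Zipf sampler using inverse-CDF sampling; the benchmark's actual implementation may differ:

```typescript
// Rough sketch of Zipf-distributed account picking: P(rank k) ∝ 1 / k^alpha.
// The benchmark's actual sampler may differ.
function zipfSampler(n: number, alpha: number): () => number {
  // Precompute the cumulative distribution over ranks 1..n.
  const weights = Array.from({ length: n }, (_, i) => 1 / (i + 1) ** alpha);
  const total = weights.reduce((a, b) => a + b, 0);
  const cdf: number[] = [];
  let acc = 0;
  for (const w of weights) {
    acc += w / total;
    cdf.push(acc);
  }
  // Invert the CDF with a uniform draw to pick a 0-based account index.
  return () => {
    const u = Math.random();
    return cdf.findIndex((c) => u <= c);
  };
}

// With alpha=0 every account is equally likely (~no contention); with
// alpha=1.5 a handful of "Frank" accounts absorb most of the transfers.
const pick = zipfSampler(1000, 1.5);
const counts = new Map<number, number>();
for (let i = 0; i < 100_000; i++) {
  const a = pick();
  counts.set(a, (counts.get(a) ?? 0) + 1);
}
console.log([...counts.entries()].sort((x, y) => y[1] - x[1]).slice(0, 5));
```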
So let's run the benchmark with the default (--alpha=1.5) against our free S16 Convex instance and see the results.
Well, that's... bad. How about --alpha=0 (no data contention)?
Well, that's... 20x better. The big difference maker is the collision_rate (89.8% vs 1.6%).
So why did SpacetimeDB choose this "scenario"? What is the reason for using a Zipf alpha of 1.5? In the absence of any defined methodology for the benchmark, we can only assume the obvious:
That's your smoking gun. This "scenario" is not based on scientific analysis of real customer workloads or tied to any workload in particular; it's a fictitious scenario invented to maximize the contrast between the performance of the two architectures.
3. The Dominant Strategy
DevOps is a philosophical movement promoting empathetic collaboration to better integrate developers with the operational reality of production environments. Platforms as a Service give developers feedback about the performance of their apps, and when things go wrong, the culprit is usually an architectural design choice well within the developer's control.
No matter what database you use, you will always be able to devise a workload for which it is not well suited. If you don't enforce a schema for your data, you're going to regret it later. If you store all your data under a single key as a giant JSON object, you will have problems scaling to thousands of users.
These lessons have always had to be learned. Modifying your workloads to align with the capabilities of the environment is, and always has been, the way out of 95% of these problems. For the other 5%, there are platforms optimized for those sets of problems.
No database will ever solve every use case optimally.
Companies that leverage multiple platforms by embracing their relative advantages will deliver more valuable outcomes for users and win their markets.
Databases are a positive-sum game.