How We Built a Vectorized SQL Engine
CockroachDB is an OLTP database, specialized for serving high-throughput queries that read or write a small number of rows. As we gained more usage, we found that customers weren't getting the performance they expected from analytic queries that read many rows, like large scans, joins, or aggregations. In April 2018, we started to seriously investigate how to improve the performance of these types of queries in CockroachDB, and began working on a new SQL execution engine. In this blog post, we use example code to discuss how we built the new engine and why it results in up to a 4x speed improvement on an industry-standard benchmark.

OLTP databases, including CockroachDB, store data in contiguous rows on disk and process queries one row of data at a time. This pattern is optimal for serving small queries with high throughput and low latency, since the data in a row is stored contiguously, making it efficient to access multiple columns from the same row. Modern OLAP databases, on the other hand, tend to be better at serving large queries: they store data in contiguous columns and operate on those columns using a technique called vectorized execution. Vectorized processing makes more efficient use of modern CPUs by changing the data orientation from rows to columns and operating on batches of values at a time, which gets more out of the CPU cache and deep instruction pipelines.

In our research into vectorized execution, we came across MonetDB/X100: Hyper-Pipelining Query Execution, a paper that outlines the performance deficiencies of the row-at-a-time Volcano execution model that CockroachDB's original execution engine was built on. When executing queries on a large number of rows, a row-oriented execution engine pays a high interpretation and evaluation cost per tuple and doesn't take full advantage of the efficiencies of modern CPUs.

Given the key-value storage architecture of CockroachDB, we knew we couldn't store data in columnar format, but we wondered: if we converted rows to batches of columnar data after reading them from disk, and then fed those batches into a vectorized execution engine, would performance improve enough to justify building and maintaining a new execution engine? To quantify the performance improvements, and to test the ideas laid out in the paper, we built a vectorized execution engine prototype, which yielded some impressive results.

In this tutorial-style blog post, we take a closer look at what these performance improvements look like in practice. We also demonstrate why and how we use code generation to ease the maintenance burden of the vectorized execution engine. We take an example query, analyze its performance in a toy, row-at-a-time execution engine, and then explore and implement improvements inspired by the ideas proposed in the MonetDB/X100 paper. The code referenced in this post resides in https://github.com/asubiotto/vecdeepdive, so feel free to look at, modify, and/or run the code and benchmarks while you follow along.
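To make the layout difference concrete, here is a minimal, illustrative sketch in Go (not code from CockroachDB or the linked repo): a row-oriented struct next to a columnar batch, with a discount applied in one tight loop over the price column.

```go
package main

import "fmt"

// Row-oriented layout: each tuple's columns sit adjacent in memory,
// which is ideal for fetching all columns of a single row.
type invRow struct {
	itemID int64
	price  float64
}

// Column-oriented batch: all values of one column sit adjacent, so an
// operator can process the whole batch in a single pass.
type invBatch struct {
	itemIDs []int64
	prices  []float64
}

// applyDiscount multiplies an entire price column by a constant. The
// loop body is simple enough for the compiler and CPU to keep the
// instruction pipeline full, with no per-row dispatch overhead.
func applyDiscount(prices []float64, k float64) {
	for i := range prices {
		prices[i] *= k
	}
}

func main() {
	batch := invBatch{
		itemIDs: []int64{1, 2, 3},
		prices:  []float64{12, 50, 7.5},
	}
	applyDiscount(batch.prices, 0.8)
	fmt.Println(batch.prices) // [9.6 40 6]
}
```

The per-batch cost of the function call and loop setup is amortized across every value in the column, which is the core intuition behind the MonetDB/X100 design.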
What's in a SQL operator?

To provide some context, let's look at how CockroachDB executes a simple query, SELECT price * 0.8 FROM inventory, issued by a fictional retail customer who wants to compute a discounted price for each item in her inventory. Regardless of which execution engine is used, this query is parsed, converted into an abstract syntax tree (AST), optimized, and then executed. The execution, whether distributed amongst all nodes in a cluster or executed locally, can be thought of as a chain of data manipulations that each have a specific role, which we call operators. In this example query, the execution flow would look like…
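Conceptually, that flow is a table scan feeding a projection. As a sketch of what such a chain might look like in code, here is a toy, row-at-a-time implementation in Go; the Operator interface and the scanOperator and mulOperator types are illustrative names of ours, not the actual code from the repo linked above.

```go
package main

import "fmt"

// Row is a tuple of column values; for this query, just the price column.
type Row []float64

// Operator is the Volcano-style interface: each call to Next pulls one
// row from the operator below it, or returns nil when input is exhausted.
type Operator interface {
	Next() Row
}

// scanOperator emits the rows of an in-memory "inventory" table one at a time.
type scanOperator struct {
	rows []Row
	idx  int
}

func (s *scanOperator) Next() Row {
	if s.idx >= len(s.rows) {
		return nil
	}
	r := s.rows[s.idx]
	s.idx++
	return r
}

// mulOperator multiplies column 0 of every input row by a constant,
// implementing the projection price * 0.8.
type mulOperator struct {
	input Operator
	arg   float64
}

func (m *mulOperator) Next() Row {
	r := m.input.Next()
	if r == nil {
		return nil
	}
	r[0] *= m.arg
	return r
}

func main() {
	scan := &scanOperator{rows: []Row{{12}, {50}, {7.5}}}
	discount := &mulOperator{input: scan, arg: 0.8}
	for r := discount.Next(); r != nil; r = discount.Next() {
		fmt.Println(r[0]) // 9.6, 40, 6
	}
}
```

Each call to Next on the projection pulls one row from the scan, applies the multiplication, and passes the row along. This per-row, interface-mediated control flow is exactly the Volcano-model overhead that the vectorized engine is designed to amortize.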
This article first appeared on cockroachlabs.com.