From Heuristics to Statistics: How we built a cost-based optimizer

January 16, 2020
Speaker: Rebecca Taft

Using a declarative language like SQL with a relational database has a major benefit: it separates the logical representation of the database from its physical representation. That separation lets us change the physical layout of the data without changing the queries or applications that use the database.

But it also means we need to bridge that gap somehow, and that’s where a query optimizer comes in. Query optimizers use statistics to find the lowest-cost alternative among several possible execution plans. In this talk, CockroachDB engineer Rebecca Taft will walk through the implementation and limitations of a heuristic planner and the development of CockroachDB’s cost-based query optimizer. She will explore a couple of examples that use multi-column stats and histograms, as well as examples in which the data is distributed across thousands of miles.

