For the first two years of CockroachDB’s existence, the Windows installation instructions read like this:
- Install Docker.
- Get the latest CockroachDB Docker image: docker pull cockroachdb/cockroach
The instructions were deceptively short, considering that they amounted to downloading a supported operating system (Linux), booting it in a hypervisor, and running CockroachDB inside of that virtual machine.
The layers of indirection in this scheme caused no shortage of problems. Docker containers excel at isolating a process from the outside world, but a distributed database like CockroachDB needs just the opposite: each node needs to communicate over the network with other nodes in the cluster and write persistent data to the filesystem. Fat-finger the docker run flags and you’d wind up with a CockroachDB node that writes its data to an ephemeral directory that would be deleted along with the container. Or perhaps you’d forget to expose the necessary ports and create three isolated one-node clusters instead of one three-node cluster. Even properly configured clusters would occasionally run into what could only be explained as bugs in Docker itself, or at least details of the Windows filesystem leaking through the Docker environment.
Suffice it to say that CockroachDB on Windows was not a seamless experience. We viewed Docker on Windows as a last-resort solution, and were hopeful that most prospective users of CockroachDB ran macOS or Linux instead of Windows. After all, we weren’t seeing that many bug reports from Windows users.
Or so we thought until earlier this year. In mid-March, we sent a contingent from Cockroach Labs to present at nwHacks, and they reported back some surprising news: the majority of the students they met at the hackathon were running Windows! Unsurprisingly, most of these students got stuck trying to install CockroachDB.
We had failed to consider that for every user who reported an issue with installation, potentially dozens more had encountered the same issue and given up silently.
So we decided to make the Windows development experience a company priority and invested significant engineer time in producing a binary that could run natively on Windows, cutting Docker out of the process entirely. By May, we’d succeeded: every release of CockroachDB since v1.0-rc.1 has shipped with a precompiled binary for Windows 8[^1] and later. CockroachDB on Windows is now as simple as downloading an executable and running it.
Full disclosure: Windows binaries are not rigorously stress-tested like the macOS and Linux binaries, and are provided as a convenience for local development and experimentation. Production deployments of CockroachDB on Windows are strongly discouraged!
Getting to this point, however, was anything but simple. Nearly every layer of CockroachDB made some assumptions that were invalid on Windows machines.
First, some background. CockroachDB is written in Go, and one of the best things about the Go programming language is its first-class support for Windows. For many Go programs, Windows support is as simple as running go build on a Windows machine. Indeed, many of the Go packages that make up CockroachDB immediately compiled cleanly on Windows. For example, the heart of our SQL execution engine, pkg/sql, worked on Windows on the first try. Why? The SQL engine makes no assumptions about the machine: it simply accepts SQL strings as input, and instructs the underlying key–value store to fetch the necessary data.
Unfortunately, the SQL engine is entirely unusable without the layers beneath it in the stack—like the key–value store—and the further you descend in the stack, the more you interface with the operating system. At the lowest layer, we link in a C++ storage engine, RocksDB, which employs all manner of tricks to squeeze out performance. (It has its own memory allocator and file I/O library, for example.)
RocksDB made available an experimental Windows port in 2015, so the hard work of abstracting away the differences between Linux and Windows APIs was already complete. The work was funded by Microsoft, however, and the Windows port only supported using Microsoft’s proprietary compiler, Microsoft Visual C++ (MSVC).
We’re committed to open source at Cockroach, so we wanted to make sure that our users could build from source on Windows using free, open source technologies. The MSYS2 distribution platform makes this possible by providing Windows versions of many open-source Linux tools, like GCC, Make, and Bash.[^2] In theory, this means a build system that works on Linux will “just work” on Windows. Where you would have to adapt compiler flags for Microsoft’s compiler, you simply use MSYS2’s GCC. Where you’d have to rewrite a Bash script in PowerShell or Batch, you can simply use MSYS2’s Bash.
Of course, reality is never so simple, and MSYS2’s packages don’t perfectly emulate their Linux counterparts. We had to submit several patches upstream to RocksDB to make their build system compatible with MSYS2.[^3] One particularly amusing case was discovered while we were trying to figure out why a test took nearly five times longer to run on Windows than Linux. The answer? Building RocksDB in an MSYS2 environment did not turn on compiler optimizations. Whoops!
Impressively, our three other C and C++ dependencies—the Snappy compression library, the jemalloc memory allocator, and Protocol Buffers—all compiled without a hitch in an MSYS2 environment. We owe those project maintainers a debt of gratitude for their cross-platform compatibility efforts, as well as the RocksDB maintainers for testing and accepting our patches.
With our dependencies in line, it was time to turn our focus to our own product and remove all the Unix assumptions we’d baked into CockroachDB over the years. The broad classes of fixes we made are listed below, along with some examples. Most of the incompatibilities were textbook examples of platform-incompatible code, but are useful empirical evidence of where it’s easy or tempting to subvert Go’s platform abstractions.
- Avoiding hardcoded file paths. The null device, which silently discards all data written to it, is spelled /dev/null on Unix but NUL on Windows. The temporary directory is $TMPDIR on Unix, but %TMP% or %TEMP% on Windows. Go conveniently provides os.DevNull and os.TempDir(), which always point to the right places (ae3e74c, bee33e3).
- Avoiding symlinks. Windows, by default, requires administrator permission to create symlinks. It’s usually not worth asking users to change the system configuration or to run the binary with administrator permissions when symlinks are easily avoided (4aeef50, 34f8629).
- Using raw signals and syscalls carefully. Go allows you to make raw system calls, but nearly everywhere we did so needed to be wrapped in an if !windows check (0011737, 2555400, 4b91617).
- Baking in some Windows assumptions. It was worth baking in some Windows-specific tweaks to keep our users happy. Windows machines don’t ship with a way to decompress tarballs, so we taught our build system to package CockroachDB in a Zip archive instead (bb60bab). Most Windows users also expect executables to end in .exe, even though it’s technically possible to execute an unsuffixed executable file, so we special-cased the suffix in the build system (dcd183e).
In the process, we even bumped up against a few shortcomings in Go’s support of Windows, including a broken os.Stat (2253a66) and a missing API (1e4778a). Both of those issues are happily fixed in the latest release of Go.
With all those fixes in place, it was now possible to build Cockroach on Windows! Unfortunately, even a simplified version of the installation process read like this:
- Download MSYS2 and click through the installer
- Install GCC, Bash, Make, and Git via MSYS2’s pacman
- Install Go 1.9
- Build a RocksDB static library from source
- Build a jemalloc static library from source
- Build a Snappy static library from source
- Build a Protocol Buffers static library from source
- Move the static libraries into a hardcoded location in your $GOPATH
- go get -u github.com/cockroachdb/cockroach
- cd github.com/cockroachdb/cockroach && make build
Far too complicated! On other platforms, we would automatically build the right versions of our C and C++ dependencies from source; on Windows, you had to figure out how to build them yourself. This was annoying for our users and, more pressingly, meant that we couldn’t automatically build Windows binaries for our releases.
The underlying problem was that go build could not use the MSYS2-related fixes we’d made to RocksDB’s build system. RocksDB ships nearly 1000 lines of CMake to determine the proper compiler and linker flags, with support for many edge cases that have been discovered over the years. go build, however, has no support for integrating with other build systems, or even dynamically determining compilation flags. We’d had to throw out RocksDB’s build system and hardcode one set of compiler flags per platform. We’d survived with a set of hardcoded compilation flags that worked on most Linux machines and another set that worked on most macOS machines, but adding a third set that worked on most Windows machines was taking it too far.
So we did as many large Go projects do: we abandoned go build in favor of a tried-and-true tool, GNU Make (#14840). To build Cockroach, you must instead run make build. Our Makefile checks to see whether you have an up-to-date build of RocksDB (and the other C dependencies); if you do not, we simply invoke RocksDB’s native build system, which works out how to build on your platform. Only after all C dependencies are up-to-date does Make invoke go build. The downside is that you can no longer run go get github.com/cockroachdb/cockroach to build Cockroach or vendor it in your project. The upside, of course, is that we now have Windows binaries.[^4]
So please, if you use Windows, download the binary, take CockroachDB for a spin, and let us know how it goes!

[^1]: Versions of Windows before Windows 8 did not provide access to the precise timekeeping that CockroachDB needs.

[^2]: MSYS2 is a lighter-weight version of Cygwin and the successor to the seemingly defunct MinGW project.

[^3]: The patches were, in no particular order: #1910, #2051, #2097, #2107, #2161, #2315.

[^4]: The build system overhaul additionally fixed most, if not all, compilation errors on FreeBSD, OpenBSD and IllumOS. These platforms are not supported by Cockroach Labs, but producing a working CockroachDB binary on any of the three should require only a straightforward translation of the Linux “Build from Source” installation instructions.
Illustration by Ayesha Rana