Early Days at Google & Building CockroachDB with Peter Mattis

Peter Mattis

Co-Founder and CTO of Cockroach Labs

“I guarantee you no CIO will ever write a check to Cockroach Labs. You have to change the name.”

Peter Mattis, Co-founder and CTO at Cockroach Labs, was given that unsolicited advice by an audience member at a conference in 2018. Five years later, Cockroach Labs has cashed plenty of checks signed by CIOs from some of the biggest companies on the planet. In this conversation, Peter goes way under the hood to talk about how CockroachDB is architected, how that architecture has evolved, and how it will continue to evolve to make developers’ lives easier. Join us as we discuss:

  • The origins of GIMP and early days at Google
  • How Peter built Pebble & how it compares to RocksDB
  • Lessons learned building Postgres compatibility
  • What the future of “serverless” databases will look like

Tim Veil:

Well, welcome to another edition of Big Ideas in App Architecture. Today, I am very excited to have on the show the co-founder and CTO of Cockroach Labs, Peter Mattis. Peter, welcome to the show. Before we get kind of into the thick of architecture, and big ideas, and all this other stuff, I always find it super interesting to start with kind of your history. How in the world did you come to found and start this very interesting company? Maybe if you can just start by telling us a little bit about you, a little bit about your background, and then we’ll kind of slowly get into the history of Cockroach Labs and all things cockroaches.

Peter Mattis:

All right. Sounds good, Tim. Well, thanks for having me. Glad to be here. My history goes way, way back. I’ve been in industry going on 25 years now. Something like that. You know, I got my undergraduate degree at Berkeley. While at Berkeley, I actually met my co-founder, our co-founder, Spencer Kimball. He was my roommate. He was actually my brother’s roommate first, and then he was my roommate. We quickly understood that we were like-minded. We loved software and whatnot. Back to my undergraduate degree. I actually got started in graphics. Graphics was my entry drug into computers. You know, kind of gaming, but more on the graphics side. Spencer and I started this little program that got a bunch of traction afterward called GIMP. I did graphics all through my undergrad days. Then I graduated, and I was kind of tired of graphics, and I moved away from it. I joined a startup one of my professors had founded called Inktomi. They were an early search engine. They eventually got bought by Yahoo. Earliest days at Inktomi, I was involved in doing network programming, distributed systems programming. Not really stuff I’d done much of in college, but I really liked it. It was a different challenge. From there, bounced around to a couple of other startups, and then eventually landed at Google. I was at Google for just shy of a decade, and worked on a lot of cool distributed systems problems there. I was an early architect of the Gmail search and indexing back end, then worked on distributed file systems, worked on their google3 build system, which now is known externally as Bazel. Internally, inside Google, it’s still known as Blaze. Got deep exposure to a lot of the big distributed systems problems Google was tackling and how they were tackling them. Then, eventually, I decided my time at Google was up. I moved on. Spencer and I actually did a startup before Cockroach Labs called Viewfinder. It was a mobile photo sharing site. Think Instagram. Think Snapchat. But we didn’t have the secret sauce. We were getting started right at the same time. We could’ve had success. We didn’t have success. We’re not consumer guys. We’re back end engineers. It was at Viewfinder that we actually had the original idea for CockroachDB. I mean, what essentially happened is we got outside of Google, and we looked around at the ecosystem that was out there. We’re like, “What can we use for back end storage for these photos and messages in our mobile photo sharing site?” We weren’t terribly fond of what we saw. You know, Cassandra was out there. HBase was out there. Riak was out there. We’re like, “Well, let’s sketch the design for how we would do this.” We did, and then we decided, “No, no, no. This is a complete tangent from what we’re doing at this startup,” so we put it on the shelf. Then that startup floundered. We got acqui-hired by Square. Inside Square, we saw the exact same problems, the need for a distributed, horizontally scalable database. Kicked around some ideas at Square, and eventually left Square to form Cockroach Labs. I was in there from the beginning with Spencer, and then we had our other co-founder, Ben, coming along. That’s it in a nutshell.

Tim Veil:

When did y’all meet Ben in that journey? Was that at Google, or was that before?

Peter Mattis:

That was at Google.

Tim Veil:

Okay.

Peter Mattis:

I started at Google in April 2002. I think it was April 1st of 2002. I remember that because April Fool’s Day. As soon as I got there, I was like, “Wow. This place is awesome.” I convinced Spencer, “You have to join.” Helped get Spencer interviewing. Helped get him his job, so he owes me big time for that.

Tim Veil:

What did he want to do? He didn’t want to?

Peter Mattis:

He was kind of on the lam right then. You know? Just kind of floating along. You know, not quite sure. He’d had his own startup at the time that had gone under, and he wasn’t quite sure what he wanted to do. I was like, “No, no. This is it.”

Tim Veil:

This is the way.

Peter Mattis:

“You have to do it. You’ve got to come to Google.” This was happening all over the place. Everybody was referring their friends to Google. Once you get in there, you refer your friends. How did we meet Ben? Spencer and Ben started on the same day at Google, and they became-

Tim Veil:

That’s funny. I don’t think I knew that.

Peter Mattis:

Partners. Yeah, they became partners working on the same project.

Tim Veil:

Now, I should know this. You were in at 2002, but Google started when? I mean, this was still very early at Google. Right?

Peter Mattis:

Yeah. I think they started in 1998, if I’m remembering correctly.

Tim Veil:

Okay, so just a couple years.

Peter Mattis:

They were a couple years in, four years in, but they were pretty sizeable at that point. I think 300 something, 400 employees.

Tim Veil:

Wow.

Peter Mattis:

They were growing like gangbusters already by that point.

Tim Veil:

I can imagine. I want to go back just a little bit further, because I think I’ve shared this story with you before. You know, obviously, I read about you when joining the company, and your history, but when somebody told me that you had founded this graphics program, that blew my mind. I remember as I was starting in my career, you’d get a new PC or whatever. You know, there are five applications that you install on that PC. GIMP was one of them. It was the thing that you installed if you wanted to do image editing. I mean, I’m just curious. What was it about graphics? I mean, when you’re a freshman in college, and you’re like, “Hey, this is what I want to do with my life,” what was it about that aspect, or images, or graphics, that had you so intrigued? What was it you were thinking or hoping you were going to do with that, or did you know?

Peter Mattis:

I did not know. I’m going to be honest. Truth is, sometimes you start these things and you don’t know.

Tim Veil:

Yeah.

Peter Mattis:

I mean, it’d be nice to say it’s like, “You just get shown the vision.” You know, the early days of what you want this to become. No. The reality was Spencer and I … I can’t remember what class it was. I think it was an operating systems class. We were taking this operating systems class. We were a little bit bored in it, and we’re just like, “Hey, let’s do something on the side for fun.” Linux was up and coming during that time, and so much work was being put into compiling an operating system, those basic tools, but the application layer was a little bit spartan. You know, in high school I had worked on the journalism program. I’d had some exposure to Photoshop, and Illustrator, and some of those other tools. I’m like, “You know, let’s just do something like Photoshop.” I think I described it in those terms. I’m like, “Well, we have to start here.” We just kind of started small, and it kind of snowballed. It’s not working, and then you keep on adding things on. You’re like, “But I can do this. I can do that. I can hit this widget. I can do this kind of graphics effect. I can add this tool.” Before you know it, you sit there, and you look at it like, “This thing is actually kind of useful.” Open source was early days then, so it wasn’t like you just start doing this all in the open. They didn’t have GitHub.

Tim Veil:

No.

Peter Mattis:

We would just go on to these Usenet groups and be like, “Hey, we got this tool. Let’s-”

Tim Veil:

Where did you commit code to with that? Because that would’ve been so long. This was before GitHub. This was before everything. SourceForge?

Peter Mattis:

No, it was before SourceForge. I can’t remember exactly what we were doing. We had a friend who’d developed his own version control system … PRCS I think is what it was called.

Tim Veil:

Yeah, I think that kind of rings a bell.

Peter Mattis:

Yeah. There was some tool back then, and we were just kind of FTPing the source code back and forth between our personal computers, and doing code dumps like that. Eventually, when we initially shared the source code for GIMP, we were just putting it up on FTP. “Here’s a snapshot of the source code.” Put it up on FTP. Had some build instructions and whatnot.

Tim Veil:

Yeah. All I remember is building these laptops or desktops, and installing it. I would use it to edit images for websites I was building at the time. You know?

Peter Mattis:

Yeah.

Tim Veil:

Had to create some graphics, some banners. Where did I do that? I did it in GIMP.

Peter Mattis:

I love these stories.

Tim Veil:

That’s so funny. Yeah, I know.

Peter Mattis:

They’re great.

Tim Veil:

It’s amazing to me. I mean, like I said, that was one of the first things that stood out to me when learning about you guys.

Peter Mattis:

Yeah.

Tim Veil:

I thought that was so fascinating.

Peter Mattis:

But we moved on. You know? I never touched it.

Tim Veil:

Yeah. You guys aren’t involved anymore, right?

Peter Mattis:

Not anymore. I mean, it did happen pretty quickly. Left college and just got busy with that day job. Hardly had time for it. It was just kind of serendipitous timing that there were enough people in the community who were willing to step up and kind of take over ownership of it. I think something different would happen nowadays. Nowadays, you know, if two guys in college made such a great program, you could probably get VC funding right away.

Tim Veil:

Yeah. Yeah.

Peter Mattis:

Probably have VCs knocking down your door, and-

Tim Veil:

Probably so.

Peter Mattis:

Probably go down a different path.

Tim Veil:

One of the other things I want to talk about, just related to your history, because this comes up a lot. As a Cockroach employee myself, when we’re out talking to people in the field, people always say, “Yeah. You guys are open source Spanner,” or, “You know, you’re built on top of Spanner,” or the other thing we hear a lot. I mean, there’s always some Spanner angle. Right? Or it’s, “Your founders built Spanner,” which you and I know isn’t true, but there is some inspiration there. I thought maybe it would be good, just for the audience, to kind of clarify. I mean, you mentioned it already, but kind of what you were working on at Google, and then how did Spanner, if at all, influence kind of some of the early thinking around CockroachDB?

Peter Mattis:

Yeah. Yeah, yeah. Definitely some confusion over this. I never touched a line of code in Spanner. I worked on this system at Google called Colossus. Inside, Google has this distributed file system. They had a version of this called GFS, the Google File System. There’s a famous paper about that. There’s a version two called Colossus. I think Colossus is still in existence. It’s just continued to be extended. For the audience, I left Google back in 2011, so my up-to-date knowledge about what they’re doing … is no longer relevant, no longer up-to-date. Colossus is this kind of foundational component for a lot of the infrastructure they have inside Google. One of the systems that’s built on top of Colossus is BigTable. HBase is essentially the open source version of BigTable. It’s a horizontally scalable key-value store. Not transactional. It doesn’t have transactions. Didn’t have SQL. When BigTable came out inside Google, a lot of applications started to try to build on top of it. They were semi-successful. It allowed you to get web-scale applications, but there’s a lot of friction with that programming model. You saw the application teams running into these frictions and trying to work around them at the application layer. Some of these workarounds were pretty significant. There is this other tool built on top of BigTable called Megastore that added transactions and secondary indexes. At some point, the designers of BigTable were like, “Wait a minute. This isn’t good. We should actually just build all this into the database itself,” and that’s where Spanner started. I was there when Spanner was getting started. BigTable and Colossus are essentially sister teams. Right? BigTable worked on top of Colossus. We were working closely with the BigTable team. I did contribute some code to BigTable at some point, because there was this whole transition from GFS to Colossus, and I helped smooth that transition with some changes in BigTable. The Spanner team was completely separate. Team members from BigTable moved over to Spanner. I never interacted or did anything with the Spanner code itself, but I saw the reason it was coming into existence. It was trying to make complicated application development easier, and we noted that. Then we saw the continued refinements of that, putting a SQL interface on top of Spanner so it could be used by the ad serving system. That was really part of the motivation for where we started with Cockroach, which is that it wasn’t just to do some transactionally consistent key-value store, but really to acknowledge that you want to be doing this in the database, where there’s database expertise, and not doing it at the application level. When people are putting all this database logic into applications, they usually do it in a way that is fragile. It’s custom to what the application needs. It often works, but it is not maintainable long term. You know, customers. I’ve seen this myself. Some genius goes in and puts transactions on top of a non-transactional database, and no one can maintain it in the future. Or does something similar for indexes, so we took a lot of lessons from that. Then, also, the Spanner paper came out. That was actually part of the reason that we were able to get Cockroach Labs up and running: VCs are actually just looking at the research papers that come out of Google, and then looking to fund companies based on those. That’s a pro-tip for the audience members who are looking to found a company.
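The fragility Peter describes, where an application bolts index maintenance onto a non-transactional key-value store, can be sketched in a few lines. This is a toy illustration only: `KVStore`, `insert_user`, `find_user_by_email`, and `SimulatedCrash` are hypothetical names invented for the example, not part of BigTable, Megastore, Spanner, or CockroachDB.

```python
class KVStore:
    """A toy key-value store: single-key writes, no transactions."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class SimulatedCrash(Exception):
    """Stands in for the application dying between two writes."""


def insert_user(store, user_id, email, crash_between_writes=False):
    # Write 1: the primary row.
    store.put(("user", user_id), {"email": email})
    if crash_between_writes:
        raise SimulatedCrash("died before the index write")
    # Write 2: the hand-maintained secondary index entry.
    store.put(("idx_email", email), user_id)


def find_user_by_email(store, email):
    # Look up through the application-maintained index.
    user_id = store.get(("idx_email", email))
    return None if user_id is None else store.get(("user", user_id))


store = KVStore()
insert_user(store, 1, "a@example.com")
try:
    insert_user(store, 2, "b@example.com", crash_between_writes=True)
except SimulatedCrash:
    pass

found_a = find_user_by_email(store, "a@example.com")
found_b = find_user_by_email(store, "b@example.com")  # index entry is missing
row_b = store.get(("user", 2))                        # but the row exists
```

Because the two writes are not atomic, the second user ends up stored but unreachable through the index. A database with transactions and built-in secondary indexes removes this class of bug from application code, which is the point being made above.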

Tim Veil:

I like that.

Peter Mattis:

Just base it on a paper that came out of Google.

Tim Veil:

I like that. You know, the other interesting question that comes up a lot when we’re in the field. As you know, I work a lot of events. You know, people come up to me. They see the banner behind us, and they say, “What in God’s name is CockroachDB?” You know? “What is this name? How did you come up with this name?” I mean, I know the superficial answer to that, but I wonder if there’s a deeper story here about how y’all came up with this name, CockroachDB, to describe this new database.

Peter Mattis:

Yeah. Well, clearly, we named GIMP. We named Cockroach. We’re really good with names.

Tim Veil:

You have a penchant for naming things. Yes, clearly.

Peter Mattis:

Yeah, a penchant for naming things. I mean, I will take credit. I came up with the name GIMP.

Tim Veil:

Did you really?

Peter Mattis:

Yeah. Yeah. You know, just kicking around names. You’re building something up. Well, we’ve got to name this thing. It’s kind of important to get things named so that you can talk about it, and something that’s a little bit … I think you sense our humor there. A little bit edgy.

Tim Veil:

Yep.

Peter Mattis:

Spencer was the one who came up with the name Cockroach. I think we were sitting around his dining room table just kind of sketching out the idea for this thing. You know? We want it to be scalable. We want it to kind of be indestructible, and we’re just thinking. I don’t know where it came from exactly, but he was like, “We’ve got to call this thing Cockroach. You know, those things are going to survive the next nuclear holocaust, and we want this thing to be indestructible and unkillable, so CockroachDB.” Then, as soon as he said it, it just kind of stuck. We kind of started spreading the name. You know, the initial reactions were mixed, but pretty universally you don’t forget the name as soon as you hear it.

Tim Veil:

For sure. I’m curious, I’d-

Peter Mattis:

We love that aspect.

Tim Veil:

I mean, I know out in the field when we’re talking with customers, oftentimes it’s when we explain the reasons why, which you kind of did, people’s eyes light up, and they totally get it. We have had a couple instances where people simply don’t want to deal with the name. You know? That it freaks them out, or they want us to change the name on paperwork, or something like that. I’m curious. When you guys first started to kind of take this around to VCs, to kind of some early prospects, or whomever, investors, what was the reaction? Were people kind of, “Ew,” or was it an immediate, “Hey, this is great.” I’ve never heard a description of what some of that early reaction was.

Peter Mattis:

Yeah. I mean, I was super surprised. Spencer and I did this initial fundraising round right when Cockroach Labs got formed. We took a tour of the Bay Area to all these nice VCs, and not a single one blinked an eye at the name.

Tim Veil:

Really?

Peter Mattis:

It was just incredible. I kind of thought we’d go up there, and we announce ourselves to the secretary, “Hey, we’re Peter Mattis and Spencer Kimball from Cockroach Labs.” They’re just like, “Okay.” You know? Nothing. There was nothing at all, so the VCs were just nonplussed by the name. Several of them commented that they really loved it. You know? It was very memorable.

Tim Veil:

Yeah.

Peter Mattis:

We got that. I do remember distinctly, though, maybe a couple of months, or a year in, we started actually doing tech talks at companies, trying to spur some initial interest in CockroachDB. I remember giving one early days, and afterward an audience member came up to me, and he was like, “Well, this is really interesting, what you’re doing, but I guarantee you no CIO will ever sign a check to Cockroach Labs. You just have to change the name.” I was like, “Okay. No, we’re not going to change the name, but okay.” You know, fast-forward another five years, and CIOs seem to have gotten over their reluctance. I don’t think that guy knew what he was talking about.

Tim Veil:

Yeah, I-

Peter Mattis:

If you’re out there listening-

Tim Veil:

Definitely don’t think that that guy knew what he was talking about. No. I think we haven’t had that problem.

Peter Mattis:

Yeah, but we have heard it repeatedly. There are some very strong negative reactions from a very small segment.

Tim Veil:

You know, I’ll say this. You know, having, again, spent a lot of time out in the field, it is so unique and differentiated. You know, although people aren’t necessarily, at first pass, always going to understand what we do, once you give the quick explanation, it absolutely and totally resonates. I mean, people get it. You know? It’s like their eyes light up. They’re like, “Yes. Yes, I like this.”

Peter Mattis:

Yeah. Then we actually are softening the term over time. I mean, I don’t know what your perception is of cockroaches anymore when you hear the term, but I have a generally positive connotation towards that term nowadays.

Tim Veil:

I do.

Peter Mattis:

And I think that’s true of many people at Cockroach Labs, and some of our customers, too.

Tim Veil:

Well, there’s no shortage.

Peter Mattis:

If you have a fear of cockroaches, you should come work here, because we will actually kind of-

Tim Veil:

Well, of course, because we use it everywhere for everything. You know, no company event or title goes by without somehow making the tie in.

Peter Mattis:

Yeah.

Tim Veil:

But that’s okay. We love them, and it’s been great. So just going back to kind of the database itself. You know, it’s been out in the wild for, what is it, eight years, I think? We just had our eight-year anniversary. You know, we’re adding new features, new capabilities, all the time. If memory serves me correctly, the database that you had in the first couple years looks a lot different than it does today. Can you kind of talk a little bit about what some of those big changes were, what some of the differences from the early days to today are? I know, for example, SQL wasn’t always a thing. There was some debate about that, about the nature of SQL being used. Can you talk a little bit about what were some of the early designs or thoughts about CockroachDB when you first got started?

Peter Mattis:

Yeah. Yeah. Well, I mean, you’re absolutely correct. I think we’re known as a distributed SQL database today, and we put a lot of hard work behind that. We did not start out that way. That initial road show with investors, we were talking about being a distributed, horizontally scalable, transactional KV database. We thought we were going to provide a KV interface to users. We were, I think it was six months, nine months, into the lifetime of the company trying to figure out exactly what that KV interface would look like. We had initial prototypes for it. As we’re looking at this more and more, we’re just like, “Wow.” You know? We want to make this thing rich and powerful. You know? Doing something new there has a lot of overhead to get people to adopt it. We decided to go with SQL. Initially, we made kind of a two-step decision there. We decided to go with SQL, and then we were trying to figure out what dialect of SQL we wanted to do. We decided to go for Postgres, the thought process being they’re a little bit more adherent to the SQL standard than MySQL. That was momentous. You know, as soon as we made that decision, we took on a huge chunk of additional work, but we’re taking a huge burden off application developers.

Tim Veil:

Absolutely.

Peter Mattis:

That was just right, in my mind. You know? Our investors were super excited about this. Early customers were, though they were also like, “How are you going to do all this stuff?” They didn’t see the path from where we were to where we are today, which is a very significant path. You know? You have to build up this kind of SQL execution engine. We had to add a SQL optimizer. It implied a lot of things, and there were significant challenges we had to overcome. That was probably the biggest, most momentous decision that we changed path on. There’s just a ton of other stuff that has evolved over time. You know, our backup and restore system, the addition of change data capture. I mean, change data capture, when we were getting started that wasn’t on our radar, but it’s clearly super important. Then, of late, one of the big things that’s really evolved over the past couple years is just the importance of migrations, making it easy to get existing applications onto CockroachDB. This comes in a whole bunch of different flavors. I mean, we’ve done work on Postgres compatibility for a long time. That’s a huge impact on migrations, support for drivers, and ORMs, and other development tooling that programmers like to use. But it’s deeper than that. It’s like, does the application have to change at all? For some things, there are certain fundamental things that you kind of need to change when you’re moving from a legacy single-node database to a distributed one. I’ll give you one example of this. If you are using sequences, the sequence type, that’s a single hotspot, and any distributed database is going to struggle with that. You’re going to have a limitation on how far you can scale it. If you can replace it with something like a UUID, you’re going to be able to get much higher throughput out of your distributed database. That’s one example of something that feels a little bit fundamental, and yet even that one we’re still working to make better. Now we have hash-sharded indexes so that we can try to spread out that write hotspot. Everything I think we’re doing, that we’ve done to date, we had this original mission when we formed Cockroach Labs. I really love that mission statement. It’s very broad, very ambitious. It’s just, “Make data easy.” We want to make data easy for our users. We moved away from that mission. I still love it. It’s just super simple, and also a guiding principle.
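The sequence hotspot and the hash-sharding idea mentioned above can be illustrated with a small simulation. This is a sketch under stated assumptions, not CockroachDB's implementation: `RANGE_SIZE`, `NUM_SHARDS`, and the placement functions are hypothetical, and a real system splits and rebalances ranges dynamically.

```python
import hashlib
from collections import Counter

RANGE_SIZE = 1000  # hypothetical: keys per range in an ordered index
NUM_SHARDS = 8     # hypothetical: number of hash shards (buckets)


def range_for_sequential(key: int) -> int:
    """Plain ordered index: neighbouring keys live in the same range."""
    return key // RANGE_SIZE


def shard_for_hashed(key: int) -> int:
    """Hash-sharded index: a stable hash of the key picks a shard prefix,
    so consecutive keys are scattered across shards."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def write_distribution(keys, placement):
    """Count how many writes land on each range or shard."""
    return Counter(placement(k) for k in keys)


# 500 consecutive values, as a sequence would hand them out.
recent_keys = range(5000, 5500)
plain = write_distribution(recent_keys, range_for_sequential)
sharded = write_distribution(recent_keys, shard_for_hashed)

print("ordered index:", dict(plain))
print("hash-sharded index:", dict(sharded))
```

With the ordered index, every recent write lands in one range, which is the single hotspot described above. Hashing spreads the same writes across shards, at the cost of making ordered scans touch every shard. Randomly generated UUID keys get a similar spread without any sharding prefix.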

Tim Veil:

Well, I think it’s amazing. I wonder, too, when you guys first started out, obviously the product has grown in complexity. Did you have a sense of how? I don’t want to make it sound negative, but all the things, because you just listed a lot of stuff. There’s even more to the product, obviously. Did you have a sense of how rich this product or feature set was going to need to be in order to compete in the space? You know, when you first started out? Because to compete against some of the largest companies in the world who are building database products, you need to have just this very wide-ranging set of features. I mean, was that part of kind of the early thinking, like, “Hey, we’re going to start here, but we’ve got this mile-long list of features.” Did you ever really think that far out? I mean, I’m just always so curious about people’s mindset when they get started. You know? Because you’re eight years into it now.

Peter Mattis:

Yeah. No, I think that’s a great question. Honestly, no, we did not. I think that’s something that is a useful attribute in a successful founder, that you have to be smart enough, bold enough, to be able to set something in motion, get the foundation in place, but not so smart that you see exactly how much work it’s going to be. I mean, if you’re standing at the base of the mountain, and you look up, and you actually can see to the peak, and see how high Everest is, no one would ever climb it. You just start putting one foot in front of the other. There’s this phrase I like, “Mountains beyond mountains.” I feel that that’s the start of a journey. You know? You can kind of see the mountain in front of you, and it’s intimidating, but you can see a path to the top of it. You get to the top of that mountain, and you realize there’s another mountain beyond it. You just go, “Okay. I actually can see what I need to do. I’m going to start walking down that, and accomplish that,” and there’s another mountain beyond that. Right? You know, we’ve got to this point now. You know, we’ve built this. We’ve summited many of these mountains. You know? We have this distributed, horizontally scalable SQL database. It actually works for mission-critical workloads out there in the world, and yet there is additional stuff. The migrations stuff I was just touching on, that is a mountain we’re going to be climbing for years. There’s just so much work to do there to make it seamless and easy. You know, not just enabling the migration, but doing it with zero downtime, for example. Live migrations. That’s the thing that keeps me going still. I just keep on seeing the next mountain and being like, “I see a way to accomplish that. I see-”

Tim Veil:

Do you think we’ll ever be done?

Peter Mattis:

No, I don’t think we’ll ever be done.

Tim Veil:

I don’t think so either. I don’t think so either.

Peter Mattis:

We, Tim and Peter, might be done at some point, but Cockroach Labs will never be done.

Tim Veil:

Always a long backlog of features, for sure.

Peter Mattis:

Yeah … I kind of saw some of what was going to be needed for the fundamental capabilities of a distributed database. The thing that I had no visibility to is just the amazing amount of integration you have to do with the enterprise ecosystem. This comes up all over the place. Security is a huge thing. How much security encompasses the compliance standards you need to adhere to. Observability integrations, the ORM integrations. I didn’t realize how long that tail of functionality was going to be.

Tim Veil:

You know, I agree with you. Of course, as you know, I mean, we’re on the front lines of it and hear about a new thing every day that people want to do or use with Cockroach. I think we’re solving a fundamentally important problem that has been, I think, traditionally very difficult for other companies to solve. I read an interesting article, and I’m curious about your perspective on this. You know, this idea that Moore’s law is dead. You know, that CPU power is kind of flatlining, if you will. You know, it’s not doubling every year. The argument that was being made there is that whereas with traditional databases you could easily kind of swap out your main database node or server with a CPU that was much, much faster to kind of continue to eke out performance, you simply can’t do that anymore. To really keep up, ultimately, with modern workloads and modern expectations, you need to be able to scale your database horizontally. You know, you can’t just any longer rely on Moore’s law to get you additional performance from the same old box. I think what’s interesting about where we are as a company right now, and what we’re doing, is that I feel like all these market forces are moving in our direction. You know? Distributed databases are absolutely a thing that people need. You know? Being able to run across cloud providers turns out to be a really, really powerful tool. SQL as a language turns out to be incredibly popular and powerful right now. You know? It seems like all of these things that I think you, Ben, and Spencer kind of thought, “This might be a good idea,” eight years ago, turn out to be, I think, right at the heart of where people need to go right now. I’m just curious on your thought of the timing of it all.

Peter Mattis:

Yeah. Better to be lucky than good, I guess, is maybe something I’d think about. I mean, I think some of this we had foresight into. Certainly our experiences at Google gave us some foresight that horizontal scalability was extraordinarily important, and that there are limits to the vertical scaling you can see of a single machine. This is more than Moore’s law on CPU. This is the amount of RAM you can pump into them, amount of disk you can put onto them, and then there’s also just kind of fundamental reliability aspects if you put that much hardware into a single machine. We saw that horizontal scalability was going to be extremely important. Yeah. The SQL stuff is maybe a little bit surprising to me. I didn’t know how much life SQL would have. When we started looking at that, I was on the forefront of looking at SQL versus alternatives. It was very clear. Cassandra was big. You know, HBase had some traction, but SQL dwarfed it all by two orders of magnitude. So there was definitely foresight, like, “This thing is there. It’s not going away. It’s being enhanced, and people are extending it.” It’s only continued to be that way. I think at the time that we were starting Cockroach Labs, NoSQL was still a hot thing, but we were actually on the tail end of that, I think. I mean, NoSQL isn’t going to go away, but SQL was resurgent at that point.

Tim Veil:

Yeah. I’m very curious about that, because it’s interesting. You know, when you guys were kind of meeting with those early investors, I think you’re right. I mean, NoSQL was very much the thing. Was there any concern at the time that this was going to be too crowded of a market? I mean, I’m trying to remember what else would’ve been out there at the time that was pure NoSQL. Certainly, for me, now SQL is such a big differentiator, but then you guys didn’t have it. What were the kind of hallmarks of the differentiation back then, if you can recall?

Peter Mattis:

Yeah. I mean, some of our early investors, they were familiar with the Hadoop ecosystem. They saw how it had grown. They saw how it also was running into its limitations. Cassandra was out at that time. Riak was also out at that time. Yeah. I mean, the differentiators we were originally going into, though, is basically … Our pitch, to some extent, was, “Hey. We’re going to be the open source Spanner. You know, you saw how BigTable, and GFS, and MapReduce had spawned the Hadoop ecosystem. Well, Spanner’s also going to spawn one.” The thesis was pretty easy for investors to wrap their heads around, that you want to have this. By the way, Spanner, at the time, was also a KV system. It added SQL later on, so they were just looking at that like, “Ah.”

Tim Veil:

Okay. I guess I didn’t realize that. I thought they were always SQL.

Peter Mattis:

Yeah, no. No. Actually, the SQL got added in later, but Cloud Spanner has always had SQL. The internal version just had a KV version, and then there was a separate application team, for ads, that actually put SQL on top, and then they blended it back in over time. Really, our initial pitch was like, “Hey, we don’t want to burden application developers. NoSQL puts a burden on them. You lose transactions. You lose indexes. You lose this rich query language. Well, we want to put back in transactions and indexes.” We feel that these are things that database engineers are best suited for implementing.
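
To make the "burden on application developers" concrete, here is a minimal sketch of what an application has to do by hand when it only has a key-value API: keep a secondary index in sync with the primary row, and apply both writes together. Everything here is hypothetical and for illustration only (the `KVStore` class, the key layout, the function names); it is not how CockroachDB or Spanner encode data.

```python
class KVStore:
    """Hypothetical in-memory key-value store used only for this sketch."""

    def __init__(self):
        self._data = {}

    def apply_batch(self, puts, deletes=()):
        # Stage every write on a copy and swap it in at the end, standing in
        # for the all-or-nothing behaviour a real transaction would provide.
        staged = dict(self._data)
        for key in deletes:
            staged.pop(key, None)
        staged.update(puts)
        self._data = staged

    def get(self, key):
        return self._data.get(key)

    def scan_prefix(self, prefix):
        # Return all (key, value) pairs whose key starts with prefix, sorted by key.
        return sorted((k, v) for k, v in self._data.items() if k.startswith(prefix))


def upsert_user(store, user_id, email):
    """Write the primary row and keep the email index consistent with it."""
    primary_key = f"user/{user_id}"
    old_email = store.get(primary_key)
    # If the email changed, the stale index entry has to be removed by hand.
    deletes = [f"user_by_email/{old_email}"] if old_email and old_email != email else []
    store.apply_batch(
        puts={primary_key: email, f"user_by_email/{email}": user_id},
        deletes=deletes,
    )


def find_user_by_email(store, email):
    """Look a user up through the hand-maintained secondary index."""
    return store.get(f"user_by_email/{email}")
```

With a SQL database, the same behaviour would come from declaring an index on the email column and letting the database maintain it inside the transaction, which is the work Peter argues belongs with database engineers.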

Tim Veil:

Absolutely. One other thing I wanted to touch on. Well, we may touch on many things, but the other kind of big, I think, monumental change, at least from my perspective. You may disagree. I mean, certainly introducing SQL was a big change, I think, in the direction of the history of Cockroach. What many people also, I think, have heard about at times and said, “There’s something to do with RocksDB.” You know, RocksDB is this underpinning of Cockroach. You know, at some point over the last couple years, we made a move away from that to something we call Pebble. I know you were very, very much involved in that decision and engineering process. Can you kind of walk us through a little bit of the history of RocksDB, and how and why Cockroach was using it? Then what happened when we decided, or what initiated the thinking behind moving beyond that?

Peter Mattis:

Yeah. Yeah. RocksDB is this low-level key-value storage engine. What I mean by low-level is it’s really for a single machine. This is how CockroachDB initially was storing data on disk, using RocksDB. RocksDB doesn’t have transactions, but it has atomicity and consistency and durability guarantees, so we were building CockroachDB on top of this. You know, to be clear, it’s not that CockroachDB is RocksDB with a small bit of glue. Actually, the glue was very significant, but RocksDB was an incredibly important component of that. RocksDB is fast. It comes from Facebook. Its pedigree is, actually, there’s an open source software from Google called LevelDB. The RocksDB folks took it, forked it, and made RocksDB, and then added a whole bunch of optimizations and enhancements for functionality. We were relatively happy with RocksDB. You know, when you’re building a system as complex as CockroachDB, you don’t necessarily want to build everything from scratch. RocksDB was one of the big components we were deciding to use. Another big component, we decided to use gRPC for communication between nodes. You know, there’s several other big pieces of open source software that we were built on top of. There was a little bit of an impedance mismatch with RocksDB, though, because CockroachDB is written in Go. We like that choice. You know, some grumbles on the edges with garbage collection, but we’re generally very happy about that choice. RocksDB is written in C++. In order to bridge the two, you have to use this functionality from Go called cgo. This implies some fundamental overhead in doing certain calls, and we mostly worked around them, but it was a bit of a headache. Our interface between the main CockroachDB code and RocksDB was fairly narrow, and very tailored to be heavily optimized, but that made it somewhat inflexible. We looked at some point. You know, A, we didn’t have expertise for maintaining RocksDB.
Knowing the C++ code, we could’ve built that up over time, but we had to have engineers transitioning between Go and C++. At some point, we were like, “No. Actually, we think we can do this ourselves better.” Essentially, you can look at Pebble as, you know, almost a fork of RocksDB. You know, essentially it was RocksDB stripped of all the things that we didn’t use at CockroachDB, and reimplemented in Go. It was great to see that we could actually get something as performant or actually faster for CockroachDB written in Go, and removed all this kind of impedance mismatch. I’m talking about the transition between Go and C++. Then, since doing that, we’ve been able to keep on extending it with functionality that is very much tailored to CockroachDB that wouldn’t necessarily be useful to try to upstream back into RocksDB.
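
For readers who have not looked inside this family of storage engines: LevelDB, RocksDB, and Pebble are all log-structured merge-tree (LSM) designs. The toy sketch below shows the basic shape of that idea only, under heavy simplification: writes land in an in-memory table, full tables are flushed as immutable sorted runs, reads check the newest data first, and compaction merges runs and drops deleted keys. The class and its names are hypothetical; real engines add a write-ahead log, levels, bloom filters, and much more.

```python
import bisect

TOMBSTONE = object()  # marks a deleted key until compaction drops it


class TinyLSM:
    """Toy single-node key-value engine illustrating the LSM shape."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}  # newest writes, held in memory
        self.runs = []      # immutable sorted runs, newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # Deletes are writes too: record a tombstone that shadows older values.
        self.put(key, TOMBSTONE)

    def flush(self):
        # Turn the memtable into a sorted run and start a fresh memtable.
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            value = self.memtable[key]
            return None if value is TOMBSTONE else value
        for run in self.runs:  # newest run first, so the latest write wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                value = run[i][1]
                return None if value is TOMBSTONE else value
        return None

    def compact(self):
        # Merge all runs, oldest first so newer values overwrite older ones,
        # then drop tombstones since nothing older remains to shadow.
        merged = {}
        for run in reversed(self.runs):
            merged.update(run)
        live = sorted((k, v) for k, v in merged.items() if v is not TOMBSTONE)
        self.runs = [live] if live else []
```

The sketch says nothing about the Go and C++ boundary Peter describes; it is only meant to show why such an engine can be reimplemented with a narrow feature set, since the core read and write paths are conceptually small.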

Tim Veil:

Yeah. I think it’s been a huge win for us all around. Of course it shows, again, the Cockroach Labs team’s penchant for naming things. You know, RocksDB, but we want a smaller, more compliant version of a rock, which is Pebble.

Peter Mattis:

That is exactly where the name came from.

Tim Veil:

We love this stuff. You know, you touched on another interesting topic I kind of wanted to get your perspective on. You know, this kind of lineage between RocksDB and Pebble, obviously borrowing or using some of this code initially. You know, we’re a Postgres compatible database, but oftentimes in the field, or just maybe a casual observer will hear Postgres and think we are a fork of Postgres. You know, that we’ve done something to the Postgres binary and added some features to it. Can you talk a little bit about that, and some of the reasons why we didn’t go down that road, perhaps? I’m just curious. You know?

Peter Mattis:

Yep.

Tim Veil:

Your thoughts on this topic.

Peter Mattis:

Yeah, yeah. No, we are not a fork of Postgres. We do not take any code from Postgres. I think there might’ve been a couple of occasions where we tried to reverse engineer precise semantics of Postgres, but we’ve done that with very light code spelunking. Not anything where we’ve actually taken and forked code. Part of the reason for doing this is that the underlying storage engine, the capabilities of our KV layer, are fairly different from what Postgres’ are. Trying to map the Postgres code onto that, it kind of felt more difficult than doing it ourselves. There was another aspect to it, too. We weren’t committed when we initially did SQL to being fully Postgres compatible. That is actually something that, if I could go back and do it again, I would’ve just made the mandate initially. We were just going to be perfectly Postgres SQL compatible. We said for the SQL we implemented, we want it to be compatible with Postgres, but not necessarily doing the full kit and caboodle. That was one of the lessons. I really loved having made that decision to adopt SQL. I really loved adopting Postgres SQL as our dialect. Just should’ve made that decision right upfront, just be fully compatible. It would’ve removed a lot of headaches we had later on with regards to compatibility. You know, we’re nearly at the tail end of this now. I look at this every now and then. We kept on chipping away at our compatibility. We’re nearly done with it, and we’ve been at a sufficient level for years, but I think we could’ve gotten there a lot faster if we just upfront said, “No, this is the goal we’re shooting towards.”

Tim Veil:

You know, you kind of touched on another interesting topic which you and I have talked a little bit about before. I mean, you know, this kind of idea of lessons learned. That’s a great lesson learned, I think, when you look back at the eight years of history with Cockroach. Are there other things that jump out to you, like, “Jeez, I wish I had known then what I know now”? You know, and it doesn’t necessarily have to be technical. Maybe it’s just about leading an organization, or all sorts of other stuff. I mean, you’ve had such an interesting role in the evolution of the company and the product. I mean, any other thoughts come to mind about lessons learned?

Peter Mattis:

Yeah. Yeah. You know, one of the ones that strikes me, I always think about this nowadays because we’re still dealing with it a little bit. We’ve added a lot of multi-tenancy features to Cockroach of late. One of the things that this has enabled is a very clean separation between user data and system data. I wish we would’ve made that separation much more clear upfront. I mean, the multi-tenancy is great, because it’s enabling serverless functionality. We’ve had to retrofit this onto the system in order to get multi-tenancy deeply embedded, and it would’ve been much easier to do it from the get-go. This separation between user data and system data, which isn’t quite tied to multi-tenancy. That is kind of a separate thing, but it’s just the realization. User data from the perspective of an operator as we’re operating CockroachDB in the cloud is kind of “toxic.” You don’t want anybody inside Cockroach Labs to see this. We’ve had to go through and ensure that we’re redacting any mention of user data from operator views, but to leave the systems still operable, observable, debuggable, while redacting all that user data. It’s taking quite a bit of effort. It’s just one of those things if you had known upfront about this, and known this is a core principle that you need to be adhering to, it would’ve just been a lot easier. I think the other thing that we had some early feedback on, and I kind of think that our call here was right, was just about should we be adopting the cloud, or being, perhaps, even cloud only. I think if we were starting today, we would have to fight a lot harder to make it so that CockroachDB was available self-hosted. I’m happy that CockroachDB is self-hosted, but the world has moved on. It’s kind of clear that you can build cloud-only software service companies. You know? Snowflake is a prime example of this. Completely eschewing the self-hosted route just to focus on the cloud. We didn’t do that.
I think it was the right decision, but I look back on it. It’s like a couple years difference and we probably would’ve made a different decision.
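
The redaction idea Peter describes, keeping operator views useful while hiding user data, can be sketched roughly as follows: wrap anything derived from user data in markers at the point where the message is built, then strip the marked spans before an operator sees the line. The marker characters, function names, and sample log line are all assumptions made for this illustration and are not taken from CockroachDB.

```python
import re

# Hypothetical marker characters delimiting user-derived text in a log line.
START, END = "«", "»"
_MARKED = re.compile(f"{START}[^{END}]*{END}")


def mark_unsafe(value):
    """Wrap user-derived text in markers so it can be stripped later."""
    # Neutralize marker characters inside the value so user data cannot
    # close the span early and leak what follows.
    text = str(value).replace(START, "?").replace(END, "?")
    return f"{START}{text}{END}"


def redact(message):
    """Replace every marked span, leaving system-generated text readable."""
    return _MARKED.sub(f"{START}redacted{END}", message)


# System text stays as-is; only the user-derived key is marked.
line = f"range split at key {mark_unsafe('/users/alice@example.com')} after 120ms"
operator_view = redact(line)
```

The design point is that the decision about what counts as user data is made once, where the message is produced, rather than guessed at afterwards by pattern-matching finished log lines.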

Tim Veil:

Well, but we are doing some interesting things in the cloud, I think. You mentioned it, but I’m curious, or would love to kind of get your perspective on not only kind of where the inspiration came from, but, to some extent, where we think we’re going. That is this concept of serverless. I know serverless can mean lots of different things to lots of different people, kind of it’s a term that comes with its own set of expectations. But we do have a database that we’re calling kind of our serverless product. Can you kind of share with our listeners what this is, and maybe how it’s different from conventional definitions of serverless, or not?

Peter Mattis:

Yeah. I’m not even sure what the conventional definition of serverless is anymore. It means so many things to different people. I mean, when we think about serverless, I really think about not having to worry about the database servers. There’s clearly database servers on the back end. You don’t have to worry about them. This is serverless in the way that S3 is serverless. You’re storing data in S3. What servers do you see there as a user? You don’t see them. They’re just providing guarantees around what the API is going to be, the durability, the response times, and you’re not having to worry about the servers whatsoever. We think this is core for database users as well. I mean, I think there’s some database users that haven’t kind of caught onto this. They’re thinking like, “No, I need to worry about I have servers. I have them in this location.” You know? But, really, that provisioning headache of what servers in particular, how do you add new ones in response to changes in load, or workload, or what not, that’s a painful, painful thing to be doing. I have a dream. I have a dream that, at some point in the future, application developers never have to worry about their database servers. They’re just worrying about, “I have this database API in the cloud. It’s SQL. I can set guard rails on how much I’m going to spend on this, and it will scale with me. It will scale with me to the geographical locations I need. It’ll scale horizontally within those geographic locations.” You just completely get rid of that concern about servers. The other aspect of serverless for us, though, is we want to scale down to incredibly small sizes. The implementation here is kind of interesting to understand. Essentially, there’s still servers underneath. There’s always going to be servers, but you just get a virtual slice. You don’t have to worry that I need these physical servers in this physical location. No.
Because it’s a virtual slice, it can scale down so that, if you’re not using it at all, you pay close to zero. You know, depends on how much data you’re storing. We might charge you some, but it’s close to zero, and it elastically scales up very responsively under demand, which is also an incredibly important aspect. This isn’t serverless per se, but this is what serverless enables, is you’re just charged for what you’re using, and that means adjusting to load during the day. So many applications have peaks of load during lunchtime, or dinner time, or when a news story comes out. Elastically scaling up to meet that load, and then scaling down when it’s not being used anymore, can make your system a lot more cost-effective.
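
The cost argument here is simple arithmetic, sketched below with entirely made-up numbers: a day with a lunchtime peak and a dinner-time peak, a hypothetical price per unit of capacity per hour, and two ways of paying for it. None of the figures reflect actual CockroachDB pricing.

```python
# All numbers are hypothetical, chosen only to illustrate the arithmetic.
# Capacity units needed in each of 24 hours: quiet baseline, two daily peaks.
HOURLY_LOAD = [1] * 11 + [8] * 2 + [1] * 5 + [6] * 2 + [1] * 4
PRICE_CENTS_PER_UNIT_HOUR = 10


def provisioned_cost(load, price):
    """Fixed servers: pay for peak capacity around the clock."""
    return max(load) * len(load) * price


def elastic_cost(load, price):
    """Usage-based: pay only for the capacity each hour actually needs."""
    return sum(load) * price


fixed = provisioned_cost(HOURLY_LOAD, PRICE_CENTS_PER_UNIT_HOUR)   # 1920 cents
elastic = elastic_cost(HOURLY_LOAD, PRICE_CENTS_PER_UNIT_HOUR)     # 480 cents
```

With this particular load shape the usage-based bill is a quarter of the provisioned one. The gap narrows as the load flattens, and a real bill would also include a storage charge, which is the "close to zero" floor Peter mentions.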

Tim Veil:

Yeah. I think it’s some really powerful concepts. I think, and correct me if I’m wrong, but kind of the underpinning of this, or at least part of it, is this ability of Cockroach now to really separate kind of compute from storage. By compute, I mean SQL. You know, you kind of have this idea in the serverless deployment model of the kind of SQL nodes which are distinct and separate from the underlying storage nodes. That’s kind of one of the things that you don’t really have with our more traditional offerings that you do have with serverless, is these things become different, can be scaled independently. Ultimately, those SQL nodes can kind of go down to near zero, but your storage remains. Then the ability to spin the SQL nodes back up is kind of a neat and interesting thing.

Peter Mattis:

Yeah. Well, you absolutely nailed it there. I have nothing to add.

Tim Veil:

Great. Perfect. I know we’re kind of running up on time, and I don’t want to hold you more than the time allotted. As you well know, well, it’s spring. Right? The flowers are blooming. But it’s also kind of the beginning of the Cockroach Labs fiscal year. It’s always a time, at least for me, to kind of reflect or think about kind of some of the things that are coming in the near future, or the year ahead. Just curious. What are some things that you’re kind of excited about as we kind of start a new year, new fiscal year? What has you excited? I mean, you’re going into your eighth, ninth year at Cockroach Labs. What’s going to make this year kind of the best year in the history of the company?

Peter Mattis:

Every year is the best year in this company.

Tim Veil:

So true. So true.

Peter Mattis:

We’re still here, and we’re kicking. Yeah. You know, I’m excited. The multi-tenancy stuff that we built with serverless, I think we’re going to start seeing that kind of translate to other areas. We’re seeing companies experiment with this on their own with CockroachDB. You know, on self-hosted. They’re doing it themselves. We’re like, “Wait a minute.” You know, you guys can actually start using this functionality we built right in there. You know, we built in a lot of isolation facilities there directly into CockroachDB. Those are all coming to fruition and being spread throughout so that applications running on the same database, they don’t interfere with each other. Background operations don’t interfere with foreground operations, and much steadier latency guarantees so that when something goes awry. Things go awry. This isn’t necessarily the database itself. Sometimes it’s the application. I mean, we’ve heard multiple times of an application getting rolled out. It forgot to add an index, and it’s doing a full table scan. You know, being able to quickly observe that. Our observability tools have gotten better. There’s still a ways to go, and we have things coming up that are impactful there. Some a little bit longer term there, too. I mean, what we can be doing on workload recommendations, telling you when you have a missing index, you have unused indexes. That stuff is there now. It’s going to get more powerful over time. Going to make increasingly intelligent suggestions there. I love that stuff. Then, also, we’re starting to see back on the cloud side, serverless and our dedicated offering there, just increasing traction with customers. We’ve built a ton of functionality that enterprises need to be using this stuff, and now I’m super excited to see these enterprises start to use the cloud offering. Get all that CockroachDB goodness in the cloud package.

Tim Veil:

Well, Peter, it was a pleasure as always to chat with you. Thank you so much for spending a few minutes today, and sharing a little bit of the history of you, Cockroach Labs, CockroachDB, and ultimately where we’re headed. Again, thank you so much for joining Big Ideas in App Architecture.

Peter Mattis:

Well, thanks for having me, Tim. This was great.

Big Ideas in App Architecture

A podcast for architects and engineers who are building modern, data-intensive applications and systems. In each weekly episode, an innovator joins host Tim Veil to share useful insights from their experiences building reliable, scalable, maintainable systems.

Tim Veil

Host, Big Ideas in App Architecture

Cockroach Labs
