Episode 23
A historical journey in developer technologies
Mike Willbanks
CTO at Spark Labs
As the CTO at development agency Spark Labs, Mike Willbanks gets the opportunity to work with every technology under the sun. This episode is a free-flowing conversation about the trends that have defined the modern developer experience as well as what’s next.
Mike Willbanks:
Keeping a backup just in the event that that company completely goes away, or you make the million-dollar mistake that we all sometimes make in our careers of dropping a production database or deleting the entire hard drive or something.
David Joy:
What is up everyone? And thanks for tuning in. In today’s episode of The Big Ideas in App Architecture podcast, we speak to Mike Willbanks, who is currently the CTO at Spark Labs. We get into Mike’s passion for development and technology, from how he first got into tech to now running engineering as the CTO at Spark Labs. We covered topics from design patterns to how databases have evolved, to the challenges of developing applications right, and a whole lot about generative AI and how it needs to be embraced by users such as developers and companies. So pump up that volume and get ready for a really intriguing and passionate conversation with Mike Willbanks.
Welcome to the podcast, Mike. It’s exciting to have you on the podcast and I know you’ve been really busy, but you took the time to come. I really wanted to say thank you for taking the time off and we are going to have a great conversation.
Mike Willbanks:
Yeah, thanks for having me. Very excited about it.
David Joy:
All right, that’s awesome. So as we kick it off, Mike, I mean I thought the first thing before we get into what you have done in your past, tell us a little bit more about Spark Labs and what you guys do, what your current role at Spark Labs is.
Mike Willbanks:
Yeah. So, Spark Labs, we are essentially a development agency, but we also operate a little bit differently than most of the other agencies out there in that we also come up with some of our own products, bring those products to market, and work a little bit more like an incubator inside of that side of our business. And then likewise, we’re a software development agency, so we’re constantly working with several different clients, building out various different tooling and technologies. And so a lot of our ideas come from that as well, of just like, “Hey, I think there’s a hole in the market here. Maybe that’s something we could plug up,” or, “We always build these same things over and over again. Why don’t we just make a solution out of it, so we don’t have to keep doing the same thing over and over again?”
David Joy:
You’ve been in the CTO role now at Spark Labs, but before you were a CTO, all throughout your career, you’ve been a programmer, right?
Mike Willbanks:
That’s correct. So, yeah, grew up programming, got into programming, still love programming, and then I kind of got into engineering management and I love engineering management, but somehow I always find myself back behind the keyboard working on solutions, building different things. And no matter how much I try to get out of it, no matter which company I go to, they always find a way to get me back to coding at some degree or some level. So yeah, it’s interesting.
David Joy:
And I think you should always stick to your passion too; while you’re making a change, you should never forget your roots and what you’re passionate about, which is development, which is awesome. When we spoke the last time, it was such a great conversation that I thought part of it could have been a podcast itself when I went back and heard it. But I was excited, because you have really good opinions, and you were saying you’re a highly opinionated person about technology, and some of the things I wanted to talk to you about in the podcast were around things that you have different opinions on. But before we get into that, I really wanted to go back and ask you about how you started and got into all of this. Talk to us a little bit about it. You’ve had a very storied career for 20, 30 years working on amazing technology, but how did it all begin? Where was that first spark in somebody like you that led you to something like Spark Labs?
Mike Willbanks:
Yeah, I mean it’s kind of a fun story, interesting story I guess. I always just had a knack for computers. From the second my dad had his first one, I taught him how to use his first laptop. He brings this laptop home, he has no idea how, he’s barely used computers before, I’ve never touched one before. And I see this thing, I’m like, “Oh, hey, look at this.” I’m like, “Oh, you can go do this and it’s easy, it’s intuitive.” So just right from the get go, I just had a little knack for it.
And as time went on, I was just super interested in technology as a whole and some of the different games that were out there, just learning everything. And the internet came, so that was the real cool stuff, and I’m like, “Hey, how do I get on?” My dad had a business line in the basement so dial-up wouldn’t tie up the main line. And I was like, well, you’ve got all these CDs with all these free minutes, so the question was, “Okay, well, how do I keep getting them?” Because at first, you used to just be able to get another CD and sign up again and it would let you do it, but then they started kind of blocking it off by your computer’s MAC address. So then I figured out how to manipulate the Windows registry to get free internet so I could just keep re-signing up: a different MAC address, and another 900 free minutes.
So that really got me into my forte of just the internet and computers, back when the browser wars were going on between Netscape and Internet Explorer, and we haven’t heard the name Netscape in forever because Firefox now exists. But yeah, I guess that kind of really got my entrance into things. And then I just started getting really interested, like, “Hey man, I want to build a webpage. I wonder how you do this. How does this work?” So I started just learning from that perspective and taught myself Linux way back when we first got cable internet, and back then, you couldn’t just go and download most of these things. So I had basically gotten Linux before that had happened and kind of toyed around with it, but the lack of any internet didn’t really work for me. And so, internet came through, we finally got cable, and I’m sitting there talking to the guys in the network operations center trying to figure out, how can I get my network card to talk to you guys?
So about three, four months later, I finally figured it out through a whole bunch of different newsgroup servers, like, “Hey, I have this card, I have this distribution, why won’t it work?” And I’m like, “I’m using this ISP.” They’re like, “Oh yeah, there’s a strange setting and only certain network cards work.” And so, yeah, I kind of get down that road and just start teaching myself HTML. At first I thought I needed FrontPage, this is how bad it was. I was like, “Oh, you need FrontPage to develop websites.” And man, that would’ve been a really bad move to go down that route. But I just started learning from scratch, what I could find off specification websites, and just playing around. And I was kind of in high school and got into this program where it was more of a graphic design thing, and I had built out a band membership page that was actually getting pretty popular.
Found out how to register a domain name. Back then you couldn’t just go to a website, type in your domain, and get it. You had to fax in what you wanted, and then they’d snail-mail you a response, and you might not know for 30 to 60 days whether you got your domain or not.
David Joy:
Or not. Yeah, wow.
Mike Willbanks:
And I mean, you could modify your name servers and stuff online, but you couldn’t register it online. And so, I kind of figured that part out. And then there were these hosting companies, and at first it was shared hosting; virtual machines weren’t really a thing yet, they were bean counters, so you’d get your own little home directory and run your little site, and those didn’t really work well. They were always super buggy. At that point, I’m like, “Oh, I need a bulletin board for my community.” And so then I put up this bulletin board software that was written in Perl, and so now I’m learning Perl because I want to customize it.
Back then Flash was kind of the super cool thing, so I start learning Action Script and Flash and-
David Joy:
Oh my God. Yeah.
Mike Willbanks:
… we all know how bad Flash ended up being, right?
David Joy:
I [inaudible] Flash. Oh my God.
Mike Willbanks:
Yeah. And then also I’m like, “Hey, there’s this programming language PHP,” and that really jump-started my career at that point, and it was back in the days of PHP 3, right before they kind of went to 4. So I was learning a little bit more of the modern languages, and really got into it, started actually reading the source code, reading the manuals, figuring out all the different things: how do I lock things down? How do I make things so they’re secure? Because at that point in time, a lot of us didn’t really care about security. You could basically hack anything you wanted at that point because it really wasn’t all that hard-
David Joy:
[inaudible]. Oh yeah, it was so much more…
Mike Willbanks:
We stored passwords in…
David Joy:
Yeah.
Mike Willbanks:
We stored passwords like MD5 was awesome encryption, and it’s like, “Well, it’s not encryption. We’re really just hashing it, but whatever.” And then we start learning about bcrypt and stuff like that. So as time went on, I learned all these tools, started a company with my father at that point, started going to school, ended up dropping out because I was at a startup where we were just going crazy. We were building websites for real estate people, had a whole software system built out, two different owners, the company breaks up, and I have to start it all over from scratch.
And it kind of just took off from there. I mean, I went to a software development agency after that, a book publisher after that, all sorts of different fun little career moves, and I never specialized in one particular category. And so I had always been like, “Hey, I’ve always done everything. I know systems. I know how to program. I know databases. I know how to build a website.” Back then we didn’t have full stack developers, so I was a full stack developer before there was a definition.
David Joy:
Yeah, I was going to say that, sounds like you were a full stack developer way before full stack developer was a thing, definitely, yeah.
Mike Willbanks:
Yeah. I used to tell people, nowadays, they’re like, “Oh, you need to know so many different languages.” I’m like, “Yeah, that’s true. But when I started, you had to know how the entire networking stack worked. You needed to know how to operate the systems. The programmers would tell system administrators how to configure their programming language.” People now just expect it to work.
David Joy:
I mean there’s so much that has changed in 20, 30 years, right?
Mike Willbanks:
Oh, yeah.
David Joy:
And when you were saying that, it’s like anybody who’s listening who’s been through that era… I went into engineering in 2006, I believe, but I’d been doing computers since 2002, so it was right after the dotcom bubble bursting, and then I hated Flash and I was just testing out things on my own, so I could relate with some of that. And then there was PHP, and we would go to W3Schools to learn how to do HTML coding. We were talking about that era; the internet was America Online and we were trying to explore, but Internet Explorer was there.
And from what you were saying, I was just thinking, that’s an era that new developers, the younger generation of developers, are not going to experience: the struggle to find the right documentation to look for an answer. That was the era of trying to look for an answer to a problem or a data point, and Google Search came in at this time, took off, and started helping out, and people realized we really need to index things properly so people can find us, and things like that. So I feel like it’s exciting how things have changed. And in these last 12, 13, 15 years, if you were a programmer coming into the industry now, how would that be for you? Everything is available right now: the documentation, community Slacks, Stack Overflow. What do you think about that?
Mike Willbanks:
I think it’s crazy. I mean the hard part now is just that there is so much material and who do you listen to? Where do you get involved? Do you attach yourself to the language, the framework, a specific technology? Where do you go with that? I mean there’s so many different things.
David Joy:
Right. So let’s dive into that. I want to know, how do you make that decision now? So let’s get to the now. We had this phase where you would use what was available to you, and then now we have this plethora of multiple languages, frameworks, databases. We’ve gone from NoSQL to SQL to distributed database-
Mike Willbanks:
There’s so much.
David Joy:
My bad, SQL to NoSQL to distributed SQL. So we’ll get into the database part maybe later, but how do you go into making design decisions for use cases that you have, especially now that you work at Spark Labs in the capacity of engineering manager/CTO, right?
Mike Willbanks:
Yeah, I mean there are a lot of different options out there and a lot of different architectures, and each one of them has their pluses and minuses. And so we’ve basically standardized on a particular stack that we like, that works well for us, that a lot of our people work well with. And most of our decisions have come from just my background and my history influencing some of those. And I’ve put the kibosh on a lot of different things over the years, but I guess it’s going to make more sense to just give a small history of what I’ve done throughout my career as a preface to what I’m about to say.
David Joy:
Yeah, yeah, go ahead.
Mike Willbanks:
I love to look at it from a system standpoint, because if we start off with the early computing era, it’s a little bit easier to understand when we’re looking at a single system, if we’re just looking at the Linux OS and how we’re going to run that. So first, we traditionally started off with things like bare metal computing, which is just, here is your machine. There wasn’t a VM. The VMs that later came along were called bean counters. They just kind of counted how much of the resources your user was using and tried to limit you.
After that, we kind of got into KVM and all of the cool virtual machine technologies that we have now. Then we got into containerization, which is kind of the Docker and Kubernetes side, and now we’re kind of into that mode of functions as a service, which is our AWS Lambdas and all this Serverless technology that we’re using today. And I’d like to say I’ve applied these same principles to how I approach development as a whole: if you were to take that whole systems era and you skipped a generation, you really weren’t missing too much.
I mean, yeah, we skipped bean counters, who cares about those? Those are basically dead. Now you have the KVM side of it, which is what most modern virtual machines were built off of. And then now we’re into the era of functions as a service, which baselines off of its predecessor, which is Docker and Kubernetes. And so there’s still nothing wrong with Docker or Kubernetes, but I like to skip that generation, and skipping that generation just allows me to have a little bit more focus and not have to focus on everything that’s brand new, everything that’s coming out right now. Because if you do that, you’re just going to cycle. You’re in a constant state of cycling and revamping and rebuilding and doing it again and again and again. And there are needs for those, right? But unless you’re running multiple millions of dollars through your infrastructure, the gains for you are probably being lost in terms of resource costs.
And so I’ll kind of explain what we do today from an architectural standpoint. I’d say the vast majority of everything we build is all Serverless today. There’s very little that isn’t Serverless, and usually, the only reason will be if there is a specific need and it’s going to drop the costs or otherwise. Most of the time, I can get away with using AWS Lambda and being just fine on the compute side. There are times where putting in place Fargate is going to lower my cost substantially, or using AWS Spot Instances for various types of processing instead of using Lambda. But that can be workload dependent, right?
And so, mainly we build everything Serverless, whether it’s in Azure, whether it’s in GCP, whether it’s in AWS. We use a variety of all three of those, and we’ve started getting to where we’re starting to mix and match a few different things, because one of the big things with cloud providers, and it constantly happens, is it happens whenever US East 1 goes down, which is the most common region for Amazon to go down, because that’s where they put all the new stuff. And it usually happens right after re:Invent.
David Joy:
Right.
Mike Willbanks:
I swear, within the first three months after re:Invent, one of their regions goes down for an elongated period and brings the whole internet down, and they’ve constantly told people, don’t rely on a single region. Well, people rely on a single region, and things go down, and it’s like half the internet’s down. So we’ve gotten into a lot of multiple regions and are now starting to get into a little bit more of multi-cloud. And the only reason we’re doing multi-cloud is really to use some of the different clouds for what they’re best at.
David Joy:
Got it.
Mike Willbanks:
So we don’t like to do authentication ourselves, because we feel like that opens us up to potential litigation if there were any type of exposure. If we use AWS Cognito, the onus is on AWS to keep that stuff secure. If we use Google Cloud Identity Platform, it’s on them. Speaking multi-regionally, AWS Cognito is not very good for that. You can either segment people off into their own areas or not, and it’s not global by default. And so-
David Joy:
Yeah, I think just, you were telling me last time, it adds complexities that you don’t want to operate on, so you’re like, “Well, we’ll just use something else.”
Mike Willbanks:
Right. Yeah. We came up with a whole design pattern to get around that and we’d lose some of the security abilities that we could do from how we’re passing in passwords. We basically have to pass those into the back end. Then we were thinking, “Okay, well we can do that. Then we’ve got to put it onto some type of a queue. And after that, we’ve got to replicate it to all the different regions for every operation that happens.” It’s like, yeah, that’s going to be a lot of extra work. And so, Google Identity Platform has that baked in to where it’s already global, so why not just use that? Right?
David Joy:
Right. Oh yeah, I’ll let you complete, and I had a thought that I wanted to check with you on, so yeah.
Mike Willbanks:
Yeah.
David Joy:
So I was going to say, when you’re talking about this paradigm that you now follow, design pattern, is use Serverless pretty much everywhere. Are you talking about Serverless in the form of APIs with Lambda functions? Are you talking about the front-end part of it? Or are you talking about the back end itself where you’re bringing in more event-driven architecture to solve the problems that you have in front of you?
Mike Willbanks:
All of it essentially.
David Joy:
All of it.
Mike Willbanks:
Yeah. So from the back-end perspective, we’re building out APIs in Serverless, we’re doing a whole bunch of event-based things in Serverless, and our GraphQL layer, if we’re building a GraphQL layer, is in Serverless. We do operate a little bit differently than most; we never really got onto the microservice movement. I’d say we use microservices when they make sense; otherwise we build a monolith on top of Lambda. And so that’ll be like, if we bring it to Node.js, we’re using Fastify or we’re going to use Express or we’re going to use Apollo on top of-
David Joy:
You use Apollo for your GraphQL, right?
Mike Willbanks:
Yeah, yeah. And most of that is just because we can leverage better concurrency control from that, because one of the things that you run into is that you’ll end up triggering yourself into being throttled, and then you have to go ask for request limit increases. Also, you’re dealing with cold boot times and startup times, and for the vast majority of times, you can avoid that by having your monolith sit in there.
David Joy:
Got it. Yeah, yeah.
Mike Willbanks:
It depends on the company size too. If all of your endpoints are heavily trafficked, then by all means, go microservice. But for, I’d say, 95% of the companies we have worked with, it just hasn’t made any sense. So they’re happier, we’re happier, it’s easier to manage.
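The “monolith on top of Lambda” pattern described above can be sketched as a single handler that routes every request internally. In practice you would usually mount a full framework such as Express or Fastify behind an adapter, but the shape is the same; the routes below are hypothetical:

```javascript
// One Lambda, many routes: a single handler fields every API Gateway
// event and dispatches internally, so one warm container serves the
// whole API instead of every endpoint cold-starting its own function.
const routes = {
  'GET /health': async () => ({ ok: true }),
  'GET /users': async () => ({ users: [] }),
};

// In a real deployment you would wire this up as `exports.handler`.
const handler = async (event) => {
  const route = routes[`${event.httpMethod} ${event.path}`];
  if (!route) {
    return { statusCode: 404, body: JSON.stringify({ error: 'not found' }) };
  }
  return { statusCode: 200, body: JSON.stringify(await route(event)) };
};
```

One warm container then serves every endpoint, which is how the pattern sidesteps the per-function cold boots and throttling limits mentioned above.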
David Joy:
I think it also navigates the cost perspective, where you’re like, “Well, if it’s Serverless, you just pay for what you use. And if you’re not operating at a scale where you really need 20 million users hitting a website, then maybe Serverless is a better…” Is that what goes into the design thinking for this?
Mike Willbanks:
That is a lot of it. And for some of the higher traffic ones, we’ve even just gone with Fargate, and that has done really well. There have been other ones where we’ve actually Dockerized the container and gone through more of a traditional-based process. There are even ones that are on just standard VMs, but we’ve tried to move everybody away from VMs because they’re harder to manage, and then we have to spend time managing them and monitoring them.
And I would rather have everything set up to autoscale, with the visibility so that I’m getting alerted in the event that certain things aren’t working, but have everything that I possibly can automatically scale up and down, because it’s just a nightmare to manage it yourself unless you have a very large DevOps team. And even if you do, you can really bring your costs down and get massive cost savings from a resource perspective, because 95% of the operating costs at most companies are resources and people, right?
David Joy:
And I was thinking, one of the reasons that skipping VMs and avoiding Kubernetes makes sense in this particular case, for the kind and size of use cases and the companies that you operate on, is because you don’t really want to bring in an SRE and pay the SRE and manage all of that when you can have all of that automated on the cloud, just by clicking on the auto-elastic capability that those platforms provide.
Mike Willbanks:
Yeah, Elastic Beanstalk or whatnot. And most of those are based off of Docker and Kubernetes and basically have a ZooKeeper in front of them. It’s not a lot of very difficult things, but the more you can automate, the more you can get in terms of cost savings. And so that’s a lot of what we look at, and that’s what a lot of our clients care about, whether they’re small ones or large ones. So, I mean, we’ve had several Fortune 500s where we’ve done Serverless infrastructure for them and they’re like, “Wow, I think we’re going to move more of our workloads to this.” So yeah.
David Joy:
Where do you think that driver or that change in the space is coming from? Now people just don’t want a region, they want multi-region. Of course, I mean, we’ve talked about resiliency, but they used to have situations where folks are like, “Well, we need disaster recovery.” But Serverless and technologies now allow you to not think about disaster recovery, because with a peer-to-peer architecture, you don’t have a master involved in certain of these technologies. Why do you think companies are responding to that kind of a paradigm now? Is it because they feel like they just want a no-touch kind of system where their CEO never gets to know something went down? Is that what it is?
Mike Willbanks:
I think that’s a lot of it. I mean in my opinion, it also reduces a lot of complexity. It’s easier to understand. From a very high level, when we present charts to people and how their system’s going to work, we don’t give them all the nooks and crannies of how all the events are going to be processed because, yeah, that’s complex. But from a high level standpoint, it just looks really easy. It goes into this little cloud thing and then this gets handled over here by this little guy and comes back across the wire and everything, everybody’s happy.
And you don’t have to touch it, you don’t have to say like, “Oh, we need to make sure that we scale up the database at this point in time because we know our peak load’s going to happen between the hours of 5:00 and 10:00, so we need to make sure that we have additional resources scheduled to go in there. But oh man, if it hits Black Friday, we’ve got to make sure that for the next two weeks we shove $100,000 at it for no reason at all, other than we just don’t know what’s going to happen.”
David Joy:
Yeah, completely understand.
Mike Willbanks:
And I think if we go to the data side of it, that’s a lot of it. At prior organizations I’ve been at, for any type of huge marketing push or promotion, we literally had to scale up environments to prep for that. And then if that marketing failed, the IT expenditure just in that case was horrible. Like, “Oh wow, it’s not working. Did they turn it on yet? There’s no traffic.”
David Joy:
Right. That’s a lot of waste of resources and finance, things that could be rerouted and rerolled. I also feel like, from what you were saying, I like those paradigms or those patterns because, in today’s day and age, I don’t really want to set stuff up and put somebody’s effort into building that when I can instead do research with that person and explore different things we need to do to improve the overall product and the objectives that we have. So, completely makes sense.
I do want to ask you a question with that. Where do you put your most amount of effort then in terms of, where do you find the most problems setting up? Is it in the data layer? Is it in setting up the infrastructure or just bringing everything together? Where do you struggle the most in that setup typically?
Mike Willbanks:
Just trying to think through that. It’s various areas. I guess I can give a few different examples, because I would say 95% of our largest challenges for various projects have likely been on the authentication side of things. And that mainly just comes in because some of our clients don’t like to keep it simple. It doesn’t follow any paradigm. So we’re writing against all of these various hooks that these systems put out, anywhere from writing our own two-factor authentication methods to pre-sign-up, where we’re merging all sorts of different types of data and there are potentially 500 different flows that this person can go through before they even get their account. Those are always the most painful, for us at least. The easiest ones are federated identities: you can sign up with whatever method you want, or you can just click Google and sign in, and everybody’s happy. Those are awesome.
The other part is, I guess, payment processing is always a nightmare. If you’re using Stripe, it’s always super easy.
David Joy:
Stripe is easy, yeah.
Mike Willbanks:
But then you get into some of these fringe payment processors, because they’re saving a point on their percentages, and they don’t support anything out of the box other than giving you a hash for the credit card that you can process. Now you’re writing your own subscription components, they don’t really… And you’re trying to ensure that you’re following PCI compliance, so you have to pre-tokenize it, but not all of them support that from an API call. So then you have to use their special frame and nothing works right.
So those are our biggest challenges from a technical point of view. Most of our challenges probably come down to there being just a ton of different services that you can use, and which ones you should use, and in what use cases you should use them. And then the last of it comes down to, how are you going to model this data?
David Joy:
Got it.
Mike Willbanks:
And how is that data going to be used? Because oftentimes what we end up finding out is, we’re designing our system for all this transactional hierarchy and things like that, and from how they want to use that data, not even getting into the warehousing part, it might just be that this is ultimately more like an OLAP layer to a degree. They just want to basically see it in very strange ways, and indexing that data can be very difficult.
And so then you end up creating replicas of that data so that you can index it differently and not impact your production data sets. But when you’re getting into things like CockroachDB Serverless, where you are essentially replicating that data geographically, or using something to that degree, you have to make decisions based off of, “Okay, what’s the data latency that this person will afford me? Can I actually put it into a warehouse and get around it that way? Do I have to hit the transactional system?” And so those are questions I think everybody ends up having: where do we source the data from? And I mean, ideally, you can just push it into a warehouse and be done with it, but there are certain times where it’s like, no, I’ve got to have it in real time. And you’re like, well, in a programmer’s mind, there is no such thing as real time. It is near real time. We will make it look like it’s real time, but it is not.
David Joy:
It’s not real time.
Mike Willbanks:
Yeah, I was just going to say, we just build a ton of things that are based off of queuing and basically hooks. So, event sourcing, and whether you’re using any of the various different queuing systems. You can use EventBridge, which we tend to love more now; early on, not so much, but now that you can get global access points for it, it’s pretty awesome. Otherwise, we did all the SNS to SQS and being able to do some more advanced messaging patterns; that used to be pretty great.
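The SNS-to-SQS pattern mentioned above is fan-out: one published event is copied onto several queues, and each consumer drains its own queue independently. Here is a toy in-process illustration (no AWS calls; the topic, queue, and event names are invented):

```javascript
// Toy in-process fan-out, mimicking SNS -> SQS: one published event
// is copied onto every subscribed queue, and each consumer works
// through its own queue at its own pace.
class Topic {
  constructor() { this.queues = []; }
  subscribe(queue) { this.queues.push(queue); }
  publish(message) {
    // Each subscriber gets its own copy, as with separate SQS queues.
    for (const queue of this.queues) queue.push({ ...message });
  }
}

const orderEvents = new Topic();   // hypothetical topic
const billingQueue = [];           // hypothetical subscribers
const shippingQueue = [];
orderEvents.subscribe(billingQueue);
orderEvents.subscribe(shippingQueue);
orderEvents.publish({ type: 'order.created', orderId: 42 });
```

Because every subscriber owns an independent copy, a slow or failing consumer never blocks the others, which is what makes the pattern attractive for event-driven Serverless back ends.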
David Joy:
Oh, yeah. I wouldn’t be surprised if somebody comes to you and says, “Well, the only thing you can use is RabbitMQ,” or something.
Mike Willbanks:
I’ve done some pretty amazing workloads on Rabbit and it’s pretty darn fast, but if I can make somebody else have to manage that infrastructure for me, I’m going to do it. Right?
David Joy:
Yeah, it’s going to be a challenge, especially with the way we want to manage infrastructure today. Some people will feel like, “Why are we going so backwards, to 2010 maybe?” But what I wanted to ask, or what intrigued me, was that you get to touch a lot of different technologies, kind of explore them and test them out. And in doing so, what you’ve done in your career over the last 15 years is get exposed to so many different technologies and have an opinion about them, if not in depth, then at least at a high level as to whether each one solves a problem or not. And today that problem is multiplied, because you have 20 databases, or three different cloud providers selling their own cloud databases, or you have Lambda functions, CI/CD, you have Vercel, you have Netlify, you have Amplify, all these different solutions out there. Tell me a little bit about one of your fun cloud databases that you tried, or any database that you tried and suddenly felt, this doesn’t do what I want it to do? Give me some story from that.
Mike Willbanks:
Yeah, I’m pretty sure we’ve talked about this when we were chatting before, but for me, that’s Mongo in a nutshell. And it’s not that Mongo specifically is a bad database, it’s not. It’s very, very purposeful, and you need to make sure you’re using it for the absolute right purpose, otherwise it’s just a disaster. And that can be said about any database, realistically, but I feel like Mongo got a lot of web credibility early on when it hit the scene. I mean, the memes out there are still great: it’s the fastest database out there because everything’s piped to /dev/null.
David Joy:
That is so wrong.
Mike Willbanks:
But that and DynamoDB, and I guess even Firebase for me too, those three systems are very purpose built, and if you don’t use them for the right purpose, you really shoot yourself in the foot. And so we’ve done things in Mongo. I’ve done probably 10+ different projects in it before, some for the proper use case and many that weren’t, or that started off as the proper use case and ended up not being the right use case at all. And so the biggest challenge when we’ve done things inside of MongoDB, for example, and document databases in general, is that you end up needing a relational portion of it. And the more relations you end up getting into, the more you end up fighting it.
And so MongoDB’s aggregation functions are a great example. They’re hard for people to write, they’re hard for people to understand. You end up using this whole aggregation pipeline where you’re almost writing JavaScript to do everything, so you’re essentially using MapReduce. And so a lot of it just comes down to, okay, for a relational type system, we need to store the data in this part, but we also need to store it over here. And now we’re managing two distinct portions of data, and who owns it and who doesn’t own it, and what happens when it gets out of sync? Because eventually there’s going to be a programming bug, and now it’s out of sync, and who wins?
And that has been really the difficult part, for simple-
David Joy:
I mean it’s not ACID too…
Mike Willbanks:
… things like my grandma’s recipe book, it would be awesome, right?
David Joy:
Yeah, yeah.
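To make the aggregation-pipeline point above concrete, here is a toy evaluator (my own illustration, not from the episode) for a simplified Mongo-style pipeline supporting only `$match` with equality and `$group` with `$sum`. The contrast with the one-line SQL equivalent in the trailing comment is the heart of Mike's complaint:

```python
def run_pipeline(docs, stages):
    """Evaluate a simplified Mongo-style aggregation pipeline over a
    list of dicts. Supports $match (equality only) and $group ($sum only)."""
    for stage in stages:
        if "$match" in stage:
            cond = stage["$match"]
            docs = [d for d in docs if all(d.get(k) == v for k, v in cond.items())]
        elif "$group" in stage:
            spec = stage["$group"]
            key_field = spec["_id"].lstrip("$")
            out_field, expr = next((k, v) for k, v in spec.items() if k != "_id")
            sum_field = expr["$sum"].lstrip("$")
            groups = {}
            for d in docs:
                g = groups.setdefault(d[key_field], {"_id": d[key_field], out_field: 0})
                g[out_field] += d[sum_field]
            docs = list(groups.values())
    return docs

orders = [
    {"status": "paid", "region": "us", "amount": 10},
    {"status": "paid", "region": "eu", "amount": 20},
    {"status": "paid", "region": "us", "amount": 5},
    {"status": "void", "region": "us", "amount": 99},
]
pipeline = [
    {"$match": {"status": "paid"}},
    {"$group": {"_id": "$region", "total": {"$sum": "$amount"}}},
]
print(run_pipeline(orders, pipeline))
# The equivalent in SQL is one familiar line:
#   SELECT region, SUM(amount) FROM orders WHERE status = 'paid' GROUP BY region;
```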
Mike Willbanks:
But most systems get highly relational. So it’s really great at documents, really great at JSON-type structures. But Postgres and CockroachDB and other systems out there, MySQL for that matter, have all added JSON types. And now the JSON’s all queryable and indexable. You can’t always get to the nth degree of indexing within that JSON structure, but one would argue, don’t store something nine levels deep in JSON; you probably did something wrong, right?
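A small sketch of the "relational system with document abilities" idea, using SQLite's JSON functions from the Python standard library (Postgres, CockroachDB, and MySQL have richer JSON/JSONB support, but the shape of the query is similar; this is my own illustration, not from the episode):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recipes (id INTEGER PRIMARY KEY, doc TEXT)")
conn.execute(
    "INSERT INTO recipes (doc) VALUES (?)",
    ('{"name": "banana bread", "tags": ["baking"], "servings": 8}',),
)

# Query inside the JSON document with plain SQL:
row = conn.execute(
    "SELECT json_extract(doc, '$.name') FROM recipes "
    "WHERE json_extract(doc, '$.servings') > 4"
).fetchone()
print(row[0])  # banana bread

# An expression index keeps that predicate fast as the table grows:
conn.execute(
    "CREATE INDEX idx_servings ON recipes (json_extract(doc, '$.servings'))"
)
```

The document stays a document, but it lives next to ordinary relational columns, ordinary joins, and ordinary tooling, which is exactly the trade-off being argued for here.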
David Joy:
Yeah, that’s way too deep. Yeah.
Mike Willbanks:
So it’s like, well, use the right tool for the right job. It kind of goes back to what I was saying earlier about Serverless: if you are not processing such an immense amount of data that putting in Cassandra, Mongo, or another document store as a second data store makes sense, then just use a relational system that has document abilities, and you’re going to get a lot more gains out of that. Then it’s relational, more tools work with it, it’s easier to manage, more people understand it, and more DevOps folks understand how to manage it, because that technology has been around for a lot longer. Even business users, like BAs, PMs, et cetera, understand how to query a database.
David Joy:
Yeah, I mean what you’re saying is, I mean in colleges today or anywhere around the world, people are still learning SQL. That’s how you talk to data. That language is not going to go away.
I think my perspective now, in hindsight, after 15 years of looking at SQL, NoSQL, and now distributed SQL, is that the industry needed something between roughly 2010 and 2018, where we needed scale and we didn’t know how to do it; we couldn’t do it with Postgres. And of course, if you used Oracle, you had to do GoldenGate and go through that complexity. With Postgres, you had to do sharding and add all of these things. Then NoSQL came in, and Mongo came at the right time and said, “Look at this, you can just directly write a document to your database, fetch it back, it goes with your API story. Everything.” It made sense.
But as you started scaling, I felt like we realized that you cannot continue to operate it. And again, I like what you said, it is a very purposeful database. Even DocumentDB, I’ve had use cases where I’ve used or come across folks who have used DocumentDB, works great initially, but you scale to say 64 terabytes of data on it, and then you’re freaking spending millions of dollars on your AWS bill, which is a lot. So, unless you’re Capital One, you don’t want to do that.
Mike Willbanks:
Yeah. No, I mean I’ve been there. And I guess the other part, since you brought up costs: for me, with Mongo, you don’t have guaranteed ACID consistency unless you’re running three of them, right? And so your cost is a lot higher, especially for somebody like a startup who is bootstrapping. They don’t want to pay for three different database servers that they’re not even using, just to get consistency. It can be difficult. So for me, a lot of it comes down to, one, where can I find people to work on this stuff? Because that’s a huge thing. If I’m hiring people and I need to be able to find them on the mass market, I can’t go out and pay everybody top dollar for a technology that not everybody’s using.
And so technology adoption for me is I tend to lag behind a little bit because I’m probably already actively researching it, but is that actually going to hit the mainstream or not? And so a lot of things don’t. We’ve had several instances of that where, like I say, skip a generation and you generally find better. Great example with even programming languages is we don’t really hear about Scala anymore.
David Joy:
We don’t.
Mike Willbanks:
That was the biggest thing forever.
David Joy:
And actually, when you said it, that’s when I realized, 2016, ‘17, everybody was like, “Let’s use Scala. Let’s use Scala for everything.” And everybody was like, “It’s so great.” And then I think I had never used Scala, I started using Python for my stuff and I just stuck with it. So yeah, go ahead, sorry.
Mike Willbanks:
No, it’s amazing. It’s kind of the same thing. And now, instead of Scala, everything’s moved to Rust because it gets a lot lower level from all the different compilations. And so it’s like, okay, well Rust looks like it’s actually getting some real legs now, and there’s a lot of awesome tooling around it.
David Joy:
Yeah. And another one that comes up a lot nowadays in my conversations is Go, where people are like, well, they love Go now, because it’s just so much easier, the abstraction is way easier.
Mike Willbanks:
It’s been around forever though. I mean Go has kind of had that, I don’t know, I haven’t looked at the charts, but I feel like it’s almost ebbed and flowed at times.
David Joy:
Yeah, I don’t know-
Mike Willbanks:
It’s just not going away, but it kind of regains popularity then goes out of flavor and then again comes back in.
David Joy:
Yeah, CockroachDB is written in Go. That was a fundamental choice that the founding team took early on. And the more I talk to people and the more I try to use Go, I’ve kind of started liking it. But I still love Python, because I’ve used Python for so much data-related stuff, and I like the way it does some other things. Very easy to get started with.
But I wanted to go back to what you were saying around standardization. Over a decade, or over a few years, standardization happens. We’ve seen Kubernetes become a standard and get adopted in the cloud with EKS or GKE, we’ve seen Serverless become an accepted pattern, with Lambda functions and everyone else’s equivalents, and we’ve had Spark become a standard for data processing, moving on from Hadoop, which folks aren’t really using anymore. So we’ve come to an era where standardization is happening, where people are sticking to specific technologies for specific use cases. But in the data space, in the database space, that hasn’t happened, because there are 20 different options. So when do you think we’ll come down to saying, “These are the three or four databases that we are going to use for everything”? Why isn’t that happening? What do you think?
Mike Willbanks:
I’m going to say it’s never going to happen, ever. I mean we almost had it there at one point, but databases is kind of like the Wild West. It’s never really going to change. The only thing is, can we stick to the ANSI SQL standard at least? That would be nice.
David Joy:
That would be great. Yeah.
Mike Willbanks:
I’ve been a part of multiple things back, wow, it’s 10, 15 years ago now already. Holy moly. Time goes fast. There’s a database called InfiniDB that I ended up working with. And for those people that never knew about this thing, it was a purpose-built database for data warehousing, and they were one of the first ones to do MPP, which is massively parallel processing.
And so, I think we’re always going to have those things where there’s certain types of databases and systems that are going to be built because we’re just not doing certain things that works well across the board. And the problem with the data space is that you have so many different needs, right? You have the mom and pop shops, you have the small companies, you have the mid-size companies, and then you have the uber large companies, and they all have various different needs.
If I’m storing, let’s use your example of 64 terabytes of data, I have a very different use case for how I use my data than a smaller startup that’s got maybe 500 clients, but we both end up using the same type of tooling. And so how do you optimize for both of those cases? And I think the answer is you just don’t.
David Joy:
It’s difficult. Yeah.
Mike Willbanks:
It’s like you can get a generalized solution, but then you’re going to have to get specialized at certain points. I mean that’s kind of the whole thing with going Serverless. At some point, you might need to start separating things off and becoming more specialized. And it’s the same thing within a company. When you start a company, you have your two founders, and as time goes on, you stop becoming a generalist and become a specialist. And that happens with everybody’s careers, is that you need to eventually decide what you’re going to specialize in.
David Joy:
Oh, yeah, yeah.
Mike Willbanks:
Or if you’re like me, you just keep not specializing in anything and try to specialize in everything.
David Joy:
Everything.
Mike Willbanks:
And that’s problematic as well.
David Joy:
Yeah. Well, I think it’s interesting, especially what you said; there are so many things coming. Technologies are coming and they’re active and everybody’s using them, like, say, cloud. Within the cloud, if you take AWS, AWS has RDS, they have Aurora, they have DynamoDB, and sometimes they will also take these open source solutions and bring them in as their own solutions. I’ve had experiences personally where a cloud platform like Azure started a technology, said, “This is what we are going to do,” got a big set of engineers to work on it, started building the product, got some customers, and two years down, no innovation on it. It’s still the same, still has those problems.
So my biggest problem now, when I’m trying to select technologies and when I’m trying to do my own use cases outside, trying to build my own applications, I realize that I cannot trust certain systems to continue to innovate, continue to build. So I am leaning more towards technologies that open up and say, “Well, we are open source, so the code is always going to be available if something happens,” or I’m leaning towards, “Hey, can I scale on this technology?” And those are some of my own preferences for designing. Even if I just have say, one terabyte or two terabyte of data, it’s good for me to know that even if they shut the doors down, I will have open source code. So, do you get into situations where you’re like, “Well, we don’t really know if this technology is going to shape into something,” of course, when you’re exploring something and then pivot to something that is open source? Does that happen often?
Mike Willbanks:
Yeah, actually that’s one of our, or I guess one of my principles of looking into things. Let’s take CockroachDB for example, because it’s a great example. It’s solving a great deal for us from the standpoint that we’re finally getting multi-regional replication without having to manage it myself and deal with master/master replication and its potential issues. So what happens if CockroachDB was to go away? Well, the great part is that, one, it’s open source, and two, the even better part is it’s built on a standardized protocol, the Postgres protocol, so I can go to any Postgres-capable system and basically import my data into there and I’m good.
What happens if the company suddenly shutters? Keep a backup of your data somewhere like Amazon S3. Keeping a backup just in the event that that company completely goes away, or you get the million-dollar mistake that we all sometimes make in our career of dropping a production database or deleting the entire hard drive or something, and they can’t restore it for eight hours or whatever. Those are just typically good things. It’s not your whole disaster recovery plan, but you still need to have one.
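As a toy illustration of the "keep your own backup" habit (mine, not from the episode), here is a minimal timestamped backup routine with retention. In production, the copy step would be an upload to something like Amazon S3 in an account you control, rather than a local directory; all names here are invented:

```python
import shutil
import time
from pathlib import Path

def back_up(db_file: str, backup_dir: str, keep: int = 7) -> Path:
    """Copy db_file into backup_dir under a timestamped name and prune
    old copies, keeping only the newest `keep` backups."""
    src, dest_dir = Path(db_file), Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, dest)  # in real life: upload to S3 or similar

    # Retention: delete everything but the newest `keep` copies.
    backups = sorted(dest_dir.glob(f"{src.stem}-*{src.suffix}"))
    for old in backups[:-keep]:
        old.unlink()
    return dest
```

The mechanism matters less than running it on a schedule and occasionally restoring from it; an untested backup is the other half of the million-dollar mistake.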
I guess that doesn’t directly answer your question, but it’s more that I look first at what protocols are being used underneath the hood. Is there a replacement technology that would work if I needed to move off of this? How hard would that conversion be? Those are all things that come to mind.
The harder part comes into play when you go more into those Serverless structures. If you want to make it work across the board on everybody’s stuff, you have to put in a lot of time and effort using something like Pulumi, where you’re doing all your orchestration inside of a system like [inaudible], or Terraform, I guess you can bring Terraform into this as well. I like the Pulumi API, so it’s just kind of my thing, but-
David Joy:
[inaudible] I wouldn’t be surprised.
Mike Willbanks:
Yeah. And then making it work inside of all of the different clouds, right?
David Joy:
Right, yeah.
Mike Willbanks:
And that can be challenging and take a lot of extra resources. And now you’re back to, “Well, maybe we just use Docker or Kubernetes or whatever and manage it ourselves.” But most of the time, the vast majority of the time, you’re going to pick a cloud provider and you’re going to stick with it. And to your point of things dying and going away, I would say GCP is basically my last pick of any cloud provider, because they have a very good history of just randomly killing things. Their entire IoT product got killed this year. It’s like, you killed off an IoT product. Why? Okay, sure.
David Joy:
They’re just operating-
Mike Willbanks:
That’s why they’re not growing as fast as everybody else, but they’re growing really fast in terms of the AI space. They’ve got a lot more tooling for AI than pretty much every other cloud provider, even though Microsoft’s pretty much the main investor of OpenAI, but okay.
David Joy:
Yeah, I think that battle, it’s going to be interesting to see how it shapes up, but I’ve had similar gripes with GCP, where they bring out something and then don’t work with you enough to talk about why it’s good. Whereas what I’ve seen with AWS is they lean in and they’re very customer focused. And I have a fairly informed opinion on this because I feel like I’ve tried pretty much everybody. Azure sits in the middle like an elephant, because they have been in the space selling to every damn software company for the last 30 years, so they have a door everywhere. So everybody has some sort of Microsoft, but I’ve generally seen GCP give a bad experience sometimes. But yeah, again, that’s my opinion.
Mike Willbanks:
Yeah, I know. That’s what we’re talking about, those opinions, right? In architecture and software. I think everybody’s got an opinion on what languages they love, what databases they love, which cloud providers they love.
David Joy:
Yeah. I think what doesn’t go away is the fact that you’re still trying to solve a problem at the end of the day. And jumping back into what you do at Spark Labs, do you guys specifically get into certain use cases or domains that you’re really good at? Tell us a little bit more about what you’re really good at, your focus domains, or are you generalists when it comes to use cases as well?
Mike Willbanks:
We’re fairly general, I’d say, overall. There’s not much that we don’t touch, but I will say that we have our main areas that we’ve specialized in for the most part. A lot of that comes down to AWS and Serverless back ends, and how you make those things scale and handle that across the board. And then on the flip side, when it comes to how we deliver our different solutions, we’ve leaned in very heavily into React Native, and I would say Expo specifically, which is built on top of React Native and gives you a whole bunch of different tooling on top. Expo released something called config plugins, where you can take basically any type of React Native functionality, or just complete native functionality, and bundle it in so that you can get it into your React Native application.
Likewise, we do a ton of different React front ends, basically for most web pages or PWAs, et cetera. I’d say that we’ve steered clear of doing any type of SSR processing, the newer thing now of doing server-side handling of certain types of your components. We still kind of put the kibosh on that inside of our group, but each person has their own thoughts on it. My thoughts just go back to watching the industry have these cycles every 10 years. Previous to this, for the last 10 years it was, “Server-side rendering is terrible, it’s so bad, you should never do server-side rendering,” whereas now it’s, “Oh, well, server-side rendering is okay. You should be doing server-side rendering.” I’m like, “You can’t have your cake and eat it too, guys.”
I know that we’re cyclical. This happens in the software development space, and it happens in the systems space. We keep going back and rehashing old ideas because the old ideas weren’t necessarily bad; it’s just that they need to be rethought. And the same thing applies to server-side rendering, where it’s like, “Okay, if I’m building a CMS, yeah, I’m going to be doing some server-side rendering,” because it makes a lot more sense to do it there than on the client. But all the clients’ devices are getting so powerful that I would much rather have them take the compute cost than me.
So, I think it’s just an interesting area right now, an interesting space overall in terms of how we do everything. And then we’ve done a ton of different warehousing projects where we’ve helped organize data and structure it. We don’t usually get into too much of the machine learning side of it, but there have been times we’ve been asked to do machine learning, or AI image generation on one specific project, where we’re basically taking a product and masking it into another thing and showing what something might look like after the fact. Helping people migrate from these monoliths more into microservices. Like I said before, we prefer the monolith, but there are use cases where you should be using Step Functions or Lambda functions, or, how do we take this big layer of our AI process and make our compute Serverless? We get into some of those things too. So, it’s just across the board.
David Joy:
Yeah. Well, that’s good to know. I mean on that topic, I mean you’ve basically handled pretty much everything depending on what use case, and I’m noticing a lot of FOMO in the industry when it comes to AI right now, a fear of missing out and everybody’s like, “How can I use generative AI?” And a lot of people are like, “Well, are your developers using Copilot?” And things like that. So what are your thoughts on where the space is with companies coming to you and saying, “Can we use this with OpenAI?” Are you guys exploring any of that?
Mike Willbanks:
Yeah, so interestingly enough, we’re using a lot of those tools today. So our entire team is using Copilot as well as ChatGPT Pro.
David Joy:
Nice.
Mike Willbanks:
I feel like, let’s start just from the developer perspective: you’re going to get left behind if you’re not using some form of generative AI. You simply aren’t going to be able to keep up with the competition, because it allows you to do so many more things. But you also have to have the knowledge of when it’s lying to you or it writes something very horribly-
David Joy:
Hallucinations, yeah, yeah, yeah.
Mike Willbanks:
I have conversations with ChatGPT trying to tell it how to improve its answers that sometimes might be paragraphs and paragraphs long, because I’m like, “Yeah, what you wrote to me is how I might do this five years ago, but even five years ago, this was still a really poor algorithm. Where did you find that? That’s a bad idea.” It’s like, okay, you’re using an exponential algorithm in something that should have been O(log n) or even O(1). Stop doing that.
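The complexity gap being described is easy to see with a classic toy example (my own illustration, not from the episode): the naive recursive Fibonacci is exponential because it recomputes the same subproblems endlessly, while adding a cache makes the whole computation linear, and each cached lookup effectively constant time.

```python
from functools import lru_cache

def fib_exponential(n: int) -> int:
    """Roughly O(2^n): the same subproblems are recomputed over and over."""
    if n < 2:
        return n
    return fib_exponential(n - 1) + fib_exponential(n - 2)

@lru_cache(maxsize=None)
def fib_linear(n: int) -> int:
    """Same answer, but memoization makes it linear in n overall."""
    if n < 2:
        return n
    return fib_linear(n - 1) + fib_linear(n - 2)

# Identical results, wildly different cost:
assert fib_exponential(20) == fib_linear(20) == 6765
print(fib_linear(90))  # returns instantly; fib_exponential(90) never would
```

This is exactly the kind of thing a reviewer has to catch in generated code: both versions are "correct," and only one of them survives contact with real input sizes.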
David Joy:
But that’s the part, right? That’s where a developer, a seasoned developer or anybody, has to look at the AI-generated code, and it still needs intervention, because a lot of people feel like they can replace a developer with an AI, and I don’t think that’s true. I think AI will give you great boilerplate code, but that’s just the beginning. And of course, it helps you save, say, three hours of work: you get what would have taken three hours done in one minute. Anyways, yeah, go ahead.
Mike Willbanks:
I use most of those tools, like Copilot and OpenAI’s, I’ll say the ChatGPT side, especially GPT-4, as my assistant. It’s my personal assistant. It makes me more effective. And then Copilot does a pretty good job of autocomplete, but it will also drive me nuts, because it starts recommending stuff at the end of things, and I’m like, “Stop recommending. I’m done with this line. Get out of here. Get out of my way.”
The hard part is I used to be a huge Vim user. I’ve kind of converted over to VS Code now, and it was a very painful last year of getting myself to stop doing that, but it was just because, in Vim there’s certain types of stuff that just doesn’t work well. A lot of this generative AI stuff doesn’t work how I’d really like it to work in Vim. And so it’s like, yeah, I got to get past that. Plus I’m in and out of 18 different projects and it can be just a variety of just mashups every day. So I’m like, “Okay, I’m going to leave that. Go more to the higher level now.”
David Joy:
I personally like VS Code. I used to use something else before by, the name’s not coming to me, but I used to use-
Mike Willbanks:
PyCharm? If you were using PyCharm-
David Joy:
Yeah, PyCharm, yes. I’ve used PyCharm, IntelliJ stuff. I’ve used that and I used to use some notebooks and stuff also, but VS Code now it feels pretty good. And I was using an auto complete integration with it called Tabnine. I don’t know if you know Tabnine.
Mike Willbanks:
I remember hearing about that some time ago. I don’t remember what it is-
David Joy:
Yeah, it used to be something like that, same thing. You put your variables in, and it’ll auto complete stuff for you, but I think GitHub Copilot goes a little bit further in producing that experience for you. And I know AWS has come up with CodeWhisperer or something, which is a very weird name in my opinion; it’s kind of like somebody next to you just whispering code, which kind of scares you. But I generally also feel like GPT-4 is my assistant. For somebody like me who’s constantly trying to do things and doesn’t want to pay an assistant, GPT-4 works out for me. But again, I like the way you’re thinking about using it, and I feel your opinion on using it definitely has to be something that companies look at. So, yeah.
Mike Willbanks:
Yeah. I think that’s really… So just to stick with the individual part of it, from a developer standpoint, I also feel it’s going to make the barrier to entry a lot higher than it used to be, because overall, people are becoming more effective. And as we become more effective, that barrier for an entry-level developer grows even further. But we also needed it, because there’s always been so much more work to be done than could be done. And so it’s kind of accelerating that. But there are pluses and minuses to those things: it means that companies are going to grow a lot faster, people are going to come out with more startups faster, and so the market’s going to get more saturated in a lot of ways.
And when we start saturating the market, it’s basically a drive towards zero, right? How can we lower the cost of various different things? How can we do things differently? And the big players can take advantage of that as well by kicking people out of the market. And so it’s kind of a catch-22 from that sense. But I mean, when we talk about actually building products with generative AI, we’ve done a few of them, a lot of fun.
We’ve done one where we made a solution for a person that’s a little bit more technical, and they’re in the workout space. We built them the ability to essentially write SQL queries using Handlebars and build that into a pipeline, where you can build up a variable, execute the next statement to build up more variables, and then at the end of it create a Handlebars template that takes in all these different variables based off of their individual user. We dynamically generate them an API endpoint for this, which then triggers a request to OpenAI’s ChatGPT, structures the data back as JSON, and gives it to them inside of their app, so that they can build basically anything generative that they want to produce for their user.
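The pipeline described above can be sketched in miniature: each step runs a templated "query" (stubbed here as a plain function), adds its result to a shared variable bag, and a final template renders the prompt that would go to the model. This is my own hypothetical reconstruction, not Spark Labs' code; `string.Template` stands in for Handlebars, and the model call is a stub:

```python
from string import Template

def fetch_recent_workouts(variables):
    return "spin class, hill ride"    # stand-in for a templated SQL query

def fetch_interests(variables):
    return "endurance, recovery"      # stand-in for a templated SQL query

# Each step can see the variables produced by earlier steps.
steps = [("recent", fetch_recent_workouts), ("interests", fetch_interests)]

prompt_template = Template(
    "User recently did: $recent. Interests: $interests. "
    "Suggest the next workout and explain why."
)

variables = {}
for name, step in steps:
    variables[name] = step(variables)

prompt = prompt_template.substitute(variables)
print(prompt)

def call_model(prompt: str) -> dict:
    """Stub for the real chat-completion call; returns structured JSON."""
    return {"suggestion": "recovery ride",
            "reason": "recent efforts were hard"}

print(call_model(prompt)["suggestion"])
```

The appeal of the design is that the generative part is just the last stage; everything before it is ordinary, testable data plumbing.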
David Joy:
Oh, that’s pretty cool.
Mike Willbanks:
And it actually goes and recommends to them, the first use case was; here’s the workouts that they’ve done, here’s what they’re interested in, here’s the workouts that we have, here’s our different categories. Suggest to them a workout that you would recommend that they do now and state the reasoning why they should do this workout versus some other workout and just basically provide them like, “Hey, you should do this workout next because you’ve done this category already and this one’s going to be better for you because you worked out too hard, your last few workouts were really hard, now you should do a recovery ride,” for example.
David Joy:
So it’s basically kind of reading in real time what you’ve done, what you’re doing, and then, I’m assuming, you’re hitting the OpenAI API and sending a prompt, the prompt generates a response, you render it on the front end, you get that experience, put it back, and you have a bidirectional conversation going on. Is that how it is?
Mike Willbanks:
Essentially yeah, but we just do a single conversation in this part where the whole thing is basically building up all of these different variables because you can send a very large message to-
David Joy:
So it’s like one session.
Mike Willbanks:
Yeah, it’s just one call for that one. Later on we’re going to allow for pipelining some of those things, but overall, there’s really been nothing in it that we couldn’t already do, unless we want to use that generative content as the basis for another prompt afterwards. And so, it’s pretty fun, it’s pretty efficient. We’ve already built it. We thought about building a product based off of it, but realistically, it just doesn’t make a ton of sense. I mean, we could generalize the solution and open source it, but I really just don’t see where that’s going to be super valuable for a lot of different people, because right now it’s very focused on the technical front. The thing we said would be mind-blowingly good, but again wouldn’t make a lot of sense for us to build, is this: why don’t all the business intelligence platforms, the Tableaus and the Salesforces and those things, have it? They all say that they have AI, but they don’t really have generative AI built into them, and frankly, their AI is just not very smart.
And so it’s like, I have all of this data, help me make sense of it. Here are my demographics, here are our target markets, this is how things are performing from our Google Analytics and who our market is based on that, and here’s our advertising data. What are we missing here? Who should we be going after? I just think that we don’t ask the proper questions. We have all this data, and we can make sense of it in so many different ways, but even with generative AI, if we don’t ask it the right questions, we’re never going to get reasonable answers.
David Joy:
Yeah, yeah, yeah. I think-
Mike Willbanks:
And so that’s the hard part.
David Joy:
Yeah. I think what you said, that’s a use case. I know some companies are working on integrating generative AI into BI solutions; Google Analytics is doing that, or Data Studio. I think that’s going to happen. There is a push for removing the junior data analyst role with that, which is not going to happen. You will still need data analysts, because there is still so much hallucination happening. And what it’s doing is basically reading the data into something like a pandas library, applying some histograms to get some information, and then producing a result back to you. So I definitely feel like that’s a use case, but the thing is, if you want to build a product around it, to your point, the effort might not be worth it, because at some point all these companies are going to say, “Well, we need to build that feature in,” and they’ll just integrate it with generative AI.
Mike Willbanks:
Right. And that’s really what I thought too: the juice isn’t going to be worth the squeeze at the end of the day if you were to build it, unless you’re first to market, and there are already solutions out there, just none of them are super good. But a lot of it also comes down to: right now, generative AI is really expensive. It costs a lot of money to run those workloads, and sure, you can get those costs down, but you have to be very targeted. It’s like all things in computing: the same workload, say running all of this big data through it to get an answer, is going to cost us $10,000 right now, and five years from now it’s probably going to cost us five cents. It’s just exponential in terms of that, and so you have to base it off of, how valuable is this going to be for my business?
David Joy:
Yeah, and that’s the thing with ChatGPT too, right? With OpenAI, it’s not easy; each of your API calls is pretty expensive, especially if you have a large token count, so it’s not going to be cost-effective right now. And that’s why I think it’s good that there is competition in the generative AI space. I’m exploring Claude by Anthropic, as well as using Bard and the PaLM 2 models, and testing some of those out on the side. And I think that kind of combination makes sense.
I wanted to go back and comment: you’ve come a long way, Mike, from hacking systems and MAC addresses in the late 90s and early 2000s to generative AI now. It’s been a big jump in where technology is. Where do you think we are going next? I know everybody’s talking about AI. What do you anticipate things are going to look like in two years in terms of-
Mike Willbanks:
I think two years is a little bit more predictable. When we start getting five, 10 years out, it gets pretty unpredictable. Where are my flying cars? We were supposed to have those five decades ago, right?
David Joy:
Oh, man. [inaudible].
Mike Willbanks:
But I think in terms of where we’re at today and where we’re going to be in two years, we’re definitely going to be doing a lot more artificial intelligence. I think that’s going to be baked into just about everything we do. From a market standpoint and from a life standpoint, it’s just going to be more and more baked into everything. And I think it’s one of those fundamental shifts that we’re going to be seeing in the industry, and those don’t come around a lot. This is going to change the world, it’s going to change our environment, it’s going to change how we do things.
From that standpoint, how does it change a technologist’s or a developer’s role, or a systems role? I don’t think it’s going to change it all that much. You’re still going to need people. AI is not going to replace programmers. Somebody is still going to have to write code. Yes, it might get better at it, and it might raise the bar for who’s writing it; maybe eventually we don’t really have mid-level developers anymore, and we’re sitting with staff engineers and architects.
The thing is somebody’s still going to have to review it. At the end of the day, somebody’s got to be responsible for it and generative AI might be able to build out most of the things for you, but it’s not going to be able to detect everything for you. At some point, yeah, we might be able to get to the point where we’re kind of in a Marvel Ironman movie pointing and manipulating things with our hands and shoving things back and forth. That’s kind of coming in, what? Another year, we’re going to have-
David Joy:
Vision Pro.
Mike Willbanks:
Vision Pro, right? But also that’s a $3,500 computer that you’re going to wear on your face. I can’t justify spending $3,500. I think my wife would kill me if I did that.
David Joy:
Same here. We have a lot in common there.
Mike Willbanks:
Yeah. I mean, is it cool? Absolutely. Will I take one? If work will pay for it? Sure, I’d love it.
David Joy:
I’m excited for that kind of an interaction with technology, honestly. But at the same time, that’s part of being on the innovation spectrum. If you’re going to be that innovator, get that product, do get the benefit of testing it out and having an opinion. But if anybody has a shot of bringing something like that, I think it’s Apple.
Mike Willbanks:
It is.
David Joy:
I-
Mike Willbanks:
I think you go back to things like Google Glass; they were just a little bit before their time. It was a little too soon. And now I think if they were to release something like Google Glass, it would’ve been phenomenal, with huge uptake.
David Joy:
Exactly.
Mike Willbanks:
Because we’re used to it. We’re used to that type of thing. But at the same point, what keeps getting asked and ignored in the AI space, the machine learning space, the headset space is: one, what is this doing to our privacy? And two, what is it doing to ethics across the board? Because there’s a lot of different things there. I mean ethically, some of the stuff that can come out of AI is not ethically sane at all. I could feed it your whole profile and have it give me a profile on you that determines various things. I’ve tried a couple of different things, and it’s really hard to get right, but you can use AI for the parts that are content driven, and then handle the other parts with more core, algorithmic approaches.
So I’ll give you a great example of a concept that we tried out that might actually go out at some point. Essentially, I can tell ChatGPT, “Okay, here is a job posting. Here are some additional things that we’re looking for. Now, here is a resume.” I just fed ChatGPT your resume. Now ChatGPT knows about you, but at the same point, I go, “Okay, I don’t want you to take anything into consideration here other than these aspects; be as gender neutral as possible, be as sex neutral as possible, be neutral across the board when it comes to all of these different factors. And now I want you to tell me what makes this person a great candidate and what are their red flags?”
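The prompt construction Mike describes could be sketched roughly like this. The function name, wording, and message structure are illustrative assumptions, not Spark Labs’ actual implementation; the resulting messages would then be passed to a chat-completion API.

```python
# Hypothetical sketch of the resume-screening prompt described above: feed a job
# posting and a resume, and instruct the model to ignore protected attributes.
def build_screening_messages(job_posting: str, extra_criteria: str, resume: str) -> list:
    """Build a chat-completion message list for a neutral candidate review."""
    system = (
        "You are screening a candidate. Do not take gender, sex, age, race, "
        "or any other protected attribute into consideration. Evaluate only "
        "against the stated job requirements."
    )
    user = (
        f"Job posting:\n{job_posting}\n\n"
        f"Additional things we're looking for:\n{extra_criteria}\n\n"
        f"Resume:\n{resume}\n\n"
        "What makes this person a great candidate, and what are their red flags?"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_screening_messages(
    "Senior backend engineer, 5+ years experience",  # placeholder posting
    "Strong SQL and distributed systems background",  # placeholder criteria
    "10 years of Python, led a platform team",        # placeholder resume
)
# These messages would then be sent to a chat-completion endpoint,
# e.g. OpenAI's client.chat.completions.create(model=..., messages=messages)
```

Note that even with the neutrality instructions in the system message, the ethics and privacy concerns Mike raises next still apply: the resume itself is being sent to a third party.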
And it actually does a surprisingly awesome job at that. But at the same point, now is that ethically sane? I’m feeding your resume to a third-party company? Sure, I can just put it in my terms of service and off I go, and our privacy policy, “Hey, we told you we were going to work with third parties.” What they do with that data, I have no idea. This gets into some of the new California legislation, or it’s not new anymore, and kind of GDPR, but the rest of the states are a hodgepodge of just randomness. And so I don’t know, we’re kind of at that area where it’s both super exciting and just super terrifying. Machines are going to be making more decisions for us and based off of us than we probably care to know.
David Joy:
Yeah, no, I 100% agree with that. I feel like we have to use the technology, and use it in an appropriate way, but at the same time, the people who are using it have to be vocal enough about keeping the ethics around it. It’s funny and it’s weird; I just went and watched Oppenheimer, and they did develop something that was cutting edge. I mean, forget about what they did with the atomic bomb itself; they came up with this fantastic way of using this technology, a breakthrough in itself, but then again, what it created changed the way the world really is today. And I feel like AI is at that level, where we do need to be vocal about the ethics around it. And I’m glad that, if you guys are using it, you’re also bringing that idea that we have to look at privacy and look at data as you develop applications around it for users. So that’s pretty good.
Mike Willbanks:
Yeah, I mean it’s powerful. It’s going to help a lot of things. It just creates a new set of problems that we don’t yet know how to solve or understand. And I don’t think regulation’s necessarily the answer for that. There are probably going to be some components of that, but countries aren’t all going to agree on the same standardization of it.
David Joy:
It’s going to be very difficult-
Mike Willbanks:
That just becomes a mess for every other company using it.
David Joy:
And think about how many different types of AI models there are going to be running around soon. We can already count at least five to 10, and there are already other people exploring different things. So-
Mike Willbanks:
Oh yeah. Just all [inaudible] and take a look at all the different models that exist already. There are millions; some are based off of others, some are brand new concepts, some of them have machine learning concepts built in and some don’t, some just require pre-training, and some are continually learning. It’s just like, whoa, we thought that we already had so much data available to us, and now we’re generating data to fill even more data to try to consume. That’s part of the reason I even think AI has to exist anyway: we can’t even consume all this data. We have no idea.
David Joy:
Oh yeah, yeah, yeah. It’ll be interesting though. One thing I was curious about, and I know we’ve gone over the time that we have, but maybe because I’ve watched Oppenheimer: one of the things I was really curious about when watching the movie was that they came up with original thought and original ideas about the world and the way it is. And it came through a lot of thought, a lot of calculation and a lot of other things. What we have with AI right now is trained on whatever we have available, so it’s basically trained on information and things that we already understand. I want to see if we will have some major breakthrough in physics because of AI. If that happens, that’s the day I’d be like, “Okay, this is really serious.” I really want to see that happen. Whether it will or not, I don’t know, but it’ll be interesting to see.
Mike Willbanks:
Yeah, I think that industry and that area have been taking off. Think of all the different revelations we’ve had just in medicine in the last couple of years. It’s just outstanding. And that’s just from machine learning, being able to compare all these different ingredients together. We can take the blender approach, which would take an average human years, and just throw it into the computer, and it’s learning. It kind of goes back to when computers were created. We had punch cards, and they sent punch cards out, and now the computing power is so much greater. And now we’re getting into quantum computing with AI? That’s going to be pretty insane. The amount of stuff we can process, and the variations of that, it’s just going to be incredible. And so-
David Joy:
That’s why. When the news came out last week, did you hear about that arXiv paper, the room-temperature superconductor?
Mike Willbanks:
I didn’t.
David Joy:
Okay. Yeah, you should go check it out later. People will probably know when this episode was recorded when they hear it, because I think the paper still needs to be peer reviewed, but what came out is that they’re saying they have been able to develop a room-temperature superconductor, which is game changing. It’s kind of a big discovery because, if it holds up, it can be applied to quantum mechanics and quantum computers, which I feel is one of those next areas that’s looking for a big breakthrough. So yeah, it’ll be fascinating. By the time we have our second conversation, we’ll have some [inaudible] go down.
Mike Willbanks:
Yeah, we’ll have plenty to talk about, that’s for sure.
David Joy:
I know.
Mike Willbanks:
Yeah, no, I mean it’s just great and I mean I love the technology side of it. It’s just there’s so many different areas to explore and we’re just scratching the surface right now. AI’s been around for a while, it’s getting to the point where it’s reaching a little bit more of a mass market because it’s gotten so much cheaper to run based off of where GPUs have gotten and CPUs have gotten. I mean we aren’t really innovating in the CPU space anymore, we’re now completely into the GPU space trying to get those going. And so, that area still has a long way to go and they’re just scratching the surface on it. Right?
David Joy:
I think we’ve reached the limit of the number of transistors we can pack into a chip, or we’re reaching it. Isn’t that Moore’s Law or something? I think we’ve already kind of gotten to that limit, and I haven’t seen any chip from Intel that’s come out that’s really excited me. I did like the new ARM chips. We’ve taken a weird tangent onto hardware, but yeah, I like the new ARM stuff that Apple’s produced, but I do feel like quantum is going to be very interesting as it breaks through into the mainstream.
Mike Willbanks:
Yeah, I think that’s kind of the next space for it; there’s not really a lot of other space for CPUs to go. Yeah, we’re toying with some different architectures and things of that nature, but nothing’s really been exciting from that standpoint in quite some time. The ARM architecture has been kind of a nice thing, but those types of architectures aren’t anything new; it’s just that the industry was ready to adopt something else that could lower the price point of implementations. And realistically, within the IoT sector, we needed lower-powered chips that weren’t going to be super power hungry but were able to process quite a bit of data, and that’s kind of where ARM came from. Before, you were trying to run Celerons and they just didn’t work.
David Joy:
Yeah, 100%. Wow. Well, this has been such a fascinating conversation, because we went from Spark Labs and your early career to Oppenheimer and a little bit of quantum mechanics, and I think a great conversation around AI. I’m really excited, Mike, about how you and Spark Labs make a difference by using all these different technologies. It’s been an absolute pleasure having you on the podcast, and the way I see this, it’s one of those first conversations. Hopefully we’ll run this back and do a second conversation.
Mike Willbanks:
Absolutely. No, I enjoy conversations like this, so anytime. I mean maybe we just dive into a single subject later.
David Joy:
Yeah, we should probably do that. I think with you and I, it’ll be really difficult for us to do because I think you’ve worked on a lot of different things and I have passions for different ideas as well, but you did a great job of helping us through understanding how you’re designing stuff and thanks for your opinions and all opinions are welcome here, so I appreciate it.
Mike Willbanks:
Awesome. Thank you so much.
Big Ideas in App Architecture
A podcast for architects and engineers who are building modern, data-intensive applications and systems. In each weekly episode, an innovator joins host David Joy to share useful insights from their experiences building reliable, scalable, maintainable systems.
David Joy
Host, Big Ideas in App Architecture
Cockroach Labs