AWS Lambda is the cloud service that introduced the concept of “serverless”: no machines or operating systems to manage; automatic and hassle-free scaling; costs that accrue only when it’s doing something useful; and more besides. Lambda was launched in 2014 - since then people in the know have come to realize that Lambda is a much more effective and relaxing way of running code in the cloud than many alternatives.
I first used Lambda in 2015. At the time I was managing a team that had a very large, very expensive Java application that processed a few tens of millions of messages per day. The problem was that our application didn’t scale sufficiently, was very slow to deploy, and was a massive headache when things went wrong. We needed to replace a critical piece of the company’s ecosystem, with only a skeleton crew of people. What were we to do?
We tried an experiment. Since we had little spare time for managing systems we decided to lean on Amazon. We needed a service to handle many messages, and a compute platform that could efficiently handle oscillating scaling needs. We chose Kinesis and Lambda, and because our team was one that used a variety of Java Virtual Machine (JVM) languages we chose Lambda’s brand new (at the time) Java support.
The result was a resounding success. Sure, Lambda had a few rough edges back then, but our legacy-app headache was solved. John Chapin, who architected the system, and I were so impressed by the capabilities of Amazon’s “low touch” services that we decided to start our own consulting business - Symphonia - where we could help others with this brave new world of software architecture.
It’s now more than 6 years on from our Lambda + Java experiment, and one aspect of it is still something that causes controversy even among serverless converts - Java. Many folks consider Java an unsuitable language for Lambda, while others (like me) think it has its place in the Lambda universe. So should you use Java for your Lambda apps, or should you steer clear? In this article I hope to allow you to decide for yourself.
There are many arguments I hear to not use Java with Lambda, but they basically come down to these three:
As someone who literally co-wrote the book on using Java and Lambda, I find these points a little painful. Why did I spend so much time on so many words?
Perhaps before giving up hope it’s worth digging in a little.
When people say “Startup times are terrible” they are referring to Lambda “cold starts”. Understanding the details of cold starts requires an article, or perhaps even a book (ahem), but the short story is that the Lambda platform occasionally “cold” starts new instances of a Lambda function when they are needed, but the platform does so just-in-time when events occur, rather than pre-emptively. This means that some events experience extra latency.
However, a couple of relevant questions at this point are “how much slower is Java?” and “does it matter?”. To answer the first, I did some research a couple of years ago, and the result was “typically about a quarter to half a second slower”.
The second question, though, is much more subjective. Back in 2015 cold starts were a lot slower than they are today. However, in our Kinesis+Lambda application we were processing messages that were already at least a minute old, and we were fine with waiting a few more minutes for a message to be processed. To us, therefore, a cold start of even several seconds was fine.
Another example - I was working with a medium-size social network company 3 years ago, and they were switching some of their Scala code (which runs on the JVM) to Lambda. They were very worried about cold starts since they generally wanted sub-second responses, but their cold starts in development took more than a second (not surprisingly, given the extra weight of running Scala and not just vanilla Java). However, it turned out that in production their Lambda functions were being triggered so frequently that their 99.99th-percentile response times were adequate. While the cold start times were the same - a second or two - they were only happening once every 100,000 requests or so. For this team, even though a few Lambda calls were slow, their aggregate performance was perfectly acceptable.
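To make that arithmetic concrete, here is a rough sketch of how rare cold starts show up in aggregate numbers. The class and method names are hypothetical, and the model is deliberately simple: it assumes cold-started requests are the slowest ones and that every cold start adds the same fixed penalty.

```java
// Back-of-the-envelope model of cold start impact (illustrative only).
class ColdStartImpact {

    // Average extra latency per request, spread across all requests.
    // coldStartFraction: share of requests that hit a cold start (e.g. 1.0 / 100_000)
    // coldStartPenaltyMs: extra latency of one cold start, in milliseconds
    static double meanAddedLatencyMs(double coldStartFraction, double coldStartPenaltyMs) {
        return coldStartFraction * coldStartPenaltyMs;
    }

    // True if the given latency percentile (0.0 to 1.0) falls among the
    // cold-started requests, assuming those are the slowest requests overall.
    static boolean percentileAffected(double percentile, double coldStartFraction) {
        return percentile > 1.0 - coldStartFraction;
    }
}
```

With one cold start per 100,000 requests and an assumed 1,500 ms penalty, `meanAddedLatencyMs(1.0 / 100_000, 1500.0)` comes to about 0.015 ms per request, and `percentileAffected(0.9999, 1.0 / 100_000)` is false: the 99.99th percentile never sees a cold start under this model.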
So yes, Java is slower than other languages at startup. But whether that matters depends on what you are building, and what your performance requirements are.
There are also some techniques that can help when you’re on the borderline, but I’ll get to those later.
According to various reports, there are probably between 5 and 10 million Java developers in the world, and Java is still in the top 5 most-frequently-used programming languages. I find these numbers interesting, but I also have some amount of skepticism about the surveys that produce them. Much more tangible to me is that AWS themselves are huge users of Java. Not only do they produce their own free distribution of Java, but they also use Java themselves as part of the Lambda service!
So yes, people are still writing Java - many of them, in fact. They just might not be as vocal about it as other people are about other languages.
Many people who’ve written Java have memories of it being very verbose, at least in relation to more modern languages. This is partly because the Java language was stuck in the doldrums in the late 00s / early 2010s. But even today, even after significant language improvements in recent years, Java still has brevity concerns in comparison to other languages. For example:
All of which adds up to an amount of “weight” that makes Java unwieldy for small Lambda functions.
However, a good number of Lambda functions aren’t actually small - you can do a lot in 15 minutes with 6 CPU cores and 10GB memory. And further, teams that are used to Java are also used to its shortcomings, and with good IDEs and tooling, are actually typically no worse off than other software teams.
Most of the code I write these days isn’t Java, but I still think Java is sometimes a good choice for Lambda apps. I especially think this when a Java-savvy team is thinking about switching to Lambda - I don’t think they need to learn a new platform and a new language at the same time.
But there are definitely some scenarios where Java is more suited than others. So to be more prescriptive, I recommend teams use Java where some or all of the following apply.
If a team is already writing a good amount of Java (or another JVM language like Scala, Clojure, or Kotlin) then they’re likely to be fine using their existing language skills in a Lambda environment. In this case “problems” 2 and 3 from my earlier list don’t apply (and I’ll get on to “problem” 1 in a moment).
A Java team may even be able to use their existing code, as well as their experience. Lambda applications are very simple from a code-requirement perspective - mostly because they have a very small interface with the Lambda platform (just implementing one method signature). Lambda applications do have different architectural requirements than traditional environments - for example, state management usually needs a re-think - but typically this still allows for re-use of business logic code.
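As a minimal sketch of how small that interface is, the handler below is a single method that delegates to existing business logic. `GreetingService`, `GreetingHandler`, and the `name` event field are hypothetical names for illustration. The sketch assumes the Java runtime's support for plain handler methods referenced as `Class::method`; many teams instead implement the `RequestHandler` interface from the `aws-lambda-java-core` library.

```java
import java.util.Map;

// Hypothetical pre-existing business logic, unchanged by the move to Lambda.
class GreetingService {
    String greet(String name) {
        return "Hello, " + name + "!";
    }
}

// Thin Lambda entry point. In a real deployment this class would be declared
// public, in its own file, and configured with a handler setting such as
// "GreetingHandler::handleRequest".
class GreetingHandler {

    // Created once per function instance (during the cold start) and then
    // reused by every later invocation on that instance.
    private final GreetingService service = new GreetingService();

    // The one method Lambda calls; the event arrives as deserialized JSON.
    public String handleRequest(Map<String, Object> event) {
        Object name = event.getOrDefault("name", "world");
        return service.greet(String.valueOf(name));
    }
}
```

Keeping the handler this thin means the business logic stays testable without any Lambda machinery: calling `new GreetingHandler().handleRequest(Map.of("name", "Ada"))` in a unit test returns `Hello, Ada!`.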
Should a team use Lambda with Java if they are not already using Java? I’d normally say “no” since the language that they’re using itself is probably perfectly fine in Lambda. But there’s one place where I might say otherwise…
There are various ways of measuring the performance of an application. We may care about latency - how quickly can the system respond - especially for UI-oriented applications. But in other applications - especially large, “back end”, message processing applications - throughput is far more important. The difference between being able to handle 100 million events per hour vs 200 million might be the difference in finishing a job in time.
In such contexts picking a technology that is sufficiently fast is important. Lambda is frequently excellent in these scenarios since it can scale very wide with little effort. But if each event itself requires complex processing the actual runtime performance might be something you care about too. In such scenarios Java is a compelling choice - modern JVMs are extremely quick, rivaling native code once they’ve warmed up. This is especially evident when comparing with interpreted languages like Python.
Because of this it’s useful to throw Java into the mix of possible languages when such performance is crucial - along with others like Go, Rust, etc. It’s likely that a team working on these kinds of problems already has experience with at least one of these languages, but if they’re only used to using Python (say) then I suggest that they may want to pick another language for such applications.
But since I’m talking about performance I should also talk about the other side of the coin…
Earlier on I described how Java is more susceptible to problems with the latency impact of cold starts. I also said that usually this isn’t important, either because the additional latency of a cold start doesn’t detrimentally affect the application, or because latency is acceptable in aggregate.
However, there are times when this isn’t true. A good, and common, example is a low-throughput API called by a user interface. By “low throughput” here I mean 100 requests per hour, or fewer. In such a situation cold starts are going to occur much more frequently, on average, than in high-throughput apps, and so will be far more noticeable. At this point, the extra 250 - 500ms of cold start time for Java apps is going to start getting painful, especially if there is a “chain” of Lambda functions involved in satisfying a user request.
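The “chain” effect is simple to estimate. The sketch below (hypothetical names) computes the worst case, assuming the functions run one after another and every one of them cold-starts on the same user request.

```java
// Worst-case extra latency for a chain of sequentially invoked Lambda functions.
class ChainColdStart {

    // functionsInChain: number of functions invoked in sequence for one request
    // coldStartPenaltyMs: extra cold start latency per function, in milliseconds
    static long worstCaseAddedMs(int functionsInChain, long coldStartPenaltyMs) {
        return (long) functionsInChain * coldStartPenaltyMs;
    }
}
```

For a hypothetical chain of three functions, `worstCaseAddedMs(3, 250)` gives 750 and `worstCaseAddedMs(3, 500)` gives 1500, so the Java-specific penalty alone could add 0.75 to 1.5 seconds to an unlucky request.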
So I’d say that if you’re writing an application that has this performance requirement in production then perhaps steer clear of Java.
On the other hand if this performance requirement doesn’t hold, then don’t get scared by cold starts, but test to make sure.
Lambda is used in a variety of scenarios. Often it’s for “real” applications - things handling production data. But sometimes it’s also used as “glue”, e.g. to load test data, to deploy a particularly nuanced set of infrastructure resources, or as part of a monitoring flow, like publishing an alert to Slack.
If your team is already proficient in Java then most often the biggest hurdle - perceived, or real - to using Lambda with Java is the cold start problem. If this is the only thing stopping you then I have some suggestions of how you might be able to get the performance of Java and Lambda to work out:
Say you really want to use Java and Lambda, but cold starts are still painful even after you’ve made the above changes. Is there anything else you can do?
Yes, but now we get into the realm of “here be dragons”!
One possibility is not to use a regular JVM, but instead to use an “ahead-of-time compiler”, like GraalVM. GraalVM shifts some of the work that the JVM does at startup so that it happens at build time instead. To use GraalVM you can’t use the standard Lambda Java runtime; however, AWS do provide a demo of how to use it.
And one final option is to take cold starts out of the loop by using Lambda Provisioned Concurrency (PC). PC allows you to specify how many instances of your Lambda function you want at any given time. If you set a PC configuration then AWS will guarantee that that number of function instances have already been “cold started” before any traffic is sent to them, and therefore traffic is not subject to startup delay.
That sounds awesome - why didn’t I say so before? The problem is that PC also has significant downsides: it breaks the cost model of Lambda since you pay even when your functions aren’t active; Lambda functions with PC are slow to redeploy (several minutes or more); if your scale goes above your PC configuration you’re still subject to cold starts; and if you want to “auto scale” your PC setting then doing so requires some fairly complicated infrastructure. But, if you really, absolutely, want to remove cold start times then AWS gives you the option to do so.
I hope I’ve shown you in this article that despite what you may have heard to the contrary, Lambda and Java can work well together. I wouldn’t say that they’re “BFFs”, but perhaps more like “respectful co-workers” - great in a good number of industrial scenarios!
I’d summarize by saying that if you’re part of a team that’s experienced with Java and you want to experiment with Lambda to build applications, then start with Lambda and Java. Get the feel for what it’s like, and then decide later whether cold starts are going to be a concern. Usually they won’t be, especially for high-throughput applications, but if they are then I’ve given you some ideas here about how to bend (or break) the rules.