Java and AWS Lambda: Best of frenemies?

Guest post alert! Mike Roberts has been an engineer as well as a CTO. He is the co-author of this O’Reilly Book and is a partner at Symphonia. He blogs at https://blog.symphonia.io/.

AWS Lambda is the cloud service that introduced the concept of “serverless”: no machines or operating systems to manage; automatic and hassle-free scaling; costs that only accrue when it’s doing something useful; and more besides. Lambda was launched in 2014 - since then people in the know have come to realize that Lambda is a much more effective and relaxing way of running code in the cloud than many alternatives.

I first used Lambda in 2015. At the time I was managing a team that had a very large, very expensive Java application that processed a few tens of millions of messages per day. The problem was our application didn’t scale sufficiently, it was very slow to deploy, and was a massive headache when things went wrong. We needed to replace a critical piece of the company’s ecosystem, with only a skeleton crew of people. What were we to do?

We tried an experiment. Since we had little spare time for managing systems we decided to lean on Amazon. We needed a service to handle many messages, and a compute platform that could efficiently handle oscillating scaling needs. We chose Kinesis and Lambda, and because our team was one that used a variety of Java Virtual Machine (JVM) languages we chose Lambda’s brand new (at the time) Java support.

The result was a resounding success. Sure, Lambda had a few rough edges back then, but our legacy-app headache was solved. John Chapin, who architected the system, and I were so impressed by the capabilities of Amazon’s “low touch” services that we decided to start our own consulting business - Symphonia - where we could help others with this brave new world of software architecture.

It’s now more than 6 years on from our Lambda + Java experiment, and one aspect of it still causes controversy even among serverless converts - Java. Many folks consider Java an unsuitable language for Lambda, while others (like me) think it has its place in the Lambda universe. So should you use Java for your Lambda apps, or should you steer clear? In this article I hope to help you decide for yourself.

The “problems” with Java and Lambda

I hear many arguments against using Java with Lambda, but they basically come down to these three:

  1. Startup times are terrible
  2. No-one writes Java anymore
  3. Java is far too verbose for small Lambda functions

As someone who literally co-wrote the book on using Java and Lambda, these points are a little painful. Why did I spend so much time on so many words?

Perhaps before giving up hope it’s worth digging in a little.

The cold start freeze-out

When people say “Startup times are terrible” they are referring to Lambda “cold starts”. Understanding the details of cold starts requires an article, or perhaps even a book (ahem), but the short story is this: the Lambda platform occasionally “cold” starts new instances of a Lambda function when they are needed, and it does so just-in-time when events occur, rather than pre-emptively. This means that some events experience extra latency.

The problem with Java and cold starts is that it takes longer to start a Java process than (say) a JavaScript, Python, or Go process.

However, a couple of relevant questions at this point are “how much slower is Java?” and “does it matter?”. To answer the first of those I did some research a couple of years ago, and found that Java was “typically about a quarter to half a second slower”.

The second question, though, is much more subjective. Back in 2015 cold starts were a lot slower than they are today. However, in our Kinesis+Lambda application we were processing messages that were already at least a minute old, and we were fine with waiting a few more minutes for a message to be processed. To us, therefore, a cold start of even several seconds was fine.

Another example - I was working with a medium-sized social network company 3 years ago, and they were switching some of their Scala code (which runs on the JVM) to Lambda. They were very worried about cold starts since they generally wanted sub-second responses, but their cold starts in development took more than a second (not surprising, given the extra weight of running Scala and not just vanilla Java). However, it turned out that in production their Lambda functions were being triggered so frequently that their 99.99th percentile response times were adequate. The cold start times were the same - a second or two - but they were only happening once every 100,000 requests or so. For this team, even though a few Lambda calls were slow, their aggregate performance was perfectly acceptable.
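The arithmetic behind that last example is worth spelling out. Here is a minimal Java sketch, assuming (as a simplification of mine) that the only slow requests are cold starts and that a percentile is "affected" once cold starts make up more of the traffic than the tail beyond that percentile. The class and method names are hypothetical, purely for illustration.

```java
// Sketch only: a simplified model where cold starts are the only slow requests.
final class ColdStartTail {

    /** Fraction of requests that hit a cold start, e.g. 1 in 100,000. */
    static double coldStartFraction(long requestsPerColdStart) {
        return 1.0 / requestsPerColdStart;
    }

    /**
     * True when cold starts are frequent enough to show up at the given
     * latency percentile (e.g. 99.99), i.e. when they make up more of the
     * traffic than the tail that lies beyond that percentile.
     */
    static boolean affectsPercentile(double coldFraction, double percentile) {
        double tailFraction = 1.0 - percentile / 100.0;
        return coldFraction > tailFraction;
    }

    public static void main(String[] args) {
        double fraction = coldStartFraction(100_000); // the rate from the example above
        System.out.println("Affects p99.99?   " + affectsPercentile(fraction, 99.99));
        System.out.println("Affects p99.9999? " + affectsPercentile(fraction, 99.9999));
    }
}
```

With one cold start per 100,000 requests, cold starts are 0.001% of traffic, which is smaller than the 0.01% tail beyond the 99.99th percentile. They only become visible if you measure much further out into the tail.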

So yes, Java is slower than other languages at startup. But whether that matters depends on what you are building, and what your performance requirements are.

There are also some techniques that can help when you’re on the borderline, but I’ll get to those later.

Does anyone still write Java?

Look at most startups, or articles on new software tools, and you’d think that all software these days was written in JavaScript / TypeScript, Python, Go, or Rust. I have nothing against these languages, and actually, I’m a recent fan of TypeScript. But sometimes the fashionable parts of the internet don’t tell the whole story.

According to various reports, there are probably between 5 and 10 million Java developers in the world, and Java is still in the top 5 most-frequently-used programming languages. I find these numbers interesting, but I also have some skepticism about the surveys that produce them. Much more tangible to me is that AWS themselves are huge users of Java. Not only do they produce their own free distribution of Java, but they also use Java themselves as part of the Lambda service!

So yes, people are still writing Java - many of them, in fact. They just might not be as vocal about it as other people are about other languages.

Public static void main, XML, etc. etc. etc.

Many people who’ve written Java have memories of it being very verbose, at least in relation to more modern languages. This is partly because the Java language was stuck in the doldrums in the late 00s / early 2010s. But even today, even after significant language improvements in recent years, Java still has brevity concerns in comparison to other languages. For example:

  • Java is a statically typed language, whereas some popular alternatives, like JavaScript and Python, don’t have mandatory static typing
  • Some community “standards” have led to unfortunately convoluted naming conventions. If I never see another AbstractWidgetControllerInterfaceFactoryFactory I’ll be a happier human.
  • The de facto Java build tool (Maven) uses XML, which is even less pleasant to write in than YAML or JSON.

All of which adds up to an amount of “weight” that makes Java unwieldy for small Lambda functions.

However, a good number of Lambda functions aren’t actually small - you can do a lot in 15 minutes with 6 CPU cores and 10GB memory. And further, teams that are used to Java are also used to its shortcomings, and with good IDEs and tooling, are actually typically no worse off than other software teams.

And so while, yes, I would sympathize with someone coming from vanilla JavaScript to Java for the first time, I also think for many Java-experienced teams the extra verbosity is not a concern most of the time.

Rules of thumb for when to use Lambda with Java

Most of the code I write these days isn’t Java, but I still think Java is sometimes a good choice for Lambda apps. I especially think this when a Java-savvy team is thinking about switching to Lambda - I don’t think they need to learn a new platform and a new language at the same time.

But there are definitely some scenarios where Java is more suited than others. So to be more prescriptive, I recommend teams use Java where some or all of the following apply.

#1 - When a team is already writing Java

If a team is already writing a good amount of Java (or another JVM language like Scala, Clojure, or Kotlin) then they’re likely to be fine using their existing language skills in a Lambda environment. In this case “problems” 2 and 3 from my earlier list don’t apply (and I’ll get on to “problem” 1 in a moment).

A Java team may even be able to use their existing code, as well as their experience. Lambda applications are very simple from a code-requirement perspective - mostly because they have a very small interface with the Lambda platform (just implementing one method signature). Lambda applications do have different architectural requirements than traditional environments - for example, state management usually needs a re-think - but typically this still allows for re-use of business logic code.
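To give a feel for how small that interface is, here is a minimal sketch of a Lambda function written in plain Java. The class name, method names, and greeting logic are hypothetical, chosen only for illustration. It assumes the style of handler where Lambda is configured with a "Class::method" handler string and converts the incoming event to a String for you. Check the current AWS documentation for the handler styles your runtime version supports.

```java
// Hypothetical example: names and logic are illustrative, not from a real app.
// In a real deployment this class would be declared public in its own file.
class GreetingHandler {

    // Business logic with no Lambda-specific types, so existing code of
    // this shape can be reused largely as-is.
    static String greet(String name) {
        String who = (name == null || name.isBlank()) ? "world" : name.trim();
        return "Hello, " + who;
    }

    // The single entry point that the Lambda function would be configured
    // to call, for example with the handler string
    // "GreetingHandler::handleRequest".
    public String handleRequest(String input) {
        return greet(input);
    }
}
```

Everything above the entry point is ordinary Java, which is why business logic tends to survive a move to Lambda even when the surrounding architecture changes.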

Should a team use Lambda with Java if they are not already using Java? I’d normally say “no” since the language that they’re using itself is probably perfectly fine in Lambda. But there’s one place where I might say otherwise…

#2 - When throughput performance is important

There are various ways of measuring the performance of an application. We may care about latency - how quickly can the system respond - especially for UI-oriented applications. But in other applications - especially large, “back end”, message processing applications - throughput is far more important. The difference between being able to handle 100 million events per hour and 200 million might be the difference between finishing a job in time and missing the deadline.

In such contexts picking a technology that is sufficiently fast is important. Lambda is frequently excellent in these scenarios since it can scale very wide with little effort. But if each event itself requires complex processing the actual runtime performance might be something you care about too. In such scenarios Java is a compelling choice - modern JVMs are extremely quick, rivaling native code once they’ve warmed up. This is especially evident when comparing with interpreted languages like Python.

Because of this it’s useful to throw Java into the mix of possible languages when such performance is crucial - along with others like Go, Rust, etc. It’s likely that a team working on these kinds of problems already has experience with at least one of these languages, but if they’re only used to using Python (say) then I suggest that they may want to pick another language for such applications.

But since I’m talking about performance I should also talk about the other side of the coin…

#3 - When the impact of cold starts on latency is not important

Earlier on I described how Java is more susceptible to problems with the latency impact of cold starts. I also said that usually this isn’t important, either because the additional latency of a cold start doesn’t detrimentally affect the application, or because the latency is acceptable in aggregate.

However, there are times when this isn’t true. A good, and common, example is a low-throughput API called by a user interface. By “low throughput” here I mean 100 requests per hour, or less. In such a situation cold starts are going to occur much more frequently, on average, than in high-throughput apps, and so will be far more noticeable. At this point, the extra 250 - 500ms of cold start time for Java apps is going to start getting painful, especially if there is a “chain” of Lambda functions involved in satisfying a user request.
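To put rough numbers on the “chain” point, here is a small Java sketch. It assumes the worst case, where the functions in the chain are called one after another and every one of them cold starts on the same request. The class and method names are hypothetical, and the 250 - 500ms range is just the figure quoted above.

```java
// Sketch only: worst case where every function in a sequential chain is cold.
final class ColdStartChain {

    // Extra Java cold start time per function, from the range quoted above.
    static final int MIN_EXTRA_MS = 250;
    static final int MAX_EXTRA_MS = 500;

    /** Returns {lowest, highest} extra latency in ms for a chain of the given length. */
    static int[] worstCaseExtraMs(int functionsInChain) {
        return new int[] {
            functionsInChain * MIN_EXTRA_MS,
            functionsInChain * MAX_EXTRA_MS
        };
    }

    public static void main(String[] args) {
        int[] range = worstCaseExtraMs(3);
        System.out.println("Extra latency: " + range[0] + " - " + range[1] + " ms");
    }
}
```

For a chain of three functions that comes to an extra 750 - 1500ms, which is easily enough to break a sub-second response target.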

So I’d say that if you’re writing an application that has this performance requirement in production then perhaps steer clear of Java.

On the other hand if this performance requirement doesn’t hold, then don’t get scared by cold starts, but test to make sure.

#4 - When writing applications and not “glue” scripts

Lambda is used in a variety of scenarios. Often it’s for “real” applications - things handling production data. But sometimes it’s also used as “glue”, e.g. to load test data, to deploy a particularly nuanced set of infrastructure resources, or as part of a monitoring flow, like publishing an alert to Slack.

In these “glue” situations I typically recommend not using Java. The reason is that these small Lambda functions normally just consist of a single file, or script, and honestly the burden of a full Java tool suite is excessive. I find it’s more effective in such cases just to have a small JavaScript or Python script instead. Apart from anything else you can edit it in the AWS Web Console at a push - something you definitely can’t do with Java!

Some rules can be bent…

If your team is already proficient in Java then most often the biggest hurdle - perceived, or real - to using Lambda with Java is the cold start problem. If this is the only thing stopping you then I have some suggestions for how you might be able to get the performance of Java and Lambda to work out:

  • Make sure you’ve set the “Memory Size” configuration high enough. Lambda has one primary “performance dial”, named “Memory Size”. As well as memory, this setting also linearly adjusts how much CPU you have, which impacts how long your cold starts take. If your Memory Size is set to 256MB you’re going to have big cold start problems whatever you do with a Java Lambda function. As such I recommend you set Memory Size to at least 1769MB for latency-sensitive apps, which gives you one full CPU.
  • Reduce the amount of code in your function artifact. In my research on Lambda cold starts, I found that the biggest impact on the speed of a cold start was the size of the function artifact. In other words, the more code the JVM needs to load, the longer a cold start is going to take. Therefore, to reduce your cold start time, take code out of your artifact! The easiest way to do this is to build custom artifacts per Lambda function, and only have the code and libraries required for each function in each artifact. If necessary you can go further by using “tree shaking” tools which will analyze your code at build time to remove any code that’s not used. Tree shaking usually needs fine tuning though, as well as thorough testing.
  • Reduce what your code is doing at startup. A lot of what happens during startup is down to “the system” starting up - e.g. the JVM starting and loading your code. However, at cold start, your handler function’s class is also instantiated. If you’re using any frameworks as part of that instantiation then this is also going to slow down your cold start. I therefore recommend not using frameworks like Spring that make heavy use of reflection, since they can add multiple seconds to startup. If you still want to use a “dependency injection” framework you may want to consider Micronaut, which performs work at build time rather than at JVM startup.
  • Consider tuning the JVM settings. People who have been coding Java for a while know that the JVM has a seemingly infinite number of configuration options to tune performance. AWS picked a standard configuration of these settings that works “well enough” for most contexts, but it’s possible to pick your own. A recent article from Amazon’s Mark Sailes digs into when and how you may want to change these settings to improve cold starts.
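To make the Memory Size point concrete, here is a minimal Java sketch of the arithmetic. It assumes the simple linear relationship described above, with 1769MB corresponding to one full CPU, and it should be read as an approximation rather than a statement of exactly how Lambda allocates CPU. The class and method names are hypothetical.

```java
// Sketch only: approximates CPU share from Memory Size, assuming the
// linear relationship described in the list above.
final class MemorySizeCpu {

    // Memory Size (in MB) that corresponds to one full CPU.
    static final int MB_PER_FULL_CPU = 1769;

    /** Approximate number of CPUs available at the given Memory Size. */
    static double approximateCpuShare(int memorySizeMb) {
        return (double) memorySizeMb / MB_PER_FULL_CPU;
    }
}
```

On this estimate a function configured with 256MB gets roughly 0.14 of a CPU, about a seventh of what it would have at 1769MB. That is why a small Memory Size hurts a JVM so much during startup.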

…Other rules can be broken

Say you really want to use Java and Lambda, but cold starts are still painful even after you’ve made the above changes. Is there anything else you can do?

Yes, but now we get into the realm of “here be dragons”!

One possibility is not to use a regular JVM, but instead to use an “ahead-of-time compiler”, like GraalVM. GraalVM shifts some of the work that the JVM does at startup so that it happens at build time instead. To use GraalVM you can’t use the standard Lambda Java runtime, but AWS do provide a demo of how to use it.

And one final option is to take cold starts out of the loop by using Lambda Provisioned Concurrency (PC). PC allows you to specify how many instances of your Lambda function you want at any given time. If you set a PC configuration then AWS will guarantee that that number of function instances have already been “cold started” before any traffic is sent to them, and therefore traffic is not subject to startup delay. 

That sounds awesome - why didn’t I say so before? The problem is that PC also has significant downsides: it breaks the cost model of Lambda since you pay even when your functions aren’t active; Lambda functions with PC are slow to redeploy (several minutes or more); if your scale goes above your PC configuration you’re still subject to cold starts; and if you want to “auto scale” your PC setting then doing so requires some fairly complicated infrastructure. But, if you really, absolutely, want to remove cold start times then AWS gives you the option to do so.

Java and Lambda … friends?

I hope I’ve shown you in this article that despite what you may have heard to the contrary, Lambda and Java can work well together. I wouldn’t say that they’re “BFFs”, but perhaps more like “respectful co-workers” - great in a good number of industrial scenarios!

I’d summarize by saying that if you’re part of a team that’s experienced with Java and you want to experiment with Lambda to build applications, then start with Lambda and Java. Get the feel for what it’s like, and then decide later whether cold starts are going to be a concern. Usually they won’t be, especially for high-throughput applications, but if they are then I’ve given you some ideas here about how to bend (or break) the rules.
