Vadym Kazulkin | @VKazulkin |ip.labs GmbH
High performance Serverless Java on AWS
Vadym Kazulkin
ip.labs GmbH Bonn, Germany
Co-Organizer of the Java User Group Bonn
v.kazulkin@gmail.com
@VKazulkin
https://dev.to/vkazulkin
https://github.com/Vadym79/
https://de.slideshare.net/VadymKazulkin/
https://www.linkedin.com/in/vadymkazulkin
https://www.iplabs.de/
Contact
Java Popularity
https://distantjob.com/blog/programming-languages-rank/
Life of the Java (Serverless) Developer on AWS
AWS Java Versions Support for AWS Lambda
▪ Corretto Java 8
▪ With extended long-term support until 2026
▪ Corretto Java 11 (since 2019)
▪ Corretto Java 17 (since April 2023)
▪ Corretto Java 21 (since November 2023)
▪ Waiting for the support of Java 25
▪ Only Long Term Support (LTS) versions are supported by AWS
Java is a very fast and mature programming language…
… but serverless adoption of Java looks like this!
Percent of AWS Lambda Invocations by Language, 2021 vs 2023
https://www.datadoghq.com/state-of-serverless-2021
https://www.datadoghq.com/state-of-serverless/
PYTHON IS THE MOST POPULAR LAMBDA RUNTIME
Developers love Java and will be happy to use it for Serverless applications.
But what are the challenges?
Serverless with Java Challenges
▪ “Cold start” times (latencies)
▪ Memory footprint (high cost in AWS)
Demo Application
https://github.com/Vadym79/AWSLambdaJavaSnapStart
https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format
API Gateway Proxy Request Event JSON
AWS Lambda Function with Java runtime
Challenge No. 1
A Big Cold-Start
Lambda function lifecycle – a full cold start
Sources: Ajay Nair "Become a Serverless Black Belt" https://www.youtube.com/watch?v=oQFORsso2go
Tomasz Łakomy "Notes from Optimizing Lambda Performance for Your Serverless Applications" https://tlakomy.com/optimizing-lambda-performance-for-serverless-applications
New Lambda function execution environment required / Lambda function cold starts
▪ When the Lambda function is invoked for the first time
▪ After a new Lambda function was deployed
▪ After the existing Lambda function code was modified and re-deployed
▪ When there are not enough warm execution environments in the pool
▪ More concurrent Lambda invocation requests than execution environments in the pool
▪ When the execution environment was destroyed by AWS
▪ For cost-saving reasons, because the execution environment wasn’t in use for a long time
▪ For security and other reasons, to patch the execution environment(s)
Lambda function lifecycle
▪ Start Firecracker VM (execution environment)
▪ AWS Lambda starts the Java runtime
▪ Java runtime loads and initializes Lambda function code (Lambda handler Java class)
▪ Class loading
▪ Static initializer block of the handler class is executed (e.g. AWS service client creation)
▪ Runtime dependency injection
▪ Just-in-Time (JIT) compilation
▪ Lambda invokes the handler method
Sources: Ajay Nair "Become a Serverless Black Belt" https://www.youtube.com/watch?v=oQFORsso2go
Tomasz Łakomy "Notes from Optimizing Lambda Performance for Your Serverless Applications" https://tlakomy.com/optimizing-lambda-performance-for-serverless-applications
Michael Hart "Shave 99.93% off your Lambda bill with this one weird trick" https://hichaelmart.medium.com/shave-99-93-off-your-lambda-bill-with-this-one-weird-trick-33c0acebb2ea
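The split between one-time initialization and per-invocation work can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the demo application's code: the class name LifecycleSketchHandler, the counter and the string placeholder are hypothetical, and a real handler would implement RequestHandler from aws-lambda-java-core.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a Lambda handler class (illustration only).
class LifecycleSketchHandler {
    // Counts how often the one-time initialization ran.
    static final AtomicInteger INIT_COUNT = new AtomicInteger();
    // Placeholder for an expensive object such as an AWS service client.
    static final String EXPENSIVE_CLIENT;

    static {
        // Executed once, when the class is initialized in a new execution
        // environment. This work is part of the cold start.
        INIT_COUNT.incrementAndGet();
        EXPENSIVE_CLIENT = "client-created-during-init";
    }

    // Executed on every invocation. On a warm start only this method runs.
    String handleRequest(String productId) {
        return EXPENSIVE_CLIENT + ":" + productId;
    }
}
```

Calling handleRequest repeatedly in the same JVM leaves INIT_COUNT at 1, which is why moving expensive setup into the static initializer helps warm starts while making the cold start longer.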
AWS Lambda Function with Java runtime
Invocation of the handleRequest method is the warm start
Demo Application
https://github.com/Vadym79/AWSLambdaJavaSnapStart
▪ Lambda has 1024 MB memory setting
▪ Lambda uses x86 architecture
▪ Default (Apache) HTTP client for communication with DynamoDB
▪ 14 MB artifact size, all dependencies in the POM file
▪ Java compilation option -XX:+TieredCompilation -XX:TieredStopAtLevel=1
▪ Info about the experiments:
▪ Approx. 1 hour duration
▪ Approx. first* 100 cold starts
▪ Approx. first 100,000 warm starts
*after the Lambda function was re-deployed
Cold and warm starts with Java 21
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
Measurements in ms                   p50    p75    p90    p99    p99.9  max
Amazon Corretto Java 21 cold start   3158   3214   3270   3428   3601   3725
Amazon Corretto Java 21 warm start   5.77   6.50   7.81   20.65  90.20  1423.63
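To read the percentile columns above, it may help to see how a percentile can be derived from raw latency samples. The sketch below uses the nearest-rank method; this is an assumption for illustration, since the slides do not state how the published percentiles were computed, and the class name LatencyPercentiles is hypothetical.

```java
import java.util.Arrays;

class LatencyPercentiles {
    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static double percentile(double[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        double[] sorted = samplesMs.clone();   // keep the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];  // ranks are 1-based
    }
}
```

With only about 100 cold starts per experiment, the high percentiles (p99, p99.9) are determined by one or two samples, so they are best read as rough indicators.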
Options To Reduce Cold Start Time
▪ AWS SnapStart
▪ GraalVM (Native Image)
AWS Lambda SnapStart
AWS Lambda SnapStart
▪ Lambda SnapStart for Java can improve startup performance for latency-sensitive applications
▪ SnapStart is fully managed
https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
AWS Lambda SnapStart
▪ Currently available for Lambda managed Java runtimes (Java 11, 17 and 21), Python and .NET
▪ Not available for all other Lambda runtimes:
▪ Docker Container Image
▪ Custom (Lambda) Runtime (a way to ship GraalVM Native Image)
https://github.com/Vadym79/AWSLambdaJavaDockerImage/
AWS SnapStart Deployment & Invocation
https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/
[Diagram: Create Snapshot; Firecracker microVM create & restore snapshot]
AWS SnapStart Deployment & Invocation
https://dev.to/vkazulkin/measuring-java-11-lambda-cold-starts-with-snapstart-part-1-first-impressions-30a4
https://aws.amazon.com/de/blogs/compute/using-aws-lambda-snapstart-with-infrastructure-as-code-and-ci-cd-pipelines/
AWS Lambda SnapStart with Priming
Ideas behind priming
▪ Pre-load as many Java classes as possible before SnapStart takes the snapshot
▪ Java loads classes on demand (lazy loading)
▪ Pre-initialize as much as possible before SnapStart takes the snapshot
▪ HTTP clients (Apache, UrlConnection) and JSON marshallers (Jackson) require expensive one-time initialization per (Lambda) lifecycle. Both are used when creating the Amazon DynamoDbClient
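The first idea, pre-loading classes that the JVM would otherwise load lazily, can be sketched with the standard Class.forName call. This is a minimal sketch under assumptions: the class name ClassPreloader and the choice of which classes to list are hypothetical. In a SnapStart function such a method would typically be called during initialization or from a CRaC beforeCheckpoint hook, so that the loaded classes end up in the snapshot.

```java
import java.util.List;

class ClassPreloader {
    // Loads and initializes each named class. Returns how many were found.
    static int preload(List<String> classNames) {
        int loaded = 0;
        ClassLoader loader = ClassPreloader.class.getClassLoader();
        for (String name : classNames) {
            try {
                // "true" also runs static initializers, not just class loading.
                Class.forName(name, true, loader);
                loaded++;
            } catch (ClassNotFoundException | LinkageError e) {
                // Priming is best effort: a missing or broken class must not
                // fail the initialization of the function.
            }
        }
        return loaded;
    }
}
```

A usage example would be ClassPreloader.preload(List.of("java.util.ArrayList")); the list of class names for a real function has to be determined per application, for example from class-loading logs.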
AWS SnapStart Deployment & Invocation
https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/
Lambda uses the CRaC APIs for runtime hooks for Priming
[Diagram: Create Snapshot; Firecracker microVM create & restore snapshot]
Priming
▪ Prime dependencies during the initialization phase (when it is worth doing)
▪ “Fake” the calls to pre-initialize “some other expensive stuff” or execute some critical code paths (this technique is called priming)
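The "fake call" idea can be sketched without any AWS dependency. Everything named here is hypothetical: SnapshotHook is a simplified stand-in for the org.crac.Resource interface (whose real methods also take a Context argument), and PrimedProductDao only simulates the expensive first call with a lazily created map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for org.crac.Resource.
interface SnapshotHook {
    void beforeCheckpoint() throws Exception;
    void afterRestore() throws Exception;
}

// Hypothetical DAO whose first call triggers expensive one-time setup.
class PrimedProductDao implements SnapshotHook {
    // Stands in for the HTTP client and JSON marshaller of a real client.
    private Map<String, String> connection;
    int expensiveInitCount = 0;

    String getProduct(String id) {
        if (connection == null) {   // lazy, expensive one-time initialization
            connection = new HashMap<>();
            expensiveInitCount++;
        }
        return connection.getOrDefault(id, "not-found");
    }

    @Override
    public void beforeCheckpoint() {
        // Priming: a "fake" lookup whose result is discarded. It only exists
        // to run the initialization path before the snapshot is taken.
        getProduct("0");
    }

    @Override
    public void afterRestore() {
        // Nothing to re-establish in this sketch.
    }
}
```

After beforeCheckpoint has run, the first real getProduct call no longer pays the initialization cost, which mirrors the effect priming aims for after a snapshot restore.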
Product Repository DAO class
Expensive initialization of the HTTP Client
Expensive initialization of the Jackson Marshaller (ObjectMapper)
SnapStart Enabled with DynamoDB Request Priming
https://dev.to/aws-builders/measuring-java-11-lambda-cold-starts-with-snapstart-part-5-priming-end-to-end-latency-and-deployment-time-jem
Lambda SnapStart Priming Guide
▪ The SnapStart Priming Guide aims to explain techniques for priming Java applications.
▪ It assumes a base understanding of AWS Lambda, Lambda SnapStart, and CRaC.
https://github.com/marksailes/snapstart-priming-guide
[Chart: Cold starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache HTTP client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Series: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Percentiles: p50, p75, p90, p99, p99.9, max. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
[Chart: Warm starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache HTTP client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Series: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Percentiles: p50, p75, p90, p99. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
[Chart: Warm starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache HTTP client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Series: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Percentiles: p99.9, max. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
AWS Lambda Function with Java runtime
Product Repository DAO class
Demo Application
https://github.com/Vadym79/AWSLambdaJavaSnapStart
▪ Lambda has 1024 MB memory setting
▪ Lambda uses x86 architecture
▪ Default (Apache) HTTP client for communication with DynamoDB
▪ 18 MB artifact size, all dependencies in the POM file
▪ Java compilation option -XX:+TieredCompilation -XX:TieredStopAtLevel=1
▪ Info about the experiments:
▪ Approx. 1 hour duration
▪ Approx. first* 100 cold starts
▪ Approx. first 100,000 warm starts
*after the Lambda function was re-deployed
Lambda Performance Tuning Approaches
Experiment with:
▪ Lambda memory settings
▪ Java compilation options
▪ HTTP client implementations (sync and async)
▪ Lambda architecture (x86 vs arm64)
▪ Lambda SnapStart (with priming techniques)
to find the right trade-off between Lambda cost and performance for your particular use case
https://aws.amazon.com/de/blogs/developer/preview-release-of-theaws-sdk-java-2-x-http-client-built-on-apache-httpclient-5-5-x/
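The cost side of this trade-off can be estimated with simple arithmetic, because Lambda compute charges scale with configured memory multiplied by billed duration (GB-seconds). The sketch below is an illustration only: the class name LambdaCostSketch is hypothetical, the price per GB-second is a parameter you have to look up for your region and architecture, and per-request charges are left out.

```java
class LambdaCostSketch {
    // GB-seconds consumed by a number of invocations with the same
    // memory setting and billed duration.
    static double gbSeconds(int memoryMb, double billedDurationMs, long invocations) {
        return (memoryMb / 1024.0) * (billedDurationMs / 1000.0) * invocations;
    }

    // Compute cost for the given price per GB-second (caller-supplied;
    // not hard-coded here because it differs by region and architecture).
    static double computeCost(int memoryMb, double billedDurationMs,
                              long invocations, double pricePerGbSecond) {
        return gbSeconds(memoryMb, billedDurationMs, invocations) * pricePerGbSecond;
    }
}
```

Doubling the memory setting doubles the GB-seconds per millisecond, so it only pays off when the higher setting reduces the duration by a comparable factor. This is why the slide suggests measuring several settings.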
Preview Release of the AWS SDK Java 2.x HTTP Client built on Apache HttpClient 5.5.x
https://aws.amazon.com/de/blogs/developer/preview-release-of-theaws-sdk-java-2-x-http-client-built-on-apache-httpclient-5-5-x/
Lambda Deployment Artifact Size
[Chart: Cold starts (p90) of Lambda function with Java 21 runtime using different deployment artifact sizes. Groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Series: p90 small, p90 medium, p90 big. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-11-measuring-cold-starts-with-java-21-using-different-deployment-artifact-sizes-4g29
▪ Small: 137 KB (“Hello World”)
▪ Medium: 14 MB (our sample application)
▪ Big: 50 MB (our sample application + additional dependencies on other AWS services)
Best Practices & Recommendations
▪ Less (dependencies, classes) is more
▪ Include only required dependencies (e.g. not the whole AWS SDK 2.0 for Java, but only the dependencies for the clients used in Lambda)
▪ Exclude dependencies which you don’t need at runtime, e.g. test frameworks like JUnit
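<!-- Test-only dependency: scope "test" keeps JUnit out of the deployment artifact -->
<dependency>
  <groupId>org.junit.jupiter</groupId>
  <artifactId>junit-jupiter-api</artifactId>
  <version>5.4.2</version>
  <scope>test</scope>
</dependency>
<!-- Only the DynamoDB client module instead of the whole AWS SDK -->
<dependency>
  <groupId>software.amazon.awssdk</groupId>
  <artifactId>dynamodb</artifactId>
  <version>2.22.2</version>
</dependency>
<!-- AWS SDK BOM: an import-scoped entry belongs inside <dependencyManagement> -->
<dependency>
  <groupId>software.amazon.awssdk</groupId>
  <artifactId>bom</artifactId>
  <version>2.22.2</version>
  <type>pom</type>
  <scope>import</scope>
</dependency>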
AWS SnapStart Deployment & Invocation
https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/
• Lambda stores function snapshots in Amazon S3, dividing them into 512 KB chunks to optimize retrieval latency.
• Retrieval latency from Amazon S3 can take up to hundreds of milliseconds for each 512 KB chunk.
• Therefore, Lambda uses a two-layer cache to speed up snapshot retrieval.
Storing snapshots for low-latency retrieval at Lambda scale
https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
▪ Lambda also maintains a layer one (L1) cache located on Lambda worker nodes, the (Amazon EC2) instances handling function invocations.
▪ This layer is available locally, thus it provides the fastest performance, typically 1 millisecond for a 512 KB chunk.
▪ Functions with more frequent invocations are more likely to have their snapshot chunks cached in this layer.
▪ Functions with fewer invocations are automatically evicted from this cache, because it is bound by the worker instance disk capacity.
▪ When a snapshot chunk is not available in the L1 cache, Lambda retrieves the chunk from the L2 cache layer and, if it is not available there, from S3.
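The lookup order described above (L1, then L2, then S3) can be sketched as a small read-through cache. This is a conceptual model only and not how Lambda is implemented: the class name TieredChunkStore is hypothetical, in-memory maps stand in for the three layers, and copying a chunk into the faster layers after a miss is an assumption made for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

class TieredChunkStore {
    final Map<Integer, byte[]> l1 = new HashMap<>();  // worker-local cache (fastest)
    final Map<Integer, byte[]> l2 = new HashMap<>();  // second cache layer
    final Map<Integer, byte[]> s3 = new HashMap<>();  // stand-in for Amazon S3

    // Returns the name of the layer that served the chunk.
    String fetch(int chunkId) {
        if (l1.containsKey(chunkId)) {
            return "L1";
        }
        if (l2.containsKey(chunkId)) {
            l1.put(chunkId, l2.get(chunkId));  // assumed: promote to L1 on a hit
            return "L2";
        }
        byte[] chunk = s3.get(chunkId);
        if (chunk == null) {
            throw new IllegalArgumentException("unknown chunk " + chunkId);
        }
        l2.put(chunkId, chunk);                // assumed: fill both cache layers
        l1.put(chunkId, chunk);
        return "S3";
    }
}
```

In this model the first fetch of a chunk is served from "S3" and the second from "L1", which matches the observation that frequently invoked functions see faster snapshot retrieval.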
Storing snapshots for low-latency retrieval at Lambda scale
https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
▪ Resuming execution from snapshots with low latency is the final SnapStart stage. This involves loading the retrieved snapshot chunks into your function execution environment.
▪ Typically, only a subset of the retrieved snapshot is needed to serve an invocation. Storing snapshots as chunks lets Lambda optimize the resume process by proactively loading only the necessary subset of chunks.
▪ To achieve this, Lambda tracks and records the snapshot chunks that the function accesses during each function invocation.
Storing snapshots for low-latency retrieval at Lambda scale
https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
▪ After the first function invocation, Lambda refers to this recorded chunk access data for subsequent invokes.
▪ Lambda proactively retrieves and loads this “working set” of chunks before they are needed for execution. This significantly speeds up cold-start latency.
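The "working set" idea can be illustrated with a few lines of Java. This is a conceptual sketch, not Lambda's implementation: the class name WorkingSetRecorder and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class WorkingSetRecorder {
    // Chunk IDs in first-access order, without duplicates.
    private final Set<Integer> accessed = new LinkedHashSet<>();

    // Called whenever the function touches a snapshot chunk.
    void recordAccess(int chunkId) {
        accessed.add(chunkId);
    }

    // The chunks to load proactively before the next restore needs them.
    List<Integer> prefetchPlan() {
        return new ArrayList<>(accessed);
    }
}
```

The more invocations have been recorded, the more complete the plan becomes, which fits the statement that chunk access data for frequently invoked functions is more likely to be complete.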
SnapStart function performance
▪ The speed of restoring a snapshot depends on its contents, size, and the caching tier used. As a result, SnapStart performance can vary across individual functions.
▪ Frequently invoked functions are more likely to have their snapshots cached in the L1 layer, which provides the fastest retrieval latency.
▪ Infrequently accessed portions of snapshots for functions with sporadic invokes are less likely to be present in the L1 layer, resulting in slower retrieval latency from the L2 and S3 cache layers.
▪ Chunk access data for functions with more invocations is also more likely to be “complete”, which speeds up snapshot restore latency.
https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
AWS SnapStart tiered cache
https://dev.to/aws-builders/aws-snapstart-part-17-impact-of-the-snapshot-tiered-cache-on-the-cold-starts-with-java-21-52ef
• Due to the effect of the snapshot tiered cache, cold start times decrease with the number of invocations
• After a certain number of invocations is reached, the cold start times become stable
AWS Lambda under the Hood
https://www.infoq.com/articles/aws-lambda-under-the-hood/
https://www.infoq.com/presentations/aws-lambda-arch/
[Chart: Comparison between all approx. 100 vs. the last 70 cold starts of the Lambda function. Series: with SnapStart w/o Priming, with SnapStart w/o Priming (last 70), with SnapStart with Priming, with SnapStart with Priming (last 70). Percentiles: p50, p75, p90, p99, p99.9, max. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-17-impact-of-the-snapshot-tiered-cache-on-the-cold-starts-with-java-21-52ef
Due to the effect of the snapshot tiered cache, cold start times decrease with the number of invocations
AWS Lambda Profiler Extension for Java
https://github.com/aws/aws-lambda-java-libs/tree/main/experimental/aws-lambda-java-profiler
• The Lambda profiler extension allows you to profile your Java functions invoke by invoke, with high fidelity, and no code changes.
• It uses the async-profiler project to produce profiling data and automatically uploads the data as HTML flame graphs to S3.
AWS Lambda Implementation
https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format
API Gateway Proxy Request Event JSON
AWS Lambda Profiler Extension for Java
https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format
https://github.com/aws/aws-lambda-java-libs/tree/main/experimental/aws-lambda-java-profiler
Full Priming including APIGatewayProxyRequestEvent Deserialization
https://dev.to/aws-heroes/aws-lambda-profiler-extension-for-java-part-2-improving-lambda-performance-with-lambda-snapstart-4p06
This priming technique leads to an up to 25% reduction of the cold start times vs. DynamoDB request priming alone
AWS SnapStart Pricing
https://aws.amazon.com/lambda/pricing/?nc1=h_ls
AWS SnapStart Challenges around uniqueness
▪ Avoid saving state that depends on uniqueness during initialization
▪ Avoid UUID uniqueSandboxId = UUID.randomUUID() or long envCreationTime = System.currentTimeMillis() in the Lambda constructor
▪ Use cryptographically secure pseudorandom number generators
▪ Software that always gets random numbers from /dev/random or /dev/urandom also maintains randomness with SnapStart.
▪ Use java.security.SecureRandom instead of new Random()
▪ Avoid logic relying on time-based caches
https://docs.aws.amazon.com/lambda/latest/dg/snapstart-uniqueness.html
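A minimal sketch of the uniqueness advice, assuming a hypothetical handler class named UniquenessSketchHandler: values that must be unique are created inside the handler method, not during initialization, because anything created before the snapshot is shared by every restored execution environment.

```java
import java.security.SecureRandom;
import java.util.UUID;

class UniquenessSketchHandler {
    // Avoid: a value created during initialization is captured in the
    // snapshot and would be identical after every restore.
    // static final UUID SANDBOX_ID = UUID.randomUUID();

    // SecureRandom instead of new Random(), as recommended above.
    private static final SecureRandom RANDOM = new SecureRandom();

    // Unique values are generated per invocation, after the restore.
    String handleRequest() {
        UUID requestId = UUID.randomUUID();
        long nonce = RANDOM.nextLong();
        return requestId + ":" + Long.toHexString(nonce);
    }
}
```

The same reasoning applies to timestamps: a System.currentTimeMillis() value stored in a field during initialization reflects the snapshot time, not the time the environment was restored.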
AWS SnapStart Challenges & Limitations
▪ SnapStart supports only the Java 11, 17 and 21 (Corretto), Python and .NET managed runtimes
▪ Deployment with SnapStart enabled takes more than 2 to 2.5 additional minutes
▪ Snapshot is deleted from cache if the Lambda function is not invoked for 14 days
▪ SnapStart currently does not support:
▪ Provisioned concurrency
▪ Amazon Elastic File System (Amazon EFS)
▪ Ephemeral storage greater than 512 MB
https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
GraalVM Architecture
GraalVM Ahead-of-Time Compilation
Source: Oleg Šelajev, Thomas Wuerthinger, Oracle: “Deep dive into using GraalVM for Java and JavaScript”
https://www.youtube.com/watch?v=a-XEZobXspo
AOT vs JIT
Source: „Everything you need to know about GraalVM by Oleg Šelajev & Thomas Wuerthinger” https://www.youtube.com/watch?v=ANN9rxYo5Hg
GraalVM Native Image
Promise: a Java function compiled into a native executable using GraalVM Native Image significantly reduces
▪ “cold start” times
▪ memory footprint
Current Challenges with Native Executable using GraalVM
▪ AWS doesn’t provide GraalVM (Native Image) as a Java runtime out of the box
▪ AWS provides the Custom Runtime option
Custom Lambda Runtimes
https://github.com/Vadym79/AWSLambdaGraalVMNativeImage
[Chart: comparison of w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, and GraalVM 23 Native Image. Percentiles: p50, p75, p90, p99, p99.9, max. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
[Chart: comparison of w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, and GraalVM 23 Native Image. Percentiles: p50, p75, p90, p99. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
[Chart: comparison of w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, and GraalVM 23 Native Image. Percentiles: p99.9, max. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
Frameworks and libraries Ready for GraalVM Native Image
https://www.graalvm.org/native-image/libraries-and-frameworks/
GraalVM Native Image
https://github.com/Vadym79/AWSLambdaGraalVMNativeImage/blob/master/pure-lambda-graalvm-jdk-21-native-image/src/main/reflect.json
You can run into runtime errors (ClassNotFoundException) when configuration is missing
Logging configuration in particular is complex in GraalVM Native Image
Log4j natively supports GraalVM Native Image since 2.25.0
https://logging.staged.apache.org/log4j/2.x/graalvm.html
Assisted Configuration with GraalVM Tracing Agent
https://www.graalvm.org/latest/reference-manual/native-image/metadata/AutomaticMetadataCollection/
https://www.graalvm.org/latest/reference-manual/native-image/guides/configure-with-tracing-agent/
Run the GraalVM tracing agent during the execution of your tests
GraalVM Conclusion
▪ GraalVM is really powerful and has a lot of potential
▪ GraalVM Native Image improves cold starts and memory footprint significantly
▪ GraalVM Native Image is currently not without challenges
▪ Complex GraalVM Native Image configuration files
▪ AWS Lambda Custom Runtime requires a Linux executable
▪ Building a Custom Runtime requires some additional effort
▪ e.g. you need a scalable CI/CD pipeline to build the memory-intensive native image
▪ Build time is a factor
▪ You need to test carefully to avoid runtime errors
Wrap up and personal suggestions
▪ With AWS SnapStart and GraalVM Native Image you can reduce cold start times of AWS Lambda with the Java 21 runtime to acceptable values
▪ If you’re willing to accept slightly higher cold and warm start times for certain Lambda function(s) and solid priming is applicable -> use fully managed AWS SnapStart with priming
▪ If very high performance for certain Lambda function(s) is really crucial for your business -> go for GraalVM Native Image
Powertools for AWS Lambda (Java) v2
https://docs.powertools.aws.dev/lambda/java/2.4.0/ https://github.com/Vadym79/AWSPowertoolsForLambdaJavaV2
The Future of GraalVM
https://blogs.oracle.com/java/post/detaching-graalvm-from-the-java-ecosystem-train
Project Leyden
The primary goal of this Project is to improve the startup time, time to peak performance, and footprint of Java programs.
https://www.youtube.com/watch?v=teXijm79vno
https://openjdk.org/projects/leyden/
Word of Caution
Re-measure for your use case!
Even my own examples might already produce different measurement results due to:
▪ Lambda Amazon Corretto Java 21 managed runtime minor version changes
▪ Lambda SnapStart snapshot create and restore improvements
▪ Firecracker microVM improvements
▪ GraalVM (major and minor version) and Native Image improvements
▪ There are still servers behind Lambda
▪ Java Memory Model impact (L or RAM cache hits and misses)
▪ Upgrading dependencies (AWS SDK for Java) tends to make them bigger, increasing the cold start time
“AWS Lambda SnapStart” series
https://dev.to/vkazulkin/series/24979
The article series covers the why and what behind Lambda SnapStart and priming techniques, including measurements for the cold and warm starts with different settings for:
▪ Java 11
▪ Java 17
▪ Java 21
“Spring Boot 3.4 / Quarkus 3 / Micronaut 4 application on AWS Lambda” series
The article series covers different ways to write, run and optimize Spring Boot 3.4 / Quarkus 3 / Micronaut 4 applications on AWS Lambda using:
▪ Managed Java 21 Lambda runtime + SnapStart + priming
▪ GraalVM Native Image
Cold and warm start time measurements are also provided
https://dev.to/vkazulkin/series/30408 https://dev.to/vkazulkin/series/31519 https://dev.to/vkazulkin/series/26067
“Data API for Amazon Aurora Serverless v2 with AWS SDK for Java” series
The article series covers pure Java 21 cold and warm start time measurements and optimization techniques for the Amazon Aurora Serverless v2 database with JDBC and Data API
https://dev.to/vkazulkin/series/26067
“Serverless applications with Java and Aurora DSQL” series
https://dev.to/vkazulkin/series/32326
The article series covers pure Java 21 cold and warm start time measurements and optimization techniques (SnapStart + priming vs GraalVM Native Image) for the Amazon Aurora DSQL database
Thank you

Practical Performance Tuning for Serverless Java on AWS - InfoQ Dev Summit

  • 1.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH 1 High performance Serverless Java on AWS
  • 2.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Vadym Kazulkin ip.labs GmbH Bonn, Germany Co-Organizer of the Java User Group Bonn v.kazulkin@gmail.com @VKazulkin https://dev.to/vkazulkin https://github.com/Vadym79/ https://de.slideshare.net/VadymKazulkin/ https://www.linkedin.com/in/vadymkazulkin https://www.iplabs.de/ Contact
  • 3.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Java Popularity Vadym Kazulkin | @VKazulkin | ip.labs GmbH
  • 4.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH https://distantjob.com/blog/programming-languages-rank/
  • 5.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Life of the Java (Serverless) Developer on AWS 6
  • 6.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH ▪ Corretto Java 8 ▪ With extended long-term support until 2026 ▪ Coretto Java 11 (since 2019) ▪ Coretto Java 17 (since April 2023) ▪ Corretto Java 21(since November 2023) ▪ Waiting for the support of Java 25 ▪ Only Long Term Support (LTS) by AWS AWS Java Versions Support for AWS Lambda 7
  • 7.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH … but serverless adoption of Java looks like this! 8 Java is a very fast and mature programming language…
  • 8.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Percent of AWS Lambda Invocations by Language 2021 vs 2023 https://www.datadoghq.com/state-of-serverless-2021 https://www.datadoghq.com/state-of-serverless/ PHYTON IS THE MOST POPULAR LAMDA RUNTIME
  • 9.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Developers love Java and will be happy to use it for Serverless applications But what are the challenges ? 10
  • 10.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH ▪ “cold start” times (latencies) ▪ memory footprint (high cost in AWS) Serverless with Java Challenges
  • 11.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Demo Application https://github.com/Vadym79/AWSLambdaJavaSnapStart
  • 12.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format API Gateway Proxy Request Event JSON
  • 13.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH AWS Lambda Function with Java runtime 14
  • 14.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Challenge No. 1 A Big Cold-Start
  • 15.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Lambda function lifecycle – a full cold start 16 Sources: Ajay Nair „Become a Serverless Black Belt” https://www.youtube.com/watch?v=oQFORsso2go Tomasz Łakomy "Notes from Optimizing Lambda Performance for Your Serverless Applications“ https://tlakomy.com/optimizing-lambda-performance-for-serverless-applications
  • 16.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH ▪ When Lambda function has been invoked for the first time ▪ After a new Lambda function was deployed ▪ After the existing Lambda function code was modified and re-deployed ▪ When there are not enough warm execution environments in the pool ▪ More concurrent Lambda invocation requests as execution environments in the pool ▪ When the execution environment was destroyed by AWS ▪ For cost saving reasons as the execution environment wasn’t in use for a long time ▪ For security and other reasons to patch the execution environment(s) New Lambda function execution environment required/Lambda function cold starts
  • 17.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH ▪ Start Firecracker VM (execution environment) ▪ AWS Lambda starts the Java runtime ▪ Java runtime loads and initializes Lambda function code (Lambda handler Java class) ▪ Class loading ▪ Static initializer block of the handler class is executed (i.e. AWS service client creation) ▪ Runtime dependency injection ▪ Just-in-Time (JIT) compilation ▪ Lambda invokes the handler method 18 Sources: Ajay Nair „Become a Serverless Black Belt” https://www.youtube.com/watch?v=oQFORsso2go Tomasz Łakomy "Notes from Optimizing Lambda Performance for Your Serverless Applications“ https://tlakomy.com/optimizing-lambda-performance-for-serverless-applications Michael Hart: „Shave 99.93% off your Lambda bill with this one weird trick“ https://hichaelmart.medium.com/shave-99-93-off-your-lambda-bill-with-this-one-weird-trick-33c0acebb2ea Lambda function lifecycle
  • 18.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH AWS Lambda Function with Java runtime 19 Invocation of the handeRequest method is the warm start
  • 19.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Demo Application https://github.com/Vadym79/AWSLambdaJavaSnapStart ▪ Lambda has 1024 MB memory setting ▪ Lambda uses x86 architecture ▪ Default (Apache) Http Client for communication with DynamoDB ▪ 14 MB artifact size, , all dependencies in the POM file ▪ Java compilation option - XX:+TieredCompilation - XX:TieredStopAtLevel=1 ▪ Info about the experiments: ▪ Approx. 1 hour duration ▪ Approx. first* 100 cold starts ▪ Approx. first 100.000 warm starts *after Lambda function being re-deployed
  • 20.
    Cold and warm starts with Java 21
    Measurements in ms                    p50    p75    p90    p99    p99.9  max
    Amazon Corretto Java 21 cold start    3158   3214   3270   3428   3601   3725
    Amazon Corretto Java 21 warm start    5.77   6.50   7.81   20.65  90.20  1423.63
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
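    For readers who want to reproduce such a table from their own raw latency samples, here is a minimal sketch of a percentile calculation. It uses the nearest-rank definition, which is an assumption: the slides do not say how the percentiles were computed, and other tools may interpolate. The sample values in main are made up.

```java
import java.util.Arrays;

/** Hypothetical helper for summarizing latency samples (nearest-rank percentiles). */
final class LatencyPercentiles {
    /** Smallest sample such that at least p percent of all samples are less than or equal to it. */
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) throw new IllegalArgumentException("no samples");
        if (p <= 0 || p > 100) throw new IllegalArgumentException("p must be in (0, 100]");
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        double[] warmStartsMs = {120, 95, 110, 300, 105}; // made-up values
        for (double p : new double[] {50, 90, 100}) {
            System.out.println("p" + p + " = " + percentile(warmStartsMs, p) + " ms");
        }
    }
}
```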
  • 21.
    Options To Reduce Cold Start Time
    ▪ AWS SnapStart
    ▪ GraalVM (Native Image)
  • 22.
    AWS Lambda SnapStart
  • 23.
    AWS Lambda SnapStart
    ▪ Lambda SnapStart for Java can improve startup performance for latency-sensitive applications
    ▪ SnapStart is fully managed
    https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
  • 24.
    AWS Lambda SnapStart
    ▪ Currently available for Lambda managed Java runtimes (Java 11, 17 and 21), Python and .NET
    ▪ Not available for all other Lambda runtimes:
      ▪ Docker Container Image
      ▪ Custom (Lambda) Runtime (a way to ship GraalVM Native Image)
    https://github.com/Vadym79/AWSLambdaJavaDockerImage/
  • 25.
    AWS SnapStart Deployment & Invocation
    [Figure: Firecracker microVM, create & restore snapshot]
    https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/
  • 26.
    AWS SnapStart Deployment & Invocation
    https://dev.to/vkazulkin/measuring-java-11-lambda-cold-starts-with-snapstart-part-1-first-impressions-30a4
    https://aws.amazon.com/de/blogs/compute/using-aws-lambda-snapstart-with-infrastructure-as-code-and-ci-cd-pipelines/
  • 27.
    AWS Lambda SnapStart with Priming
  • 28.
    Ideas behind priming
    ▪ Pre-load as many Java classes as possible before SnapStart takes the snapshot
      ▪ Java loads classes on demand (lazy loading)
    ▪ Pre-initialize as much as possible before SnapStart takes the snapshot
      ▪ HTTP clients (Apache, UrlConnection) and JSON marshallers (Jackson) require expensive one-time initialization per (Lambda) lifecycle. Both are used when creating the Amazon DynamoDbClient
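    The class pre-loading idea can be sketched with the standard library alone. This is only an illustration, not the approach used in the demo application: the helper name and the class names passed to it are placeholders, and in a SnapStart function such a call would have to run during initialization, before the snapshot is taken.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that eagerly loads classes which would otherwise be loaded on first use. */
final class ClassPreloader {
    /** Loads and initializes each named class; returns the names that could not be loaded. */
    static List<String> preload(List<String> classNames) {
        List<String> failed = new ArrayList<>();
        ClassLoader loader = ClassPreloader.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=true also runs static initializers, not only class loading.
                Class.forName(name, true, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                failed.add(name);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> failed = preload(List.of(
                "java.util.concurrent.ConcurrentHashMap", // placeholder class names
                "com.example.DoesNotExist"));
        System.out.println("could not preload: " + failed);
    }
}
```

    Collecting failures instead of throwing keeps a stale class list from breaking function initialization; whether that trade-off fits depends on the application.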
  • 29.
    AWS SnapStart Deployment & Invocation
    Lambda uses the CRaC APIs for runtime hooks for Priming
    [Figure: Firecracker microVM, create & restore snapshot]
    https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/
  • 30.
    Priming
    ▪ Prime dependencies during the initialization phase (when it is worth doing)
    ▪ “Fake” the calls to pre-initialize “some other expensive stuff” or execute some critical code paths (this technique is called priming)
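    The priming idea can be sketched without any dependencies as follows. Note the assumptions: SnapshotHook is a hypothetical stand-in for the CRaC interface org.crac.Resource (whose real methods, beforeCheckpoint and afterRestore, additionally take a Context parameter, and which is registered via Core.getGlobalContext().register(...)), and LazyProductDao is a made-up class that models a DAO whose first call is expensive.

```java
/** Hypothetical stand-in for org.crac.Resource so the sketch compiles without dependencies. */
interface SnapshotHook {
    void beforeCheckpoint() throws Exception;
    void afterRestore() throws Exception;
}

/** Made-up DAO whose first call pays for expensive one-time initialization. */
final class LazyProductDao {
    private boolean initialized;
    int expensiveInitCount;

    String getProduct(String id) {
        if (!initialized) {
            // Models HTTP client and JSON marshaller set-up on the first real request.
            expensiveInitCount++;
            initialized = true;
        }
        return "product-" + id;
    }
}

final class PrimedHandlerSketch implements SnapshotHook {
    final LazyProductDao dao = new LazyProductDao();

    @Override
    public void beforeCheckpoint() {
        // Priming: a "fake" request so the expensive path runs before the snapshot is taken.
        dao.getProduct("0");
    }

    @Override
    public void afterRestore() {
        // Nothing needs to be re-created in this sketch.
    }

    String handleRequest(String id) {
        return dao.getProduct(id);
    }

    public static void main(String[] args) {
        PrimedHandlerSketch handler = new PrimedHandlerSketch();
        handler.beforeCheckpoint();                      // would be called by the runtime
        System.out.println(handler.handleRequest("42")); // no expensive init on this path
        System.out.println("expensive inits: " + handler.dao.expensiveInitCount);
    }
}
```

    The design point is that the fake call must be safe to execute at deployment time, for example a read of an item that does not need to exist.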
  • 31.
    Product Repository DAO class
    ▪ Expensive initialization of the HTTP Client
    ▪ Expensive initialization of the Jackson Marshaller (ObjectMapper)
  • 32.
    SnapStart Enabled with DynamoDB Request Priming
    https://dev.to/aws-builders/measuring-java-11-lambda-cold-starts-with-snapstart-part-5-priming-end-to-end-latency-and-deployment-time-jem
  • 33.
    Lambda SnapStart Priming Guide
    ▪ The SnapStart Priming guide aims to explain techniques for priming Java applications.
    ▪ It assumes a base understanding of AWS Lambda, Lambda SnapStart, and CRaC.
    https://github.com/marksailes/snapstart-priming-guide
  • 34.
    [Chart: Cold starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache Http Client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Categories: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Series: p50, p75, p90, p99, p99.9, max. Y-axis: ms]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 35.
    [Chart: Warm starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache Http Client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Categories: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Series: p50, p75, p90, p99. Y-axis: ms]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 36.
    [Chart: Warm starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache Http Client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Categories: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Series: p99.9, max. Y-axis: ms]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 37.
    AWS Lambda Function with Java runtime
  • 38.
    Product Repository DAO class
  • 39.
    Demo Application
    https://github.com/Vadym79/AWSLambdaJavaSnapStart
    ▪ Lambda has 1024 MB memory setting
    ▪ Lambda uses x86 architecture
    ▪ Default (Apache) Http Client for communication with DynamoDB
    ▪ 18 MB artifact size, all dependencies in the POM file
    ▪ Java compilation option -XX:+TieredCompilation -XX:TieredStopAtLevel=1
    ▪ Info about the experiments:
      ▪ Approx. 1 hour duration
      ▪ Approx. first* 100 cold starts
      ▪ Approx. first 100,000 warm starts
    *after the Lambda function has been re-deployed
  • 40.
    Lambda Performance Tuning Approaches
    Experiment with:
    ▪ Lambda memory settings
    ▪ Java compilation options
    ▪ HTTP Client implementations (sync and async)
    ▪ Lambda architecture (x86 vs arm64)
    ▪ Lambda SnapStart (with priming techniques)
    To find the right trade-off between Lambda cost and performance for your particular use case
    https://aws.amazon.com/de/blogs/developer/preview-release-of-theaws-sdk-java-2-x-http-client-built-on-apache-httpclient-5-5-x/
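    To compare such experiments on cost as well as latency, a rough estimate can be derived from the memory setting and the measured duration. The sketch below assumes the usual Lambda billing model (duration in GB-seconds plus a per-request charge); both prices are parameters that have to be taken from the current AWS pricing page, and the values used in main are placeholders, not real prices.

```java
/** Hypothetical helper for a rough Lambda cost estimate; prices are supplied by the caller. */
final class LambdaCostSketch {
    /**
     * @param invocations             number of invocations in the period
     * @param memoryMb                configured Lambda memory in MB
     * @param avgDurationMs           average billed duration per invocation in ms
     * @param pricePerGbSecond        price per GB-second (architecture and region specific)
     * @param pricePerMillionRequests price per one million requests
     */
    static double costUsd(long invocations, int memoryMb, double avgDurationMs,
                          double pricePerGbSecond, double pricePerMillionRequests) {
        double gbSeconds = invocations * (memoryMb / 1024.0) * (avgDurationMs / 1000.0);
        double requestCost = invocations / 1_000_000.0 * pricePerMillionRequests;
        return gbSeconds * pricePerGbSecond + requestCost;
    }

    public static void main(String[] args) {
        // Placeholder prices for illustration only.
        double small = costUsd(1_000_000, 512, 12.0, 0.00002, 0.20);
        double large = costUsd(1_000_000, 1024, 7.0, 0.00002, 0.20);
        System.out.println("512 MB: " + small + ", 1024 MB: " + large);
    }
}
```

    More memory raises the price per millisecond but often shortens the duration, so the cheaper setting has to be measured rather than guessed.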
  • 41.
    Preview Release of the AWS SDK Java 2.x HTTP Client built on Apache HttpClient 5.5.x
    https://aws.amazon.com/de/blogs/developer/preview-release-of-theaws-sdk-java-2-x-http-client-built-on-apache-httpclient-5-5-x/
  • 42.
    Lambda Deployment Artifact Size
  • 43.
    [Chart: Cold starts (p90) of Lambda function with Java 21 runtime for different deployment artifact sizes. Categories: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Series: p90 small, p90 medium, p90 big. Y-axis: ms]
    ▪ Small: 137 KB (“Hello World”)
    ▪ Medium: 14 MB (our sample application)
    ▪ Big: 50 MB (our sample application + additional dependencies on other AWS services)
    https://dev.to/aws-builders/aws-snapstart-part-11-measuring-cold-starts-with-java-21-using-different-deployment-artifact-sizes-4g29
  • 44.
    Best Practices & Recommendations
    ▪ Less (dependencies, classes) is more
    ▪ Include only required dependencies (e.g. not the whole AWS SDK for Java 2.x, but only the dependencies for the clients used in the Lambda function)
    ▪ Exclude dependencies that you don't need at runtime, e.g. test frameworks like JUnit
    <dependency>
      <groupId>org.junit.jupiter</groupId>
      <artifactId>junit-jupiter-api</artifactId>
      <version>5.4.2</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>dynamodb</artifactId>
      <version>2.22.2</version>
    </dependency>
    <!-- A BOM with scope "import" only takes effect inside dependencyManagement -->
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>software.amazon.awssdk</groupId>
          <artifactId>bom</artifactId>
          <version>2.22.2</version>
          <type>pom</type>
          <scope>import</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>
  • 45.
    Demo Application
    https://github.com/Vadym79/AWSLambdaJavaSnapStart
    ▪ Lambda has 1024 MB memory setting
    ▪ Lambda uses x86 architecture
    ▪ Default (Apache) Http Client for communication with DynamoDB
    ▪ 14 MB artifact size, all dependencies in the POM file
    ▪ Java compilation option -XX:+TieredCompilation -XX:TieredStopAtLevel=1
    ▪ Info about the experiments:
      ▪ Approx. 1 hour duration
      ▪ Approx. first* 100 cold starts
      ▪ Approx. first 100,000 warm starts
    *after the Lambda function has been re-deployed
  • 46.
    AWS SnapStart Deployment & Invocation
    ▪ Lambda stores function snapshots in Amazon S3, dividing them into 512 KB chunks to optimize retrieval latency.
    ▪ Retrieval latency from Amazon S3 can take up to hundreds of milliseconds for each 512 KB chunk.
    ▪ Therefore, Lambda uses a two-layer cache to speed up snapshot retrieval.
    https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
    https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/
  • 47.
    Storing snapshots for low-latency retrieval at Lambda scale
    ▪ Lambda also maintains a layer one (L1) cache located on Lambda worker nodes, the (Amazon EC2) instances handling function invocations.
    ▪ This layer is available locally, thus it provides the fastest performance, typically 1 millisecond for a 512 KB chunk.
    ▪ Functions with more frequent invocations are more likely to have their snapshot chunks cached in this layer.
    ▪ Functions with fewer invocations are automatically evicted from this cache, because it is bound by the worker instance disk capacity.
    ▪ When a snapshot chunk is not available in the L1 cache, Lambda retrieves the chunk from the L2 cache layer and, if it is not available there, from S3.
    https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
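    The lookup order described above (L1, then L2, then S3) can be modelled with a toy read-through cache. This is purely illustrative and not how Lambda is implemented: the class name, the LRU eviction in L1, the promotion on read and the in-memory map standing in for S3 are all assumptions made for the sketch.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a tiered chunk cache; all policies here are assumptions for illustration. */
final class TieredChunkCacheSketch {
    // Access-ordered map used as a small LRU cache standing in for the worker-local L1 layer.
    private final LinkedHashMap<String, byte[]> l1;
    private final Map<String, byte[]> l2 = new HashMap<>();
    private final Map<String, byte[]> origin; // stands in for S3

    TieredChunkCacheSketch(final int l1Capacity, Map<String, byte[]> origin) {
        this.origin = origin;
        this.l1 = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > l1Capacity; // L1 is bounded, like a worker's local disk
            }
        };
    }

    /** Fetches a chunk and returns the tier that served it: "L1", "L2" or "S3". */
    String fetch(String chunkId) {
        if (l1.get(chunkId) != null) {
            return "L1";
        }
        String servedBy = "L2";
        byte[] chunk = l2.get(chunkId);
        if (chunk == null) {
            chunk = origin.get(chunkId);
            if (chunk == null) {
                throw new IllegalArgumentException("unknown chunk " + chunkId);
            }
            servedBy = "S3";
            l2.put(chunkId, chunk);
        }
        l1.put(chunkId, chunk); // promote so that later reads are served locally
        return servedBy;
    }
}
```

    With an L1 capacity of one chunk, fetching chunk a twice, then b, then a again is served by S3, L1, S3 and L2 respectively, which mirrors why rarely used chunks see slower retrieval.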
  • 48.
    Storing snapshots for low-latency retrieval at Lambda scale
    ▪ Resuming execution from snapshots with low latency is the final SnapStart stage. This involves loading the retrieved snapshot chunks into your function execution environment.
    ▪ Typically, only a subset of the retrieved snapshot is needed to serve an invocation. Storing snapshots as chunks lets Lambda optimize the resume process by proactively loading only the necessary subset of chunks.
    ▪ To achieve this, Lambda tracks and records the snapshot chunks that the function accesses during each function invocation.
    https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
  • 49.
    Storing snapshots for low-latency retrieval at Lambda scale
    ▪ After the first function invocation, Lambda refers to this recorded chunk access data for subsequent invokes.
    ▪ Lambda proactively retrieves and loads this “working set” of chunks before they are needed for execution. This significantly speeds up cold-start latency.
    https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
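    The “working set” mechanism can be pictured with a small simulation: chunk accesses are recorded during an invocation, and on the next restore the recorded chunks are loaded up front. All names are hypothetical and the model is deliberately simplified; it says nothing about how Lambda actually tracks or loads chunks.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Simplified simulation of recording chunk accesses and prefetching them on restore. */
final class WorkingSetSketch {
    private final Set<Integer> workingSet = new LinkedHashSet<>(); // recorded access data
    private final Set<Integer> resident = new LinkedHashSet<>();   // chunks currently loaded
    int onDemandLoads;

    /** Called whenever the function touches a snapshot chunk during an invocation. */
    void access(int chunkId) {
        if (resident.add(chunkId)) {
            onDemandLoads++; // chunk was not resident: loaded lazily, on the critical path
        }
        workingSet.add(chunkId); // remember the access for the next restore
    }

    /** Simulates a fresh restore: nothing is resident, then the recorded chunks are prefetched. */
    void restoreWithPrefetch() {
        resident.clear();
        resident.addAll(workingSet);
        onDemandLoads = 0;
    }

    public static void main(String[] args) {
        WorkingSetSketch env = new WorkingSetSketch();
        env.access(1);
        env.access(5);
        System.out.println("first restore, on-demand loads: " + env.onDemandLoads);
        env.restoreWithPrefetch();
        env.access(1);
        env.access(5);
        System.out.println("next restore, on-demand loads: " + env.onDemandLoads);
    }
}
```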
  • 50.
    SnapStart function performance
    ▪ The speed of restoring a snapshot depends on its contents, size, and the caching tier used. As a result, SnapStart performance can vary across individual functions.
    ▪ Frequently invoked functions are more likely to have their snapshots cached in the L1 layer, which provides the fastest retrieval latency.
    ▪ Infrequently accessed portions of snapshots for functions with sporadic invokes are less likely to be present in the L1 layer, resulting in slower retrieval latency from the L2 and S3 cache layers.
    ▪ Chunk access data for functions with more invocations is also more likely to be “complete”, which speeds up snapshot restore latency.
    https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
  • 51.
    AWS SnapStart tiered cache
    ▪ Due to the effect of the snapshot tiered cache, cold start times decrease with the number of invocations
    ▪ After a certain number of invocations is reached, the cold start times become stable
    https://dev.to/aws-builders/aws-snapstart-part-17-impact-of-the-snapshot-tiered-cache-on-the-cold-starts-with-java-21-52ef
  • 52.
    AWS Lambda under the Hood
    https://www.infoq.com/articles/aws-lambda-under-the-hood/
    https://www.infoq.com/presentations/aws-lambda-arch/
  • 53.
    [Chart: Comparison between all approx. 100 vs the last 70 cold starts of the Lambda function. Categories: with SnapStart w/o Priming, with SnapStart w/o Priming (last 70), with SnapStart with Priming, with SnapStart with Priming (last 70). Series: p50, p75, p90, p99, p99.9, max. Y-axis: ms]
    Due to the effect of the snapshot tiered cache, cold start times decrease with the number of invocations
    https://dev.to/aws-builders/aws-snapstart-part-17-impact-of-the-snapshot-tiered-cache-on-the-cold-starts-with-java-21-52ef
  • 54.
    AWS Lambda Profiler Extension for Java
    ▪ The Lambda profiler extension allows you to profile your Java functions invoke by invoke, with high fidelity, and no code changes.
    ▪ It uses the async-profiler project to produce profiling data and automatically uploads the data as HTML flame graphs to S3.
    https://github.com/aws/aws-lambda-java-libs/tree/main/experimental/aws-lambda-java-profiler
  • 55.
    AWS Lambda Implementation
  • 56.
    API Gateway Proxy Request Event JSON
    https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format
  • 57.
    AWS Lambda Profiler Extension for Java
    https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format
    https://github.com/aws/aws-lambda-java-libs/tree/main/experimental/aws-lambda-java-profiler
  • 58.
    Full Priming including APIGatewayProxyRequestEvent Deserialization
    This priming technique reduces cold start times by up to 25% compared with DynamoDB request priming alone
    https://dev.to/aws-heroes/aws-lambda-profiler-extension-for-java-part-2-improving-lambda-performance-with-lambda-snapstart-4p06
  • 59.
    AWS SnapStart Pricing
    https://aws.amazon.com/lambda/pricing/?nc1=h_ls
  • 60.
    AWS SnapStart Challenges around uniqueness
    ▪ Avoid saving state that depends on uniqueness during initialization
      ▪ Avoid UUID uniqueSandboxId = UUID.randomUUID() or long envCreationTime = System.currentTimeMillis() in the Lambda constructor
    ▪ Use cryptographically secure pseudorandom number generators
      ▪ Software that always gets random numbers from /dev/random or /dev/urandom also maintains randomness with SnapStart.
      ▪ Use java.security.SecureRandom instead of new Random()
    ▪ Avoid logic relying on time-based caches
    https://docs.aws.amazon.com/lambda/latest/dg/snapstart-uniqueness.html
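    A minimal sketch of the uniqueness advice: values that must be unique are produced at invocation time rather than during initialization, where they would be frozen into the snapshot. The class and method names are hypothetical, and the static field only marks the anti-pattern named on the slide.

```java
import java.security.SecureRandom;
import java.util.UUID;

/** Hypothetical illustration of generating unique values per invocation. */
final class UniquenessSketch {
    // Anti-pattern under SnapStart: computed once during initialization, so the same value
    // would be shared by every execution environment restored from the snapshot.
    static final UUID SNAPSHOT_TIME_ID = UUID.randomUUID();

    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a fresh 128-bit identifier (32 hex characters) at call time. */
    static String newRequestId() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        StringBuilder hex = new StringBuilder(32);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println("per-invocation id: " + newRequestId());
        System.out.println("per-invocation id: " + newRequestId());
    }
}
```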
  • 61.
    AWS SnapStart Challenges & Limitations
    ▪ SnapStart supports only the Java 11, 17 and 21 (Corretto), Python and .NET managed runtimes
    ▪ Deployment with SnapStart enabled takes more than 2 to 2.5 minutes longer
    ▪ The snapshot is deleted from the cache if the Lambda function is not invoked for 14 days
    ▪ SnapStart currently does not support:
      ▪ Provisioned concurrency
      ▪ Amazon Elastic File System (Amazon EFS)
      ▪ Ephemeral storage greater than 512 MB
    https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
  • 63.
    GraalVM Architecture
  • 64.
    GraalVM Ahead-of-Time Compilation
    Source: Oleg Šelajev, Thomas Wuerthinger, Oracle: “Deep dive into using GraalVM for Java and JavaScript” https://www.youtube.com/watch?v=a-XEZobXspo
  • 65.
    AOT vs JIT
    Source: Oleg Šelajev & Thomas Wuerthinger, “Everything you need to know about GraalVM” https://www.youtube.com/watch?v=ANN9rxYo5Hg
  • 66.
    GraalVM Native Image
    Promise: a Java function compiled into a native executable using GraalVM Native Image significantly reduces
    ▪ “cold start” times
    ▪ memory footprint
  • 67.
    Current Challenges with Native Executable using GraalVM
    ▪ AWS doesn't provide GraalVM (Native Image) as a Java runtime out of the box
    ▪ AWS provides the Custom Runtime option
  • 68.
    Custom Lambda Runtimes
    https://github.com/Vadym79/AWSLambdaGraalVMNativeImage
  • 69.
    [Chart: Categories: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, GraalVM 23 Native Image. Series: p50, p75, p90, p99, p99.9, max. Y-axis: ms]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 70.
    [Chart: Categories: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, GraalVM 23 Native Image. Series: p50, p75, p90, p99. Y-axis: ms]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 71.
    [Chart: Categories: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, GraalVM 23 Native Image. Series: p99.9, max. Y-axis: ms]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 72.
    Frameworks and libraries Ready for GraalVM Native Image
    https://www.graalvm.org/native-image/libraries-and-frameworks/
  • 73.
    GraalVM Native Image
    You can run into runtime errors (e.g. ClassNotFoundException) when configuration is missing
    https://github.com/Vadym79/AWSLambdaGraalVMNativeImage/blob/master/pure-lambda-graalvm-jdk-21-native-image/src/main/reflect.json
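    The kind of code that needs such configuration is anything that reaches classes only by name at run time. The sketch below (hypothetical class name) runs fine on a regular JVM; in a native image, a class looked up this way generally has to be registered for reflection, for example in a reflect.json file, otherwise the lookup can fail at run time.

```java
import java.lang.reflect.Constructor;

/** Hypothetical example of reflective instantiation by class name. */
final class ReflectiveLoadingSketch {
    /** Instantiates a class that is only known by name at run time. */
    static Object newInstance(String className) throws ReflectiveOperationException {
        Class<?> type = Class.forName(className);            // dynamic lookup by name
        Constructor<?> ctor = type.getDeclaredConstructor(); // needs the no-arg constructor
        return ctor.newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // The name comes from outside the program, so static analysis cannot see it.
        String className = args.length > 0 ? args[0] : "java.util.ArrayList";
        System.out.println(newInstance(className).getClass().getName());
    }
}
```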
  • 74.
    Logging configuration in particular is complex in GraalVM Native Image
  • 75.
    Log4j natively supports GraalVM Native Image since 2.25.0
    https://logging.staged.apache.org/log4j/2.x/graalvm.html
  • 76.
    Assisted Configuration with GraalVM Tracing Agent
    Run the GraalVM tracing agent during the execution of your tests
    https://www.graalvm.org/latest/reference-manual/native-image/metadata/AutomaticMetadataCollection/
    https://www.graalvm.org/latest/reference-manual/native-image/guides/configure-with-tracing-agent/
  • 77.
    GraalVM Conclusion
    ▪ GraalVM is really powerful and has a lot of potential
    ▪ GraalVM Native Image improves cold starts and memory footprint significantly
    ▪ GraalVM Native Image is currently not without challenges
      ▪ Complex GraalVM Native Image configuration files
      ▪ AWS Lambda Custom Runtime requires a Linux executable
      ▪ Building a Custom Runtime requires some additional effort
        ▪ e.g. you need a scalable CI/CD pipeline to build the memory-intensive native image
        ▪ Build time is a factor
      ▪ You need to test carefully to avoid runtime errors
  • 78.
    Wrap up and personal suggestions
    ▪ With AWS SnapStart and GraalVM Native Image you can reduce cold start times of AWS Lambda with the Java 21 runtime to acceptable values
    ▪ If you're willing to accept slightly higher cold and warm start times for certain Lambda functions and solid priming is applicable -> use fully managed AWS SnapStart with priming
    ▪ If very high performance for certain Lambda functions is really crucial for your business -> go for GraalVM Native Image
  • 79.
    Powertools for AWS Lambda (Java) v2
    https://docs.powertools.aws.dev/lambda/java/2.4.0/
    https://github.com/Vadym79/AWSPowertoolsForLambdaJavaV2
  • 80.
    The Future of GraalVM
    https://blogs.oracle.com/java/post/detaching-graalvm-from-the-java-ecosystem-train
  • 81.
    Project Leyden
    The primary goal of this Project is to improve the startup time, time to peak performance, and footprint of Java programs.
    https://www.youtube.com/watch?v=teXijm79vno
    https://openjdk.org/projects/leyden/
  • 82.
    Word of Caution
    Re-measure for your use case! Even my own examples might already produce different results when measured again, due to:
    ▪ Lambda Amazon Corretto Java 21 managed runtime minor version changes
    ▪ Lambda SnapStart snapshot create and restore improvements
    ▪ Firecracker microVM improvements
    ▪ GraalVM (major and minor version) and Native Image improvements
    ▪ There are still servers behind Lambda
    ▪ Java Memory Model impact (L or RAM cache hits and misses)
    ▪ Upgrading dependencies (e.g. AWS SDK for Java) tends to make them bigger, increasing the cold start time
  • 83.
    “AWS Lambda SnapStart” series
    https://dev.to/vkazulkin/series/24979
    The article series covers the why and what behind Lambda SnapStart and priming techniques, including measurements for the cold and warm starts with different settings for:
    ▪ Java 11
    ▪ Java 17
    ▪ Java 21
  • 84.
    “Spring Boot 3.4 / Quarkus 3 / Micronaut 4 application on AWS Lambda” series
    The article series covers different ways to write, run and optimize Spring Boot 3.4 / Quarkus 3 / Micronaut 4 applications on AWS Lambda using:
    ▪ Managed Java 21 Lambda runtime + SnapStart + priming
    ▪ GraalVM Native Image
    Cold and warm start time measurements are also provided
    https://dev.to/vkazulkin/series/30408
    https://dev.to/vkazulkin/series/31519
    https://dev.to/vkazulkin/series/26067
  • 85.
    “Data API for Amazon Aurora Serverless v2 with AWS SDK for Java” series
    The article series covers pure Java 21 cold and warm start time measurements and optimization techniques for the Amazon Aurora Serverless v2 database with JDBC and Data API
    https://dev.to/vkazulkin/series/26067
  • 86.
    “Serverless applications with Java and Aurora DSQL” series
    The article series covers pure Java 21 cold and warm start time measurements and optimization techniques (SnapStart + priming vs GraalVM Native Image) for the Amazon Aurora DSQL database
    https://dev.to/vkazulkin/series/32326
  • 87.
    Thank you