Project Leyden

You might have heard of Project Leyden, an initiative within the OpenJDK project with ambitious goals.

As Quarkus users, you’ll be wondering how this project will benefit you and how it’s different from GraalVM native images. While we think it’s fair to say that Leyden was inspired or at least motivated by some ideas first implemented in GraalVM’s native images, Leyden is remarkably different. It’s essential to understand how it works: as we will see, Leyden is not a replacement for GraalVM native images but rather a substantial evolution of the JVM, and we expect it to bring some benefits to native images as well.

To clarify all this, the post is unfortunately unusually long: we wish it could have been a short guide, "This is how you enable Leyden," but this isn’t quite the time yet, as we need to understand the different models first. Sometimes, the terminology is also different; for example, "Ahead of Time (AOT)" has a very specific meaning in the context of GraalVM native images and has traditionally been associated with "compilation", but in the context of Leyden it is used more broadly to indicate a variety of aspects of JVM operation; hopefully, after reading this, it will be less confusing.

Another significant misconception about Leyden is that it’s a project to "improve startup times"; this statement is not wrong, as improving startup times is one of its goals. Yet the other stated goals of the project offer even more significant potential for our favourite platform, Quarkus, and its users.

So, let’s dive in.

What is Leyden?

Project Leyden is an initiative from the OpenJDK team. It is an ongoing experiment that is currently being developed by the joint effort of teams from different companies contributing to the project.

The primary goal of this Project is to improve the startup time, time to peak performance, and footprint of Java programs.

 — Project Leyden, first thing on its project page

Leyden is a general umbrella project to address slow startup and large footprint. It is useful to keep JDK bootstrap times and footprint low. This helps reduce energy consumption, hardware resource use, and, ultimately, monetary costs. However, it’s equally essential to reduce the time to application peak performance, which is usually spent loading application classes and executing application code, including JIT compiling methods on hot code paths. Reducing application footprint can have a tremendous impact, and this can be achieved by trimming not just application data but also application classes and code. Leyden is addressing ways that the JVM can help developers achieve those goals; in many ways, this is complementary to the techniques offered by Quarkus at the framework level, so we expect some powerful results from them combined.

Note that the project is evolving rapidly: some of the things explained in this article are evolving while this is written. If you plan on getting involved at a more technical level, follow the development in Jira and the Leyden mailing list.

Why it’s interesting to Quarkus

From a Quarkus perspective, we’ve done a fair job on all such metrics but we’re constantly on the lookout to improve. That’s why Project Leyden got our attention. We’re already working with our colleagues from the OpenJDK team at Red Hat, who are directly involved in implementing Leyden with the wider OpenJDK group: this blog post today is a collaboration among engineers from different teams.

Although Quarkus is already doing a lot of work during the Ahead of Time phase to speed up warmup and response time, the enhancements that Leyden brings to the table are more related to how the JVM behaves. Because the two approaches complement each other, the advantages we can expect from combining Quarkus and Leyden go beyond anything either of them offers in isolation.

Since the potential for such technological collaboration is strong, the Quarkus and OpenJDK teams are working together on various prototypes and anyone in the community is welcome to join as well.

Refresher on JVM’s bootstrap process

To better understand the scope of the potential improvements, we need to take a step back and discuss how the JVM works today, especially how our application is started and iteratively evolves from interpreting our bytecode to its highest performance mode: running native code which is highly optimized, adapted to the particular hardware, the configuration of the day, and the specific workloads it’s been asked to perform. No other runtime is able to match the JVM on this.

As we all know, a Java runtime does not directly run Java source code. The content of our JAR file is not executable machine code, but Java bytecode generated from Java source code, typically using the javac compiler but in some cases Quarkus will emit directly generated bytecode. A key feature of bytecode is portability, encoding the structure of Java classes and operation of their methods in a machine and operating-system independent format. A Java runtime obeys the type information in the bytecode when laying out Java objects. Execution of a method normally involves interpreting the operations in the method bytecode, although a runtime may also choose to compile method bytecode to equivalent, native machine code and execute the latter directly.

The unit of delivery for bytecode is a class file, which models a single class. The Java runtime itself provides a host of utility and runtime management classes, as class files embedded in either system jars or jmod files. Applications supplement this with their own class files, usually by appending jars to the classpath or module path.

Bytecode is delivered class-at-a-time to allow the runtime to load classes lazily: i.e. the runtime will only look up, verify and consume a class file when that class’s definition is required to proceed with execution.
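A small sketch can make the lazy behaviour visible. It observes class initialization (the static initializer), which is the easiest effect to see from Java code; loading itself can be watched with the JVM's `-verbose:class` option. The class names are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class LazyLoadingDemo {
    // Ordered log of what happened; shared so the static initializer can write to it.
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical class standing in for any application class.
    static final class ReportGenerator {
        static {
            // Runs when the JVM initializes the class, i.e. on its first active use.
            EVENTS.add("ReportGenerator initialized");
        }

        static String generate() {
            return "report";
        }
    }

    static List<String> run() {
        EVENTS.add("main started");
        // First use: only now does the JVM need ReportGenerator's definition.
        String result = ReportGenerator.generate();
        EVENTS.add("generated " + result);
        return List.copyOf(EVENTS);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On the first call, "main started" is recorded before the initializer message, even though both classes were compiled together and live in the same location.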

Lazy loading is what allows Java to be a dynamic language — i.e. one where the code that is included in the program can be decided at runtime. That might include loading classes from jars identified at runtime, possibly loaded via the network. Alternatively, it might include generating class bytecode at runtime, as is done with proxy classes or service provider auxiliary classes.
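As an illustration of bytecode that only exists at runtime, here is a minimal sketch using the standard `java.lang.reflect.Proxy` API. The `Greeter` interface is a hypothetical example; no class implementing it exists in the source code.

```java
import java.lang.reflect.Proxy;

final class RuntimeProxyDemo {
    // Hypothetical application interface; no implementing class is written by hand.
    interface Greeter {
        String greet(String name);
    }

    static Greeter createGreeter() {
        // The JVM generates and defines the proxy class when this call is made.
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] {Greeter.class},
                (proxy, method, args) -> {
                    if (method.getName().equals("greet")) {
                        return "Hello, " + args[0];
                    }
                    throw new UnsupportedOperationException(method.getName());
                });
    }

    public static void main(String[] args) {
        Greeter greeter = createGreeter();
        System.out.println(greeter.greet("Leyden"));
        // The class name is chosen by the JVM at runtime.
        System.out.println(greeter.getClass().getName());
        System.out.println(Proxy.isProxyClass(greeter.getClass()));
    }
}
```

Because the proxy class is only created when `createGreeter()` runs, nothing that inspects the JAR files alone can know about it in advance.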

Just in Time (JIT) and Ahead of Time (AOT)

Another name to describe Java’s lazy loading is 'Just in Time' (JIT). JIT is a well known term used to describe the operation of Java’s runtime compilers. What is less well known is that it has a much wider use. JIT is not limited to compilation: many other operations performed by the JVM are done lazily at runtime or 'Just In Time'.

An alternative to doing things 'Just in Time' (JIT) is to do them 'Ahead Of Time' (AOT). For example, GraalVM’s Native Image runtime loads and analyses the bytecode of every single class needed by an application, including JDK runtime classes, 'Ahead Of Time' i.e. at image build time. It uses the type and method information encoded in that bytecode to 'Ahead Of Time' compile a complete program that includes code for every method that might possibly be executed by the application.

The approach of GraalVM’s native images lies at one extreme: everything is done AOT, while the traditional Java runtime model lies at the other extreme, as much as possible is done JIT. However, it is actually possible to mix and match AOT and JIT models of execution in one runtime: re-balancing that AOT vs JIT mix is the goal of the first EA release of project Leyden.

Interestingly, this time-shifting concept is also applied by Quarkus; we call it "augmentation", and it essentially consists of booting popular frameworks during the build of the application, so as not to incur such performance penalties at runtime.

A native image build might also take advantage of Profile Guided Optimisations (PGO), which allows it to leverage some data about what’s presumably happening at runtime back into the compilation process, guiding its optimisations. It’s essentially peeking into the future - another form of time-shifting. However, it’s only peeking into a simulation of runtime metrics, and ultimately, the compiler still needs to make all optimisation tradeoffs Ahead Of Time; this has pros and cons. The primary disadvantage is that any suboptimal decision is cast in stone; luckily there is a fallback mechanism to recover from outright bad decisions, but this mechanism cannot produce new optimal code. The advantage is more decisive for short-lived applications as the tradeoff of carrying all support for JIT optimisations in the runtime is less justifiable when there is barely an opportunity to take advantage of it.

Default Java compilation and run
Figure 1. On a default Java compilation and run, we have two distinct phases: First we compile the source code into bytecode. And then we use that bytecode to run the application.

Class Data Sharing (CDS) as a step to AOT Caching

Shifting work so it is done AOT is not a wholly new idea as far as the OpenJDK runtime is concerned. OpenJDK has supported a hybrid AOT/JIT class loading model for years with CDS. The observation that led to Class Data Sharing (CDS) being proposed was that most applications load the same classes every time they run, both JDK classes during JDK bootstrap and application classes during application startup and warmup.

Loading requires locating a class bytecode file, possibly calling out to a Java ClassLoader, parsing the bytecode then building a JVM-internal model of the class. This internal model unpacks the information packed into the bytecode into a format that enables fast interpreted or compiled execution. If this loading and unpacking work could be done once and the resulting class model efficiently reused on subsequent runs, then that would save time during startup and warm up.

Initially CDS optimized loading for a large set of core JDK classes. It worked by booting the JVM and dumping the class model for all classes loaded during startup into an archive file laid out in memory format. The resulting JDK module, class, field, and method graph can then be quickly remapped into memory next time the JVM runs. Loading a class that is present in the archive involves a simple lookup in the AOT class model. Loading a class not present in the archive requires the normal JIT steps of bytecode lookup, parsing and unpacking i.e. CDS implements a hybrid JIT/AOT execution model.
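The hybrid model can be pictured with a toy lookup: consult the pre-built archive first, and fall back to the slow path only for classes that are not in it. This is a conceptual sketch only; the class, its fields and the string "models" are invented for illustration and bear no relation to the JVM's actual data structures.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model of "archive first, regular loading otherwise". Not JVM code.
final class HybridLookupModel {
    private final Map<String, String> archive;          // filled ahead of time
    private final Function<String, String> slowLoader;  // stands in for lookup, parse, unpack
    private final Map<String, String> loaded = new HashMap<>();
    int archiveHits;
    int slowLoads;

    HybridLookupModel(Map<String, String> archive, Function<String, String> slowLoader) {
        this.archive = archive;
        this.slowLoader = slowLoader;
    }

    String load(String className) {
        String model = loaded.get(className);
        if (model != null) {
            return model;                        // already loaded in this run
        }
        model = archive.get(className);
        if (model != null) {
            archiveHits++;                       // AOT path: a simple lookup
        } else {
            model = slowLoader.apply(className); // JIT path: do the work now
            slowLoads++;
        }
        loaded.put(className, model);
        return model;
    }

    public static void main(String[] args) {
        HybridLookupModel jvm = new HybridLookupModel(
                Map.of("java.lang.String", "model:java.lang.String"),
                name -> "model:" + name);
        jvm.load("java.lang.String");   // present in the archive
        jvm.load("com.example.Order");  // not archived: slow path
        jvm.load("com.example.Order");  // already loaded: no extra work
        System.out.println("archive hits: " + jvm.archiveHits + ", slow loads: " + jvm.slowLoads);
    }
}
```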

Static CDS benefits
Figure 2. Static CDS archives are built during the JVM installation and include classes from the core libraries. This archive can be used to move part of the class loading to AOT when running the application.

A default CDS archive for JDK runtime classes has been shipped with every JVM release since JDK 12, halving JDK startup time. Improvements were made to CDS to allow application classes to be included in a CDS archive after executing a short application training run. The resulting mixed AOT/JIT operation can provide significant improvements to application startup and warmup times, depending on how well the training run exercises application code. So, selective JIT/AOT operation is not some new thing.

Dynamic CDS benefits
Figure 3. When doing training runs, we create an archive that contains information on how the application runs. This archive includes not only classes from the core libraries, but also classes from our application.

Quarkus makes it really easy to generate CDS archives specific to your application code; this feature has been around for some years already: see the AppCDS guide in Quarkus. As Leyden is coming, we aim to evolve this further and fully automate it for Leyden as well, so that you get even more benefits at no additional hassle.

The goal of Project Leyden is to extend the AOT vs JIT trade-off from class loading (as done by CDS) to other JIT operations in the JVM; there are a number of operations which could be "moved in time" to AOT, such as the creation of heap objects to represent constants, the gathering of execution profile information, and many more. Most importantly, it moves to AOT both the lazy linking that normally happens during interpreted execution and the lazy compilation and recompilation that happen when methods have been executed enough times to justify the cost of compilation.

AOT vs JIT Linkage

Linking of classes is another operation that the JVM does lazily. When class bytecode is processed the class is directly linked to its owning module and its owned methods and fields. JIT linkage connects elements of each independent, linked class sub-graph into a fully connected graph where elements from different (class or module) files cross-reference each other.

Loading and linking need to proceed recursively. As one example, every class (except Object) needs to be linked to its super class. Super linkage cannot complete without ensuring the super class is loaded. Indeed, if the super’s bytecode cannot be found or is not valid (say it identifies an interface not a class) then a linkage error may occur. Likewise, a new operation or a field get/put operation occurring in some method’s bytecode can only be linked after loading the class (and field) named in that bytecode.

Linking is sometimes, but not always, done lazily. Indeed, it is necessary to do some linkage lazily in order to allow loading also to be lazy, otherwise the whole class graph would end up being linked and loaded as soon as the main routine was entered. Super linkage is always done eagerly at the point where the subclass has just been loaded. That is because it is not possible to use a subclass to create instances or execute methods without knowing how the superclass is defined. By contrast, field and method linkage is done lazily. In these cases linkage happens as a side-effect of execution. When a method executes a field get/put or method invoke bytecode for the first time the target field or method is looked up via its owner class, loading it if necessary. The field type or method signature is checked for consistency and details of where to find the field or how to call the method are cached, allowing the next execution of the bytecode to bypass the linkage step.
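The eager handling of the superclass has a consequence that can be observed from plain Java: a superclass is always set up before its subclass. The sketch below records static initializer order as a visible proxy for that ordering; the class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SuperFirstDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Base {
        static {
            EVENTS.add("Base initialized");
        }
    }

    static class Derived extends Base {
        static {
            EVENTS.add("Derived initialized");
        }

        static int answer() {
            return 42;
        }
    }

    static List<String> run() {
        EVENTS.add("before first use");
        // Using Derived forces the JVM to deal with Base first.
        Derived.answer();
        return List.copyOf(EVENTS);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Nothing in `run()` mentions `Base`, yet its initializer is recorded before the one of `Derived`; the method call itself, by contrast, is only resolved when the statement executes.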

As with lazy loading, this lazy approach results in almost the exact same linkage being established on every run. The time spent stopping and restarting execution to lazily connect the class graph comprises a noticeable percentage of JDK startup, application startup and application warm up (time to peak running). We could speed up startup and, more crucially, warm up time if we could pre-compute this linkage and avoid the need to establish it at runtime.

Synergy with Quarkus

Loading and linking of classes is an important step in the warm up of the application because it involves searching through the whole classpath for all classes and objects referenced by the bytecode the JVM is going to run. By default, this is done as a lazy operation because loading and linking all existing classes in the classpath would not only require a bigger memory footprint, but also a longer warm up time. This is why the JVM only loads and links the bytecode that is going to be used.

This is a process that Quarkus already speeds up by, among other strategies, aggressively reducing the set of classes included in the classpath, so the search for matches is faster. The search for classes is also accelerated by indexes which Quarkus can generate when it fully analyzes the application at build time. But it is still a heavy operation that is difficult to execute ahead of time, before we know what is going to be run and how. Quarkus might be able to provide some additional hints to the linker in the future.

The first startup improvement Leyden offers is to upgrade the AOT model originally developed as part of the CDS project to encompass not just pre-loading of classes but also pre-linking, as described in JEP Ahead-of-Time Class Linking.

An AOT Cache can be generated during a training run that bootstraps the JVM and, optionally, executes application-specific code. As with a CDS archive, the AOT Cache stores a class graph for all classes loaded during the training run in a format that allows it to be quickly remapped on a subsequent run. The stored graph also includes any linkage information established by code executed during the training run. Pre-cached links avoid the need to stop and start execution to perform linkage on subsequent runs.

Leyden CDS benefits
Figure 4. Leyden’s AOT Cache contains a lot more pre-generated content that allows us to move part of the load, link, and compiling to AOT, allowing for faster startup and warm up of the application.

Remember that the training run enables some of the loading and linking to be done AOT but that anything not trained for will still be performed via the regular JIT process: the AOT approach is not required to be applied comprehensively, so the JVM can fall back to the regular loading system for the use cases which cannot benefit from AOT processing. This ability to fall back to "regular JIT processing" is a luxury that GraalVM native images can’t use.

JIT vs AOT Compilation

Another well-known lazy operation the JVM performs is JIT (runtime) compilation. Method bytecode is normally interpreted, but the JVM will lazily translate bytecode to equivalent machine code. Since generating optimal machine code is an expensive operation, it performs this compilation task selectively, only bothering to compile methods that have been invoked quite a few times.

JIT compilation is also 'adaptive' i.e. the JVM will lazily recompile some methods, using different 'tiers' or levels of compilation.

  1. A tier 1 compile generates code that is only lightly optimised, based on very limited execution profile data.

  2. A tier 2 compile also generates lightly optimized code but instruments it to profile control flow.

  3. Tier 3 compilation adds further instrumentation that records many more details about what gets executed, including with what type of values.

  4. A tier 4 compile uses all gathered profile information and performs a great deal of optimization.

Tier 1 - 3 compilations omit many possible optimizations in order to deliver compiled code quickly. A tier 4 compilation can take much longer to complete so it is only attempted for a small subset of very frequently executed methods.
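To get a feeling for this background activity, the standard `java.lang.management` API can report how much time the JIT compilers have spent so far. The sketch below is a rough probe rather than a benchmark; the workload and the iteration count are arbitrary choices made for illustration.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

final class JitWarmupDemo {
    // Small method that becomes "hot" when called many times.
    static long mix(long acc, int i) {
        return acc * 31 + i;
    }

    static long hotLoop(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc = mix(acc, i);
        }
        return acc;
    }

    // Accumulated JIT compilation time in milliseconds, or -1 if the JVM does not report it.
    static long jitTimeMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    public static void main(String[] args) {
        long before = jitTimeMillis();
        long result = hotLoop(5_000_000);
        long after = jitTimeMillis();
        System.out.println("result = " + result);
        System.out.println("JIT time before/after (ms): " + before + " / " + after);
    }
}
```

Running the same class with the HotSpot option `-XX:+PrintCompilation` additionally lists each compiled method together with the tier it was compiled at.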

Sometimes, the code is compiled with substantial optimisations based on 'speculative' assumptions extrapolated from the profiling data. In such cases, the compiler optimistically assumes that a condition will stay true in the future, yet includes an efficient check to verify the assumption during execution, so that the semantics of the program are not affected if this educated guess eventually turns out to be false. When that is detected, the code is de-optimised, returning to a previous tier of compilation, and the profiling data is adjusted so that the code will eventually be recompiled with better information. Essentially, some parts of the code might get recompiled multiple times and occasionally revert to a lower tier: it’s a highly dynamic process.

Peak optimization is reached when most of the running code is compiled at the highest tier, and background compilation activities become very rare or, ideally, none at all.

Compiling code for peak performance also requires quite some resources, so performing this work ahead of time can also save precious CPU cycles during the application bootstrap, and can manifest in substantial memory savings as well: Java developers aren’t used to measuring the memory costs of the JIT compiler, but the fact that it’s hidden doesn’t imply it’s non-existent; and while this might be a detail for large enterprise servers, it’s quite important to be aware of such resource costs when developing microservices or simply aiming for smaller, more power efficient targets.

But there are some limitations on what we can optimise before runtime just by examining the bytecode. For example, extensive use of reflection prevents the compiler from predicting which symbols will be loaded, linked, and most used at runtime.

The Leyden project has already successfully prototyped shifting the work of method compilation from JIT to AOT. Execution and compilation of methods is tracked during the training run. At the end of the run any associated profiling information and compiled code for the method are saved to the AOT Cache, allowing them to be quickly mapped back into memory and reused when the application is next run.

As with AOT loading and linking, the training run enables some of the work of profiling and compiling to be done AOT but allows anything not trained still to be compiled via the regular JIT compilation process. Note that method code does not need to have been compiled at the highest tier in order to be saved. Also, when code compiled at a lower tier is restored it can still be recompiled at a higher level.

Compiled code can also be deoptimized and re-optimized to adapt to different runtime conditions, just as with code compiled in the current runtime. So, the use of AOT compilation is fully integrated into OpenJDK’s adaptive, dynamic compilation and recompilation model: even if some assumptions made during AOT compilation turn out to be suboptimal, the just-in-time compiler can intervene at runtime and improve the code with the new information.

How to play with it

The first step would be to install one of the early Leyden builds that you can find at jdk.java.net/leyden/.

Make sure that you have installed it correctly by running the following command:

$ java --version
openjdk 24-leydenpremain 2025-03-18
OpenJDK Runtime Environment (build 24-leydenpremain+2-8)
OpenJDK 64-Bit Server VM (build 24-leydenpremain+2-8, mixed mode, sharing)

Go to the application you want to test Leyden with and start a first training run:

$ java -XX:CacheDataStore=quarkusapp.aot -jar $YOUR_JAR_FILE

This will generate the archive files with all the profiling information needed to speed up the production run.

Now that we have them, we can run our application using the Leyden enhancements:

$ java -XX:CacheDataStore=quarkusapp.aot -XX:+AOTClassLinking -jar $YOUR_JAR_FILE

Potentially needed workarounds

Since it’s early days for the Leyden project, there are some known issues. The following instructions shouldn’t be necessary for the final versions but you might need them today.

Force the use of G1GC

To benefit from the natively compiled code in AOT archives, the garbage collector used at runtime needs to match the one used when you recorded the AOT archives.

Remember that the JVM’s default choice of garbage collector is based on ergonomics; normally this is nice but it can cause some confusion in this case; for example if you build on a large server it will pick G1GC by default, but then when you run the application on a server with constrained memory it would, by default, pick SerialGC.

To avoid this mismatch it’s best to pick a garbage collector explicitly; and since several AOT related optimisations today only apply to G1, let’s enforce the use of G1GC.

Force using G1GC:

-XX:+UseG1GC

N.B. you need to use this consistently on both the process generating the AOT archives and the runtime.
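If you are unsure which collector a given environment actually selected, a small check based on the standard `java.lang.management` API can be run in both the training and the production environment. This is a sketch under one assumption: that HotSpot names its G1 management beans with a "G1 " prefix (for example "G1 Young Generation"), which is how current builds report them.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

final class GcCheck {
    // Names of the garbage collector beans registered in this JVM.
    static List<String> activeCollectors() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    // Assumption: G1 beans are reported with names starting with "G1 ".
    static boolean looksLikeG1(List<String> collectorNames) {
        return collectorNames.stream().anyMatch(name -> name.startsWith("G1 "));
    }

    public static void main(String[] args) {
        List<String> names = activeCollectors();
        System.out.println("collectors: " + names);
        System.out.println("G1 in use: " + looksLikeG1(names));
    }
}
```

Comparing the printed collector names from both environments is a quick way to spot the ergonomics mismatch described above.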

Force the G1 Region sizes

As identified and reported by the Quarkus team to our colleagues working on Project Leyden, beyond enforcing a specific garbage collector, one should also ensure that the code stored in AOT archives is being generated with the same G1 region sizes as what’s going to be used at runtime, or one risks segmentation faults caused by it wrongly identifying regions. See https://bugs.openjdk.org/browse/JDK-8335440 for details, or simply set:

Configure G1HeapRegionSize explicitly:

-XX:G1HeapRegionSize=1048576

N.B. you need to use this consistently on both the process generating the AOT archives and the runtime.
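The value above is expressed in bytes, so 1048576 corresponds to regions of 1 MiB. To check what a running JVM ended up with, the HotSpot-specific `com.sun.management.HotSpotDiagnosticMXBean` can read flags by name. The sketch below assumes a HotSpot-based JVM and returns null when the flag is not exposed.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class RegionSizeCheck {
    // Reads a HotSpot flag by name; returns null if this JVM does not expose it.
    static String vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean == null ? null : bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            return null; // unknown flag, or not a HotSpot JVM
        }
    }

    // Converts a size in bytes (as printed by the JVM) to whole MiB.
    static long toMiB(String bytes) {
        return Long.parseLong(bytes) / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("UseG1GC = " + vmOption("UseG1GC"));
        String regionSize = vmOption("G1HeapRegionSize");
        System.out.println(regionSize == null
                ? "G1HeapRegionSize not available"
                : "G1HeapRegionSize = " + regionSize + " bytes (" + toMiB(regionSize) + " MiB)");
    }
}
```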

Failure to terminate in containers

This issue has already been resolved, but in case you’re using an older version of project Leyden and it fails to exit on regular container termination, you might be affected by JDK-8333794.

Workaround for JDK-8333794:

-Djdk.console=java.basebroken

Current status of Project Leyden

There are already experimental early-access builds of Leyden that can be tested based on this draft JEP about Ahead-of-Time Class Linking.

With the Leyden Project, the idea of leveraging a "training run" has been extended to a wider range of data structures embedded in the new AOT cache. Now the cache produced by the AOT process contains the following data:

  • Class file events with historical data (Classes loaded and linked, Compilations)

  • Resolution of API points and indy (stored in constant pool images in the AOT archive). If you have lambdas in your code, they are captured here.

  • Pre-created constant objects in the Java heap (String and Class<?> constants)

  • Execution profiles and some compiled native code (all tiers)

Leyden is also a hot topic at the JVM Language Summit this year; as soon as the recordings of the talks about Leyden are publicly available we’ll add the links here.

Some known limitations

This is an experimental project being developed by multiple teams having different approaches and focuses. Limitations explained here are being worked on at the time of writing this blog post.

One of the main issues is that the functionality is currently only available for the x86_64 and AArch64 architectures.

Also, current developments rely on a flat classpath. If the application is using custom classloaders, then it may not benefit as much as it could as it may miss caching many classes.
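To find out which of your classes are affected, you can ask each class which loader defined it. The helper below is a minimal sketch using only standard `ClassLoader` methods; the labels it returns are our own naming.

```java
final class LoaderKind {
    // Classifies the loader that defined the given class.
    static String of(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        if (loader == null) {
            return "bootstrap"; // core JDK classes report a null loader
        }
        if (loader == ClassLoader.getPlatformClassLoader()) {
            return "platform";
        }
        if (loader == ClassLoader.getSystemClassLoader()) {
            return "app"; // the regular, flat classpath
        }
        return "custom: " + loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("String     -> " + of(String.class));
        System.out.println("LoaderKind -> " + of(LoaderKind.class));
    }
}
```

Classes reported as "custom" are the ones that, as described above, may currently miss out on caching.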

The same happens if the application is intensively using reflection. Quarkus avoids reflection whenever possible, preferring to resolve reflective calls at build time as well - so there’s a nice synergy at play.

However, Quarkus in “fast-jar” mode, which is the default packaging mode, uses a custom classloader which currently gets in the way of some Leyden optimisations. One could use a different packaging mode in Quarkus to get more prominent benefits from Leyden, but doing so would disable other Quarkus optimisations, so the comparison wouldn’t be entirely fair today. We hope to work on improvements in this area to have all possible benefits, combined.

The focus of these first early releases has been on bootstrap times. There are measurable, significant startup time improvements, due to AOT loading and linking. In some cases, these improvements in startup time have worsened the memory footprint of some applications. That’s a known issue that is being worked on, and the expected outcome is to improve memory footprint as well, so we would suggest not worrying too much about total memory consumption at this stage.

Since the AOT archives include machine specific optimisations such as the native code generated by the C2 compiler, the training run and the production run must be done on the same type of hardware and JDK versions; it also requires using the same JAR-based classpaths and the same command line options.

That said, the training run can use a different Main class from the one used for running the application, for example a test class that simulates usage.
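Such a training entry point could look like the sketch below. Everything in it is hypothetical: `priceWithTax` stands in for real application logic, and the iteration count is an arbitrary default. The point is only that the class exercises the code paths you care about and then returns, so the JVM can exit and write its cache.

```java
final class TrainingMain {
    // Hypothetical stand-in for real application logic.
    static int priceWithTax(int netCents, int taxPercent) {
        return netCents + netCents * taxPercent / 100;
    }

    // Exercises the hot path a fixed number of times, then returns.
    static long exercise(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            checksum += priceWithTax(i % 10_000, 20);
        }
        return checksum;
    }

    public static void main(String[] args) {
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 100_000;
        // Printing the result keeps the work observable.
        System.out.println("training checksum: " + exercise(iterations));
    }
}
```

The better such a class mirrors real usage, the more of the production run's loading, linking and compilation work can be found in the cache.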

What is on the roadmap for Leyden?

There’s still work to be done regarding classes that can’t be loaded and linked in AOT with the current implementation, for example classes loaded using a user-defined class loader. There’s also room to improve the way the training runs are made, maybe allowing the user to tweak the results to influence decisions.

Currently, the Z Garbage Collector does not support AOT object archiving. There is an active effort to make sure all Garbage Collectors are compatible with these enhancements.

There are also other things planned in the roadmap for Leyden, like adding condensers. Condensers will be composable transformers that run ahead of time, between compilation and run time, and turn the application into an optimised version of itself while preserving its meaning. Each developer will be able to define a pipeline of condensers that improves their application before it runs; this is very interesting to the Quarkus team but condensers aren’t available yet.

The OpenJDK team is actively extending the range of compiled code that can be saved to and restored from the AOT cache. Our colleagues from Red Hat’s OpenJDK team are directly involved in this effort, looking into save and restore of auxiliary code that is normally generated at runtime and used to provide optimized code for 'intrinsic' methods or to link compiled Java method code to the compiled C code that implements the JVM, the interpreter and other compiled C libraries.

Will Leyden replace GraalVM’s native-image capabilities?

The short answer is no.

If you want the absolute smallest footprint and want to ensure that absolutely no "dynamic" adaptations happen at runtime, GraalVM native images are the way to go. Just think about it: to support the dynamic aspects that the JVM normally provides, even in very minimal form, you would need some code which is able to perform this work, and some memory and some computational resources to run such code and adapt your runtime safely; this is a complex feature and will never be completely free, even if Leyden evolves significantly beyond the current plans.

The architecture of Quarkus enables developers to define an application in strict "closed world" style, and this approach works extremely well in combination with GraalVM native images, but the Quarkus design works indeed very well on the bigger, dynamic JVMs as well.

The ability that Quarkus offers to create a closed world application doesn’t imply that you should necessarily be doing so; in fact there are many applications which could benefit from a bit more dynamism, some more runtime configurability or auto-adaptability, and Quarkus also allows you to create such applications while still benefiting from very substantial efficiency improvements over competing architectures, and even over competing runtimes and languages.

We’re very excited by Project Leyden as it allows us to substantially improve bootstrap times, warmup times, and overall costs even for the "regular" JVM, while retaining all the benefits of a dynamic runtime and an adaptive JIT compiler. This will be a fantastic option for all those applications for which a fully AOT native image might not be suitable: you’ll get some of the benefits of native-image (not all of them) essentially for free, with no drawbacks.

We also hope it will bring better defined semantics with regard to running certain phases “ahead of time” (or later); there is a very interesting read on this topic by Mark Reinhold: Selectively Shifting and Constraining Computation. From the perspective of Quarkus developers, we can confirm that improvements in the language specification in this area would be very welcome, and would also improve the quality and maintainability of applications compiled with GraalVM native-image.

For these reasons, Quarkus will definitely not deprecate support for native images; it’s more plausible that, eventually, the "full JVM" will always be benefiting from Leyden powered improvements, and as usual we’ll work to make these benefits work in synergy with our architecture, and at minimal effort for you all.

Essentially both the JVM and the native-image options are bound to benefit from this initiative. It’s a great time to be a Java developer!

How can I make sure this will work for me?

The best way to make sure your application benefits from Leyden is to start experimenting early and participate in the development. It would be great to add real-world feedback from a perspective of Quarkus users.

If you spend some time testing your application with the early-access builds of Leyden and reporting any bugs or weird behaviour, the developers will be able to take your specific needs into account.

The OpenJDK issue tracker isn’t open to everyone, but you’re also very welcome to provide feedback on our Quarkus channels; we can then relay any suggestions to our colleagues who are directly working on project Leyden. You can also use the Leyden mailing list.