Lufthansa Technik AVIATAR experiences significant cloud resources savings by moving to Kubernetes-native Quarkus
Lufthansa Technik, the world’s largest independent provider of airline maintenance, repair, and overhaul (MRO) services, runs a SaaS digital platform, called AVIATAR, for the aviation industry. This platform helps airlines avoid delays and cancellations by using data to better organize and schedule maintenance. The company built and operated AVIATAR using a hybrid cloud infrastructure based on enterprise open source software from Red Hat.
In the last 3 years, AVIATAR business has been growing fast and they needed to accommodate the growing demand from their customers. To this end, they had to grow their development force from 5 developers when they first started to over 100 at present. As they grew, they realized that one single large team was not the most productive and efficient way to organize their software development efforts because many developers had to spend time waiting for each other’s work to be finished due to the many interdependencies in the overall system. To address this situation, they decided to split development from one team to several cross-functional teams. At the same time, they also started working on ways to make the newly created smaller development teams more independent by giving each team autonomy to run their own services. This resulted in their evolution into a microservices architecture, where most of their microservices were based on Spring Boot and Java EE. They have gone from 10 services in the beginning to over 100 as of this writing.
The small and autonomous development teams have been able to take responsibility for their own services from development all the way to production to achieve more agility and respond faster to the business.
Lufthansa Technik run their microservices on OpenShift on Azure and, as they were looking at different ways to scale their development efforts, they were also looking at ways to save cloud resource consumption. As they were migrating to microservices, they noticed microservices were consuming high memory and compute cloud resources. For high-availability and emergency procedure purposes, they run at least 3 instances of each microservice on the cloud which means that for each of these microservices, there’s a 3x cloud resource consumption rate. For example, one of their microservices was consuming ½ core plus 1 GB of RAM per instance, which required 1.5 cores and 3 GB of RAM when running it in HA on the cloud (3 instances).
Thorsten Pohl, Product Owner Automation & Platform Architect at Digital Product Division AVIATAR, first heard about Quarkus and its benefits in April 2019 at Red Hat Summit. Among its many benefits, the ones that piqued his interest were its low memory consumption and fast times to first response, both in JVM and native modes. He took this information back to AVIATAR and they decided to try it out. There were two initial microservices that their Technology Council recommended for a Quarkus tryout. The first one would be a brand new microservice called the “Customer Configuration” service, and the second one would be the “Service Discovery” service which would be a migration from a service running in an application server to Quarkus.
This service is for a customer to set up their own settings, e.g. desired level of prediction. This service is targeted for their managed customers and it was selected to be developed in Quarkus because it was low risk. From their perspective, if this service went down, there would be no major impact to customers. It was developed by 2 developers using Quarkus 0.20 in a single sprint (approximately 3 weeks) and they are planning to upgrade it to Quarkus v1.x. This service is currently running in native mode in production.
The Service Discovery service is used to allow automatic discovery between the microservices AVIATAR consist of. It is considered high risk because if it breaks down, it would have a major impact on customers. Also, the original version of this service ran in production as a highly available service in an application server, consuming a lot of cloud resources. The Quarkus version of this service has been running in native mode in Development for about 3 months with no problems and on January 18, 2020, it was deployed to production to replace the instance running on the application server. It should also be mentioned that this Quarkus service started in JVM mode because it was using MongoDB and there was no MongoDB client Quarkus extension when its development started. But as soon as the MongoDB client Quarkus extension became available, they were able to switch the entire service to native mode. This speaks to the fast innovation and new contributions that are part of the Quarkus open source community project.
Lufthansa Technik sees Quarkus as a great fit in their journey to running their services in a serverless mode. They have many services that aren’t invoked that often but they still need to have 3 instances of each up and running all the time for high-availability requirements leading to high cloud resource consumption costs. They plan to turn these seldom accessed services to Function-as-a-service calls so that they can be invoked on demand leading to a reduction of cloud. If they were using Spring Boot, it would take too long to boot up, making it prohibitive to use in a serverless mode.
Likewise, they have experienced lower memory and compute cloud resource consumption when using Quarkus - plus its use of GraalVM - and according to Thorsten, “with Quarkus, they could run 3 times denser deployments without sacrificing availability and response times of services”, as the denser deployments come from the combination of the two technologies.
Their developers are Spring Boot developers with Java EE experience as well, so the learning curve for Quarkus was very small since its syntax and approach was “close to what our developers are already doing and it’s familiar to them. This is a big benefit”, Thorsten affirmed.
With respect to the recently introduced Spring API compatibility feature in Quarkus, Thorsten said that “they may use the Spring API compatibility in Quarkus when migrating their current Spring Boot microservices to Quarkus. However, for developing new microservices, they plan to use just the Quarkus APIs directly because it would be awkward to use another API within Quarkus.”
Among the many benefits that Quarkus provides to Lufthansa Technik, Thorsten mentioned that “they could cut cloud resource costs threefold." And the same goes with OpenShift because of the higher density you can achieve on each core using Quarkus. For example, they had a microservice that was consuming ½ core plus 1 GB of RAM per instance, which required 1.5 cores and 3 GB of RAM when running it in HA on the cloud (3 instances). When using the Quarkus version of the same microservice, its consumption was of 200 millicores plus 200-400 MB of RAM per instance. This translates to 0.6 cores plus 600 MB – 1.2 GB of RAM for an HA deployment of 3 instances of the microservice “. They could run 3 times denser deployments without sacrificing availability and response times of services”, Thornsten reiterated. These are the types of optimizations that can only be achieved by the symbiotic combination of Quarkus and GraalVM.
Thorsten also described Quarkus live coding capabilities as a “really good thing”. Many of their applications have web-based user interfaces and “making changes and reloading pages instantaneously is a great feature”, Thorsten affirmed. Another benefit, already mentioned earlier, was the small Quarkus learning curve experienced by their developers, who are Spring Boot developers with Java EE experience. What makes this possible is the stack of technologies included in Quarkus, composed of best-of-breed and familiar technologies for Kubernetes-native microservices. Some of the Quarkus extensions used by the AVIATAR developers are: Java Web Token (JWT), JAX-RS, MongoDB client, MicroProfile Rest Client, Keycloak (security), Hibernate ORM (for relational databases), MicroProfile Metrics, and MicroProfile Health Check.
They plan to use Quarkus for the development of new services per the guidance of their Technology Council. In general, for new services they’d like to first work on the ones that are low or no risk to customers. They also plan to upgrade their Service Discovery service to Quarkus v1.x and deploy it to production, which actually took place on January 18, 2020. Lastly, they will use the Quarkus APIs directly and for migrating Spring Boot services to Quarkus, they may leverage the Quarkus Spring API compatibility feature.
They look forward to continuing to optimize their cloud resource consumption by using the Quarkus stack in their services.
For more information on Quarkus:
Quarkus website: http://quarkus.io
Quarkus GitHub project: https://github.com/quarkusio/quarkus
Quarkus Twitter: https://twitter.com/QuarkusIO
Quarkus chat: https://quarkusio.zulipchat.com/
Quarkus mailing list: https://groups.google.com/forum/#!forum/quarkus-dev