Solving problems with Quarkus extensions (1/n)

This is the first post of what I hope will be a series of several articles showing how you can solve complex problems by leveraging the unique Quarkus build infrastructure and extension framework.

First things first, bootstraping a Quarkus extension is easy: in one command, you can get it scaffolded and get started on the actual implementation. But that’s not the subject of this post!

An extension, apart from providing some runtime code to your application, also allows to tweak the build of your application and do all sorts of things at the build level. This is what we will focus on in this series.

Problem of the day: to ensure binary compatibility, the Hub4j GitHub API introduces some bridge methods that confuse Mockito and more specifically ByteBuddy and ultimately make our tests unreliable. How can we solve that?

Some context

You might have heard about my Quarkus GitHub App extension that allows you to develop GitHub Apps based on Quarkus at light speed with very little boilerplate (shameless ad: it is awesome!).

My dear colleague Yoann Rodière (he is awesome too!) wrote some test infrastructure for it based on Mockito (which uses ByteBuddy under the hood). That was all good until we started noticing confusing and non reproducible failures in our tests with Mockito sometimes not actually calling the method we would expect.

The source of the problem is that, to ensure binary compatibility, the Hub4j GitHub API we use in Quarkus GitHub App introduces bridge methods in the bytecode.

For instance, let’s take this method of the GitHub class of the GitHub API:

    @WithBridgeMethods(value = GHUser.class)
    public GHMyself getMyself() throws IOException {
        client.requireCredential();
        return setMyself();
    }

Historically, it used to return a GHUser but, in newer versions, it returns a GHMyself, which broke the binary compatibility.

To restore it and with the help of the @WithBridgeMethods annotation, the GitHub API build will create two methods in the bytecode: one returning GHMyself and one returning GHUser. This is very useful if you have compiled your application with an old version of the GitHub API and you just want to use the new version without recompiling your application. Typically, in the case of Jenkins, you can switch to a new version of the GitHub API without having to recompile all the Jenkins plugins using GitHub API.

At the bytecode level, you end up with something equivalent to the following:

    public GHMyself getMyself() throws IOException {
        client.requireCredential();
        return setMyself();
    }

    public GHUser getMyself() throws IOException {
        return getMyself(); (1)
    }
1 invokevirtual of getMyself() that returns GHMyself

And if your existing compiled code calls GHUser getMyself(), it will still work after the change of return type.

This bridge methods approach solves a real problem and it’s not that big of a deal as it’s fully transparent for the developer…​ except when you start using Mockito due to a ByteBuddy issue: ByteBuddy can get confused if there are several methods with the same signature but different return types.

ByteBuddy is an amazing library and this blog post should not be seen as a critique of ByteBuddy. This is an extreme corner case that doesn’t happen with standard bytecode.

This issue was causing our tests to be unreliable because sometimes ByteBuddy was choosing the wrong method to apply Mockito magic.

How can we work around this?

In the case of Quarkus GitHub App, we don’t really care about binary compatibility: when upgrading to a new version of the GitHub API, users will rebuild their application.

So given these bridge methods are problematic, one solution would be to get rid of them.

Obviously, we could fork the GitHub API and avoid generating the bridge methods.

But forking and maintaining a fork forever is definitely not something we should consider if we can avoid it. Especially since we want to continue benefiting from all the future improvements of the GitHub API.

So could we somehow keep the library standard but have Quarkus adjust the bytecode when building the application?

If you are in a rush, the short answer is yes. Now let’s go for the (not so) long answer.

Let’s identify the methods

In Quarkus, we can index the annotations with Jandex so, in a perfect world, we would index the GitHub API jar with Jandex (which we already do for other purposes) and interrogate Jandex to get all the methods annotated with @WithBridgeMethods:

Collection<AnnotationInstance> withBridgeMethodsAnnotations =
    index.getAnnotations(DotName.createSimple(WithBridgeMethods.class.getName));

Unfortunately, @WithBridgeMethods has a CLASS retention policy - which makes perfect sense for its usage - and Jandex only considers annotations with a RUNTIME retention policy.

This limitation will be alleviated in Jandex 3 but, for the time being, we cannot use Jandex.

Unfortunately, until then, we don’t have many options here: we have to list the methods manually.

For more flexibility, we introduced a BuildItem:

public final class GitHubApiClassWithBridgeMethodsBuildItem extends MultiBuildItem {

    private final String className;
    private final Set<String> methodNames;

    GitHubApiClassWithBridgeMethodsBuildItem(String className, String... methodsWithBridges) {
        this.className = className;
        this.methodNames = new HashSet<>(Arrays.asList(methodsWithBridges));
    }

    public String getClassName() {
        return className;
    }

    public Set<String> getMethodsWithBridges() {
        return methodNames;
    }
}

And we will produce a GitHubApiClassWithBridgeMethodsBuildItem for each class:

// ...

classesWithBridgeMethods.produce(new GitHubApiClassWithBridgeMethodsBuildItem(
        "org.kohsuke.github.GHPullRequestCommitDetail$Commit", "getAuthor", "getCommitter"));

// ...

Once this is done, we are able to consume the GitHubApiClassWithBridgeMethodsBuildItem from any Quarkus @BuildStep so this list is generally available to the Quarkus build.

I won’t go into detail on the Quarkus build process but the principle of it is extremely simple:

  • It is composed of build steps (methods annotated with @BuildStep).

  • A build step can consume build items.

  • A build step produces build items.

  • Then it is just a matter of resolving the dependencies of the build steps to get to the final result: your application.

You can learn more about it in the Writing extensions guide.

Removing the methods

Now that we have the list of methods handy, the next step is to remove them.

To manipulate bytecode during the build, Quarkus offers the BytecodeTransformerBuildItem. Adjusting the bytecode of a class is just a matter of producing one for the given class.

For instance, to remove the bridge methods from our GitHub API methods, we add the following build step to our extension:

@BuildStep
void removeCompatibilityBridgeMethodsFromGitHubApi(
        BuildProducer<BytecodeTransformerBuildItem> bytecodeTransformers, (1)
        List<GitHubApiClassWithBridgeMethodsBuildItem> gitHubApiClassesWithBridgeMethods) { (2)

    for (GitHubApiClassWithBridgeMethodsBuildItem gitHubApiClassWithBridgeMethods : gitHubApiClassesWithBridgeMethods) {
        bytecodeTransformers.produce(new BytecodeTransformerBuildItem.Builder()
                .setClassToTransform(gitHubApiClassWithBridgeMethods.getClassName())
                .setVisitorFunction((ignored, visitor) -> new RemoveBridgeMethodsClassVisitor(visitor,
                        gitHubApiClassWithBridgeMethods.getClassName(),
                        gitHubApiClassWithBridgeMethods.getMethodsWithBridges()))
                .build());
    }
}
1 We are going to produce BytecodeTransformerBuildItems.
2 We consume the previously produced GitHubApiClassWithBridgeMethodsBuildItems.

RemoveBridgeMethodsClassVisitor is a classic ASM ClassVisitor that will modify the bytecode:

class RemoveBridgeMethodsClassVisitor extends ClassVisitor {

    private final String className;
    private final Set<String> methodsWithBridges;

    public RemoveBridgeMethodsClassVisitor(ClassVisitor visitor, String className, Set<String> methodsWithBridges) {
        super(Gizmo.ASM_API_VERSION, visitor);

        this.className = className;
        this.methodsWithBridges = methodsWithBridges;
    }

    @Override
    public MethodVisitor visitMethod(int access, String name, String descriptor, String signature, String[] exceptions) {
        if (methodsWithBridges.contains(name) && ((access & Opcodes.ACC_BRIDGE) != 0)
                && ((access & Opcodes.ACC_SYNTHETIC) != 0)) { (1)

            return null; (2)
        }

        return super.visitMethod(access, name, descriptor, signature, exceptions); (3)
    }
}
1 If the method name matches and the method is a bridge and synthetic method…​
2 …​ we remove it from the bytecode by returning null.
3 If not, we just delegate to the superclass method that will incorporate the method in the bytecode.

And that’s it!

During the build process, Quarkus will create a class file containing the modified bytecode and will use it instead of the class coming from the GitHub API jar. Thus the bridge methods we wanted to remove will never be visible to ByteBuddy.

Conclusion

At conferences, we often say that Quarkus is doing things differently from other frameworks and that the magic relies in its innovative build process.

This build process is the key to the low memory footprint and fast startup times of Quarkus.

But it is also a very powerful tool to customize the build of your applications.