
Step 06 - Multimodal Agents

New Requirement: Visual Car Inspection

In Step 5, you implemented the Human-in-the-Loop pattern for safe, controlled disposition decisions. The system relies entirely on textual feedback from employees returning cars. But what if the person returning the car could also upload a photo?

The Miles of Smiles management team wants to enhance the rental return process:

Allow employees to optionally upload an image of the car when returning it, so the system can automatically enrich the rental feedback with visual observations.

This is a common real-world scenario where:

  1. Text alone is insufficient: An employee might write “car looks fine” but a photo reveals scratches or dents they missed
  2. Multimodal AI is powerful: Modern LLMs can analyze images alongside text to provide richer assessments

You’ll learn how to integrate multimodal capabilities (text + image) into your existing agentic workflow using LangChain4j’s ImageContent.


What You’ll Learn

In this step, you will:

  • Add image upload to the rental return form using multipart form data
  • Convert uploaded images to LangChain4j’s ImageContent for multimodal processing
  • Create a CarImageAnalysisAgent that analyzes car images and enriches rental feedback
  • Integrate the new agent at the beginning of the existing CarProcessingWorkflow sequence
  • Understand how ImageContent flows through agent parameters using @UserMessage
  • Understand how marking an agent as optional lets the workflow skip it entirely when one of its inputs is missing
  • See how the agent gracefully handles the absence of an image, returning the feedback unchanged

Understanding Multimodal Agents

What is Multimodal Processing?

Multimodal processing allows an AI agent to work with multiple types of content simultaneously — in this case, text and images. Instead of just reading feedback like “the car has some damage”, the agent can also see the car and identify specific issues.

How LangChain4j Handles Images

LangChain4j provides the ImageContent class to represent image data in messages sent to the LLM:

  • ImageContent: Wraps an image (as base64-encoded data with a MIME type) as a content part
  • When passed as a method parameter annotated with @UserMessage, it is automatically included alongside text in the message sent to the LLM
  • The LLM receives both the text prompt and the image, enabling visual reasoning

The Enrichment Pattern

Rather than creating a separate “image analysis” output, the CarImageAnalysisAgent uses an enrichment pattern:

  1. Receives the original rental feedback text and an optional car image
  2. If an image is present, analyzes it and appends visual observations to the feedback
  3. If no image is present, returns the feedback unchanged
  4. The enriched feedback then flows into the existing FeedbackAnalysisWorkflow — no downstream changes needed

This is elegant because it preserves the existing workflow structure while adding new capabilities.
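
As a minimal sketch (hypothetical method names, not the actual agent code), the enrichment pattern boils down to a function that either augments its input or passes it through:

```java
// Hypothetical sketch of the enrichment pattern, not the real agent:
// augment the feedback when observations exist, otherwise pass it through
// unchanged so downstream consumers need no special handling.
public class EnrichmentSketch {

    static String enrich(String feedback, String imageObservations) {
        if (imageObservations == null || imageObservations.isBlank()) {
            return feedback; // no image: feedback flows through untouched
        }
        return feedback + " " + imageObservations; // image: append observations
    }

    public static void main(String[] args) {
        System.out.println(enrich("car looks fine", null));
        System.out.println(enrich("car looks fine", "Visible scratch on the left door."));
    }
}
```

Because the return type is the same as the input type, callers never need to know whether enrichment happened.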

Why ImageContent Stays Separate:

ImageContent is passed as a separate parameter alongside the String feedback:

  • ImageContent is a special LangChain4j type for multimodal AI, not simple text data
  • It’s only used by the image analysis agent, not by other agents in the workflow
  • Keeping it separate maintains the clean separation between feedback text and multimodal content

What Are We Going to Build?

We’re enhancing the car management system with multimodal image analysis:

  1. Update the UI: Add an image upload field for rented cars in the Fleet Status grid
  2. Update the REST endpoint: Accept multipart form data with an optional image
  3. Convert to ImageContent: Transform the uploaded file into a LangChain4j ImageContent
  4. Create CarImageAnalysisAgent: A new agent that analyzes car images
  5. Update the workflow: Insert the new agent at the beginning of the sequence

The Updated Architecture:

graph TB
    Start([Car Return with optional image]) --> A[CarProcessingWorkflow<br/>Sequential]

    A --> IMG[Step 1: CarImageAnalysisAgent<br/>Image Analysis]
    IMG -->|enriched feedback| B[Step 2: FeedbackAnalysisWorkflow<br/>Parallel Mapper]
    B --> B1[FeedbackTask.cleaning()]
    B --> B2[FeedbackTask.maintenance()]
    B --> B3[FeedbackTask.disposition()]
    B1 --> BA[FeedbackAnalysisAgent]
    B2 --> BA
    B3 --> BA
    BA --> BEnd[FeedbackAnalysisResults]

    BEnd --> C[Step 3: FleetSupervisorAgent<br/>Autonomous Orchestration]
    C --> CEnd[Supervisor Decision]

    CEnd --> D[Step 4: CarConditionFeedbackAgent<br/>Final Summary]
    D --> End([Updated Car])

    style A fill:#90EE90
    style IMG fill:#E8B4F8
    style B fill:#87CEEB
    style C fill:#FFB6C1
    style D fill:#90EE90
    style Start fill:#E8E8E8
    style End fill:#E8E8E8

The Key Innovation:

The CarImageAnalysisAgent sits at the beginning of the sequence, before the FeedbackAnalysisWorkflow. Its output key is feedback, which means it replaces the original rental feedback in the agentic scope with the enriched version. All downstream agents automatically receive the enriched feedback without any code changes.
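
A minimal sketch of why writing to an existing scope key upgrades every downstream reader (a plain map standing in for the agentic scope, not the LangChain4j internals):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the agentic-scope overwrite, assuming the scope behaves like a
// key/value map (an illustration, not the framework's real storage).
public class ScopeSketch {
    public static void main(String[] args) {
        Map<String, Object> scope = new HashMap<>();
        scope.put("feedback", "small dent on the rear bumper"); // original input

        // The image agent's outputKey matches the existing key, so its result
        // overwrites the original value ...
        scope.put("feedback", scope.get("feedback") + " Image also shows a scratched fender.");

        // ... and every later agent that reads "feedback" sees the enriched text.
        System.out.println(scope.get("feedback"));
    }
}
```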


Prerequisites

Before starting:

  • Completed Step 05 — This step builds on Step 5’s architecture
  • Application from Step 05 is stopped (Ctrl+C)
  • Understanding of the existing CarProcessingWorkflow sequence

Part 1: Update the UI for Image Upload

Update the JavaScript

The action cell for all actionable cars in populateFleetStatusTable now includes a file input for optional image upload:

app.js (action cell in populateFleetStatusTable)
if (car.status === 'RENTED' || car.status === 'AT_CLEANING' || car.status === 'IN_MAINTENANCE') {
    actionCell = `
        <td>
            <form onsubmit="processFeedback(event, ${car.id}, '${car.status}')">
                <input type="file" id="car-image-${car.id}" accept="image/*">
                <input type="text" class="feedback-input" id="feedback-${car.id}" placeholder="Enter feedback">
                <button type="submit" class="return-button">Return</button>
            </form>
        </td>`;
}

The processFeedback function is updated to send a FormData object (multipart) instead of a simple query parameter, and now uses a single consolidated endpoint for all car returns:

app.js (processFeedback with FormData)
const imageInput = document.getElementById(`car-image-${carId}`);
const formData = new FormData();
formData.append('feedback', feedback);
if (imageInput && imageInput.files.length > 0) {
    formData.append('carImage', imageInput.files[0]);
}

fetch(`/car-management/return/${carId}`, {
    method: 'POST',
    body: formData
})

Key Points:

  • Uses FormData for multipart encoding — all statuses use the same endpoint and format
  • The image is only appended if the user selected a file
  • No Content-Type header is set — the browser automatically adds multipart/form-data with the correct boundary
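
To make the boundary mechanics concrete, here is roughly what the browser-generated request body looks like; the boundary token and file name below are invented for illustration:

```java
// Illustration of a multipart/form-data body as produced from FormData.
// The browser chooses a random boundary and advertises it in the
// Content-Type header; the values here are made up for readability.
public class MultipartBodySketch {
    public static void main(String[] args) {
        String boundary = "----demoBoundary1234";
        String body =
            "--" + boundary + "\r\n"
            + "Content-Disposition: form-data; name=\"feedback\"\r\n"
            + "\r\n"
            + "Customer mentioned a minor scratch\r\n"
            + "--" + boundary + "\r\n"
            + "Content-Disposition: form-data; name=\"carImage\"; filename=\"car.jpg\"\r\n"
            + "Content-Type: image/jpeg\r\n"
            + "\r\n"
            + "<binary image bytes>\r\n"
            + "--" + boundary + "--\r\n"; // closing boundary ends the body
        System.out.println(body);
    }
}
```

On the server side, @RestForm String feedback and @RestForm FileUpload carImage each map to one of these parts by name.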

Part 2: Update the REST Endpoint

Accept Multipart Form Data

Update src/main/java/com/carmanagement/resource/CarManagementResource.java to accept the image as a FileUpload and convert it to ImageContent:

CarManagementResource.java
package com.carmanagement.resource;

import jakarta.inject.Inject;
import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;
import jakarta.ws.rs.core.Response;

import java.io.IOException;
import java.nio.file.Files;
import java.util.Base64;

import org.jboss.resteasy.reactive.RestForm;
import org.jboss.resteasy.reactive.multipart.FileUpload;

import dev.langchain4j.data.message.ImageContent;
import io.quarkus.logging.Log;
import io.smallrye.common.annotation.Blocking;
import io.smallrye.mutiny.Uni;

import com.carmanagement.service.CarManagementService;

/**
 * REST resource for car management operations.
 * Uses blocking processing for AI agent workflows.
 */
@Path("/car-management")
public class CarManagementResource {

    @Inject
    CarManagementService carManagementService;

    /**
     * Process a car return from any status (rental, cleaning, or maintenance).
     * This is a blocking operation due to AI agent processing.
     *
     * @param carNumber The car number
     * @param feedback Optional feedback about the return
     * @param carImage Optional image of the car being returned (multipart form data)
     * @return Uni that completes with the result
     */
    @POST
    @Path("/return/{carNumber}")
    @Consumes(MediaType.MULTIPART_FORM_DATA)
    @Blocking
    public Uni<Response> processReturn(Integer carNumber, @RestForm String feedback, @RestForm FileUpload carImage) {
        ImageContent imageContent = toImageContent(carImage);

        return carManagementService.processCarReturn(carNumber, feedback != null ? feedback : "", imageContent)
            .onItem().transform(result -> Response.ok(result).build())
            .onFailure().recoverWithItem(e -> {
                Log.error(e.getMessage(), e);
                return Response.status(Response.Status.INTERNAL_SERVER_ERROR)
                        .entity("Error processing car return: " + e.getMessage())
                        .build();
            });
    }

    @GET
    @Path("/report")
    @Produces(MediaType.TEXT_HTML)
    public Response report() {
        return Response.ok(carManagementService.report()).build();
    }

    private ImageContent toImageContent(FileUpload fileUpload) {
        if (fileUpload == null || fileUpload.filePath() == null) {
            return null;
        }
        try {
            byte[] bytes = Files.readAllBytes(fileUpload.filePath());
            String base64 = Base64.getEncoder().encodeToString(bytes);
            String mimeType = fileUpload.contentType();
            return new ImageContent(base64, mimeType);
        } catch (IOException e) {
            Log.error("Failed to read uploaded car image", e);
            return null;
        }
    }
}

Let’s break it down:

@Consumes(MediaType.MULTIPART_FORM_DATA)

The consolidated return endpoint now consumes multipart form data instead of query parameters and handles returns from any car status:

@POST
@Path("/return/{carNumber}")
@Consumes(MediaType.MULTIPART_FORM_DATA)
@Blocking
public Uni<Response> processReturn(Integer carNumber,
        @RestForm String feedback, @RestForm FileUpload carImage) {
  • @RestForm: Extracts form fields from the multipart request
  • FileUpload: RESTEasy Reactive’s type for handling uploaded files
  • The endpoint delegates to CarManagementService, which processes the return based on the car’s current status

The toImageContent Helper

private ImageContent toImageContent(FileUpload fileUpload) {
    if (fileUpload == null || fileUpload.filePath() == null) {
        return null;
    }
    try {
        byte[] bytes = Files.readAllBytes(fileUpload.filePath());
        String base64 = Base64.getEncoder().encodeToString(bytes);
        String mimeType = fileUpload.contentType();
        return new ImageContent(base64, mimeType);
    } catch (IOException e) {
        Log.error("Failed to read uploaded car image", e);
        return null;
    }
}
  • Reads the uploaded file and converts it to base64-encoded data
  • Creates an ImageContent with the base64 data and the file’s MIME type (e.g., image/jpeg, image/png)
  • Falls back to null when no image is provided
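
The base64 step can be checked in isolation with JDK classes only (no LangChain4j on the classpath); the model provider decodes exactly the bytes that were uploaded:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Standalone round-trip of the encoding used in toImageContent: encoding then
// decoding yields the original bytes, so no image data is lost in transit.
public class Base64RoundTrip {
    public static void main(String[] args) {
        byte[] original = "fake-image-bytes".getBytes(StandardCharsets.UTF_8);
        String base64 = Base64.getEncoder().encodeToString(original);
        byte[] decoded = Base64.getDecoder().decode(base64);
        System.out.println(base64);
        System.out.println(new String(decoded, StandardCharsets.UTF_8));
    }
}
```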

Part 3: Pass the Image Through the Service Layer

Update src/main/java/com/carmanagement/service/CarManagementService.java

Add ImageContent as a parameter and forward it to the workflow:

CarManagementService.java
package com.carmanagement.service;

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.transaction.Transactional;

import com.carmanagement.agentic.workflow.CarProcessingWorkflow;
import com.carmanagement.model.CarConditions;
import com.carmanagement.model.CarInfo;
import com.carmanagement.model.CarStatus;
import com.carmanagement.model.FeedbackTask;
import dev.langchain4j.data.message.ImageContent;
import io.quarkus.logging.Log;
import io.smallrye.mutiny.Uni;

import java.util.List;

import static dev.langchain4j.agentic.observability.HtmlReportGenerator.generateReport;

/**
 * Service for managing car returns from various operations.
 * Uses async processing to handle Human-in-the-Loop workflow pauses.
 */
@ApplicationScoped
public class CarManagementService {

    @Inject
    CarProcessingWorkflow carProcessingWorkflow;

    /**
     * Process a car return from any operation.
     * This method runs asynchronously to handle workflow pauses for human approval.
     * 
     * @param carNumber The car number
     * @param feedback Optional feedback
     * @param carImage Optional image of the car
     * @return Uni that completes with the result of the processing
     */
    public Uni<String> processCarReturn(Integer carNumber, String feedback, ImageContent carImage) {

        return Uni.createFrom().item(() -> {
            CarInfo carInfo = findCarInfo(carNumber);
            if (carInfo == null) {
                return "Car not found with number: " + carNumber;
            }

            // Create the list of feedback tasks for parallel analysis
            List<FeedbackTask> tasks = List.of(
                    FeedbackTask.cleaning(),
                    FeedbackTask.maintenance(),
                    FeedbackTask.disposition()
            );

            // Process the car return using the workflow with supervisor
            // This may PAUSE if human approval is needed
            CarConditions carConditions = carProcessingWorkflow.processCarReturn(
                    tasks,
                    carInfo,
                    carNumber,
                    feedback,
                    carImage);

            Log.info("CarConditionFeedbackAgent updating...");

            // Update the car's condition with the result from CarConditionFeedbackAgent
            carInfo.condition = carConditions.generalCondition();

            // Update the car status based on the required action
            switch (carConditions.carAssignment()) {
                case DISPOSITION:
                    carInfo.status = CarStatus.PENDING_DISPOSITION;
                    Log.info("Car marked for disposition - awaiting final decision");
                    break;
                case MAINTENANCE:
                    carInfo.status = CarStatus.IN_MAINTENANCE;
                    break;
                case CLEANING:
                    carInfo.status = CarStatus.AT_CLEANING;
                    break;
                case NONE:
                    carInfo.status = CarStatus.AVAILABLE;
                    break;
            }

            // Persist the changes to the database in a separate transaction
            updateCarInfo(carInfo);

            return carConditions.generalCondition();
        }).runSubscriptionOn(io.smallrye.mutiny.infrastructure.Infrastructure.getDefaultWorkerPool());
    }

    /**
     * Find car info in a read-only transaction
     */
    @Transactional(Transactional.TxType.REQUIRES_NEW)
    CarInfo findCarInfo(Integer carNumber) {
        return CarInfo.findById(carNumber);
    }

    /**
     * Update car info in a separate transaction after workflow completes.
     * Uses merge to handle detached entity from the workflow.
     */
    @Transactional(Transactional.TxType.REQUIRES_NEW)
    void updateCarInfo(CarInfo carInfo) {
        // Merge the detached entity back into the persistence context
        CarInfo.getEntityManager().merge(carInfo);
    }

    public String report() {
        return generateReport(carProcessingWorkflow.agentMonitor());
    }
}

The image is passed straight through to the workflow alongside the feedback string:

CarConditions carConditions = carProcessingWorkflow.processCarReturn(
        tasks,
        carInfo,
        carNumber,
        feedback,
        carImage);

Part 4: Create the CarImageAnalysisAgent

This is the core of this step — a new agent that processes car images.

Create src/main/java/com/carmanagement/agentic/agents/CarImageAnalysisAgent.java:

CarImageAnalysisAgent.java
package com.carmanagement.agentic.agents;

import dev.langchain4j.agentic.Agent;
import dev.langchain4j.data.message.ImageContent;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.V;

/**
 * Agent that analyzes a car image and enriches the rental feedback with visual observations.
 * If no image is provided, the rental feedback is returned unchanged.
 */
public interface CarImageAnalysisAgent {

    @SystemMessage("""
        You are a car image analyst for a car rental company.
        You will receive the current rental feedback for a car being returned.
        If an image of the car is provided, analyze it and rewrite the rental feedback taking account of
        your visual observations about the car's condition (e.g., visible damage, scratches, dents,
        cleanliness issues, tire condition, etc.).
        Do not append your visual observations as a separate section of the response; instead combine
        the existing rental feedback, if present, with what you can see in the image into a single response.
        If no image is provided, or the image is empty or does not seem related to a car,
        simply return the rental feedback exactly as it is, without any modification.
        Your response must always include the original rental feedback text followed by your observations if any.
        In any case, the returned response MUST be a single sentence.
        """)
    @UserMessage("""
        Feedback: {feedback}
        """)
    @Agent(description = "Car image analyzer. Enriches rental feedback with visual observations from a car image.",
            outputKey = "feedback", optional = true)
    String analyzeCarImage(String feedback, @UserMessage @V("carImage") ImageContent carImage);
}

Let’s break it down:

The @SystemMessage

@SystemMessage("""
    You are a car image analyst for a car rental company.
    You will receive the current rental feedback for a car being returned.
    If an image of the car is provided, analyze it and rewrite the rental feedback taking account of
    your visual observations about the car's condition (e.g., visible damage, scratches, dents,
    cleanliness issues, tire condition, etc.).
    If no image is provided, return the rental feedback exactly as it is, without any modification.
    Your response must always include the original rental feedback text followed by your observations if any.
    """)

The system message instructs the LLM to:

  • Analyze the image if one is provided, looking for visible damage, cleanliness issues, etc.
  • Preserve the original feedback — always include it in the response
  • Be a no-op when there’s no image — return the feedback unchanged

The @UserMessage and ImageContent Parameter

@UserMessage("""
    Feedback: {feedback}
    """)
String analyzeCarImage(String feedback, @UserMessage @V("carImage") ImageContent carImage);

Note that the @UserMessage annotation on the ImageContent parameter tells LangChain4j to include the image as an additional content part in the user message sent to the LLM. This is a special use of @UserMessage specific to multimodal content: the LLM receives both the text template and the image in a single message, enabling multimodal reasoning. The @V annotation is also needed here to specify the variable name the parameter binds to in the @UserMessage template.

The outputKey and the optional flag

@Agent(description = "Car image analyzer. Enriches rental feedback with visual observations from a car image.",
        outputKey = "feedback", optional = true)

The agent’s output key is feedback, which means its result replaces the feedback value in the agentic scope. All subsequent agents in the workflow (FeedbackAnalysisWorkflow, FleetSupervisorAgent, etc.) will automatically receive the enriched feedback. Setting optional to true lets the workflow skip the agent’s invocation entirely when not all of its required parameters are provided; in this case, the agent is skipped when no image is supplied.


Part 5: Update the Workflow

Add the Agent to the Sequence

Update CarProcessingWorkflow.java to include CarImageAnalysisAgent as the first sub-agent and add the ImageContent parameter:

CarProcessingWorkflow.java
package com.carmanagement.agentic.workflow;

import com.carmanagement.agentic.agents.CarConditionFeedbackAgent;
import com.carmanagement.agentic.agents.CarImageAnalysisAgent;
import com.carmanagement.agentic.agents.FleetSupervisorAgent;
import com.carmanagement.model.CarConditions;
import com.carmanagement.model.CarInfo;
import com.carmanagement.model.FeedbackTask;
import dev.langchain4j.agentic.declarative.Output;
import dev.langchain4j.agentic.declarative.SequenceAgent;
import dev.langchain4j.agentic.observability.MonitoredAgent;
import dev.langchain4j.data.message.ImageContent;
import io.quarkus.logging.Log;

import java.util.List;

/**
 * Workflow for processing car returns using a supervisor agent for complete orchestration.
 * The supervisor coordinates both feedback analysis and action agents.
 */
public interface CarProcessingWorkflow extends MonitoredAgent {

    /**
     * Processes a car return by first analyzing feedback, then using supervisor to coordinate actions.
     * CarImageAnalysisAgent analyzes the car image first.
     * FeedbackAnalysisWorkflow analyzes feedback in parallel and returns FeedbackAnalysisResults via its @Output method.
     * FleetSupervisorAgent uses these results to coordinate action agents.
     * CarConditionFeedbackAgent determines the final car assignment and condition.
     */
    @SequenceAgent(outputKey = "carProcessingAgentResult",
            subAgents = { CarImageAnalysisAgent.class, FeedbackAnalysisWorkflow.class, FleetSupervisorAgent.class, CarConditionFeedbackAgent.class })
    CarConditions processCarReturn(
            List<FeedbackTask> tasks,
            CarInfo carInfo,
            Integer carNumber,
            String feedback,
            ImageContent carImage);

    @Output
    static CarConditions output(CarConditions carConditions) {
        // CarConditionFeedbackAgent now handles all the logic for determining
        // the final car assignment, disposition status, and condition description.
        // We simply pass through its result.

        Log.debug("DEBUG CarConditions output method:");
        Log.debug("  generalCondition: " + carConditions.generalCondition());
        Log.debug("  carAssignment: " + carConditions.carAssignment());
        Log.debug("  dispositionStatus: " + carConditions.dispositionStatus());
        Log.debug("  dispositionReason: " + carConditions.dispositionReason());

        return carConditions;
    }
}

Key Changes:

  • CarImageAnalysisAgent.class is added as the first sub-agent in the @SequenceAgent
  • The sequence is now: CarImageAnalysisAgent → FeedbackAnalysisWorkflow → FleetSupervisorAgent → CarConditionFeedbackAgent
  • ImageContent carImage is added as a new parameter to processCarReturn

The flow is:

  1. CarImageAnalysisAgent analyzes the image and enriches the feedback value in the scope
  2. FeedbackAnalysisWorkflow receives the enriched feedback and runs parallel analysis
  3. The rest of the workflow proceeds as before

Try It Out

Start the Application

  1. Navigate to the step-06 directory:
cd section-2/step-06
  2. Start the application:
./mvnw quarkus:dev    # Linux/macOS (use mvnw quarkus:dev on Windows)
  3. Open http://localhost:8080

Test Without an Image

Find the Honda Civic (status: Rented) in the Fleet Status grid and enter feedback without uploading an image:

The car has a small dent on the rear bumper

Click Return.

Expected Result:

  • The CarImageAnalysisAgent receives the feedback with an empty image
  • Since there’s no meaningful image, it returns the feedback unchanged
  • The rest of the workflow processes the original feedback as before

Test With an Image

  1. Find or take a photo of a car (there is a sample image named q4-tree.png in the resources folder, but any car photo will work)
  2. In the Fleet Status grid, find the car and click “Choose File” in its Action column
  3. Select the image
  4. Enter some feedback:
Customer mentioned a minor scratch
  5. Click Return

Expected Result:

  • The CarImageAnalysisAgent analyzes the image alongside the feedback
  • It enriches the feedback with visual observations, e.g.: “Customer mentioned a minor scratch. Visual analysis: The image shows a visible scratch on the front left fender, approximately 15cm long. The paint is chipped in the affected area. Additionally, the front bumper shows minor scuff marks on the lower right corner.”
  • The enriched feedback flows into FeedbackAnalysisWorkflow, which may now detect cleaning, maintenance, or disposition needs that the original text alone wouldn’t have triggered

Check the Agent Report

Click Generate Report to see the execution trace. You’ll see the CarImageAnalysisAgent as the first step in the sequence, with its input (original feedback) and output (enriched feedback).


How It All Works Together

sequenceDiagram
    participant User
    participant UI as Web UI
    participant REST as CarManagementResource
    participant Service as CarManagementService
    participant Workflow as CarProcessingWorkflow
    participant ImageAgent as CarImageAnalysisAgent
    participant FeedbackWF as FeedbackAnalysisWorkflow

    User->>UI: Enter feedback + upload image
    UI->>REST: POST multipart (feedback + image)
    REST->>REST: toImageContent(fileUpload)
    REST->>Service: processCarReturn(..., imageContent)
    Service->>Workflow: processCarReturn(..., carImage)

    rect rgb(232, 180, 248)
    Note over Workflow,ImageAgent: Image Analysis (Step 1)
    Workflow->>ImageAgent: analyzeCarImage(feedback, carImage)
    ImageAgent->>ImageAgent: LLM analyzes text + image
    ImageAgent->>Workflow: enriched feedback
    end

    rect rgb(255, 243, 205)
    Note over Workflow,FeedbackWF: Parallel Analysis (Step 2)
    Workflow->>FeedbackWF: Uses enriched feedback
    par Concurrent Execution
        FeedbackWF->>FeedbackWF: FeedbackAnalysisAgent<br/>with FeedbackTask.cleaning()
    and
        FeedbackWF->>FeedbackWF: FeedbackAnalysisAgent<br/>with FeedbackTask.maintenance()
    and
        FeedbackWF->>FeedbackWF: FeedbackAnalysisAgent<br/>with FeedbackTask.disposition()
    end
    end

    Note over Workflow: Steps 3-4: Supervisor + Condition (unchanged)

Key Takeaways

  • Multimodal agents can process both text and images in a single interaction
  • ImageContent is LangChain4j’s way to represent images for LLM consumption
  • @UserMessage on ImageContent parameters automatically includes the image in the message to the LLM
  • The enrichment pattern (outputKey matching an existing scope variable) allows new agents to augment data without changing downstream code
  • Optional agent: The agent can be skipped if no image is provided
  • Multipart form data with @RestForm FileUpload makes image upload straightforward in Quarkus
  • Base64 encoding is used to convert uploaded files into ImageContent

Experiment Further

1. Try Different Image Types

Upload various car images to see how the agent describes different conditions:

  • A clean, well-maintained car
  • A car with visible damage (dents, scratches)
  • A dirty car (mud, stains)
  • An interior shot showing wear

2. Compare With and Without Images

Return the same car with identical text feedback but with and without an image. Compare how the downstream agents (cleaning, maintenance, disposition) react differently based on the enriched feedback.

3. Adjust the System Message

Modify the CarImageAnalysisAgent’s system message to focus on specific aspects:

  • Only report safety-critical damage
  • Include estimated repair costs
  • Rate the car’s cleanliness on a scale of 1-10

Troubleshooting

Image not being processed

Verify that:

  • The file input has accept="image/*" to filter non-image files
  • The JavaScript correctly appends the file to FormData
  • The toImageContent method is reading the file and encoding it as base64
  • The server logs show no IOException messages from toImageContent

Agent returns feedback unchanged even with an image

This can happen if:

  • The image is too small or blank (the LLM sees nothing to analyze)
  • The MIME type is incorrect — verify fileUpload.contentType() returns a valid image type
  • The LLM model doesn’t support vision — ensure your configured model supports multimodal input

Request too large

Large images (>10MB) may exceed request size limits. Consider:

  • Adding accept="image/*" to the file input (already done)
  • Configuring quarkus.http.limits.max-body-size in application.properties if needed
  • Compressing images client-side before upload
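
For example, assuming the default 10M request limit is what you are hitting, the limit can be raised in application.properties (property name per the Quarkus HTTP configuration reference):

```properties
# application.properties
# Raise the maximum HTTP request body size (default is 10240K, i.e. 10M)
quarkus.http.limits.max-body-size=20M
```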

What’s Next?

You’ve successfully added multimodal image analysis to the car management system!

The system now:

  • Accepts optional car images during rental returns
  • Analyzes images using a multimodal LLM agent
  • Enriches rental feedback with visual observations
  • Seamlessly integrates with the existing workflow — no downstream changes needed

Key Progression:

  • Step 4: Sophisticated local orchestration with Supervisor Pattern
  • Step 5: Human-in-the-Loop for safe, controlled autonomous decisions
  • Step 6: Multimodal image analysis for enriched feedback

In Step 07, you’ll learn about Agent-to-Agent (A2A) communication — converting the local PricingAgent into a remote service that runs in a separate system, demonstrating how to distribute agent workloads across multiple applications!

Continue to Step 07 - Using Remote Agents (A2A)