Step 06 - Multimodal Agents
New Requirement: Visual Car Inspection
In Step 5, you implemented the Human-in-the-Loop pattern for safe, controlled disposition decisions. The system relies entirely on textual feedback from employees returning cars. But what if the person returning the car could also upload a photo?
The Miles of Smiles management team wants to enhance the rental return process:
Allow employees to optionally upload an image of the car when returning it, so the system can automatically enrich the rental feedback with visual observations.
This is a common real-world scenario where:
- Text alone is insufficient: An employee might write “car looks fine” but a photo reveals scratches or dents they missed
- Multimodal AI is powerful: Modern LLMs can analyze images alongside text to provide richer assessments
You’ll learn how to integrate multimodal capabilities (text + image) into your existing agentic workflow using LangChain4j’s ImageContent.
What You’ll Learn
In this step, you will:
- Add image upload to the rental return form using multipart form data
- Convert uploaded images to LangChain4j’s `ImageContent` for multimodal processing
- Create a `CarImageAnalysisAgent` that analyzes car images and enriches rental feedback
- Integrate the new agent at the beginning of the existing `CarProcessingWorkflow` sequence
- Understand how `ImageContent` flows through agent parameters using `@UserMessage`
- Understand how optional agents can skip their work entirely when a required input is absent
- See how the agent gracefully handles the absence of an image, returning the feedback unchanged
Understanding Multimodal Agents
What is Multimodal Processing?
Multimodal processing allows an AI agent to work with multiple types of content simultaneously — in this case, text and images. Instead of just reading feedback like “the car has some damage”, the agent can also see the car and identify specific issues.
How LangChain4j Handles Images
LangChain4j provides the ImageContent class to represent image data in messages sent to the LLM:
- `ImageContent`: Wraps an image (as base64-encoded data with a MIME type) as a content part
- When passed as a method parameter annotated with `@UserMessage`, it is automatically included alongside the text in the message sent to the LLM
- The LLM receives both the text prompt and the image, enabling visual reasoning
The Enrichment Pattern
Rather than creating a separate “image analysis” output, the CarImageAnalysisAgent uses an enrichment pattern:
- Receives the original rental feedback text and an optional car image
- If an image is present, analyzes it and appends visual observations to the feedback
- If no image is present, returns the feedback unchanged
- The enriched feedback then flows into the existing `FeedbackAnalysisWorkflow` — no downstream changes needed
This is elegant because it preserves the existing workflow structure while adding new capabilities.
Why ImageContent Stays Separate:
ImageContent is passed as a separate parameter alongside the String feedback:
- `ImageContent` is a special LangChain4j type for multimodal AI, not simple text data
- It’s only used by the image analysis agent, not by other agents in the workflow
- Keeping it separate maintains the clean separation between feedback text and multimodal content
What Are We Going to Build?
We’re enhancing the car management system with multimodal image analysis:
- Update the UI: Add an image upload field for rented cars in the Fleet Status grid
- Update the REST endpoint: Accept multipart form data with an optional image
- Convert to `ImageContent`: Transform the uploaded file into a LangChain4j `ImageContent`
- Create `CarImageAnalysisAgent`: A new agent that analyzes car images
- Update the workflow: Insert the new agent at the beginning of the sequence
The Updated Architecture:
graph TB
Start([Car Return with optional image]) --> A[CarProcessingWorkflow<br/>Sequential]
A --> IMG[Step 1: CarImageAnalysisAgent<br/>Image Analysis]
IMG -->|enriched feedback| B[Step 2: FeedbackAnalysisWorkflow<br/>Parallel Mapper]
B --> B1[FeedbackTask.cleaning()]
B --> B2[FeedbackTask.maintenance()]
B --> B3[FeedbackTask.disposition()]
B1 --> BA[FeedbackAnalysisAgent]
B2 --> BA
B3 --> BA
BA --> BEnd[FeedbackAnalysisResults]
BEnd --> C[Step 3: FleetSupervisorAgent<br/>Autonomous Orchestration]
C --> CEnd[Supervisor Decision]
CEnd --> D[Step 4: CarConditionFeedbackAgent<br/>Final Summary]
D --> End([Updated Car])
style A fill:#90EE90
style IMG fill:#E8B4F8
style B fill:#87CEEB
style C fill:#FFB6C1
style D fill:#90EE90
style Start fill:#E8E8E8
style End fill:#E8E8E8
The Key Innovation:
The CarImageAnalysisAgent sits at the beginning of the sequence, before the FeedbackAnalysisWorkflow. Its output key is `feedback`, the same name as the incoming rental feedback parameter, which means it replaces the original rental feedback in the agentic scope with the enriched version. All downstream agents automatically receive the enriched feedback without any code changes.
Prerequisites
Before starting:
- Completed Step 05 — This step builds on Step 5’s architecture
- Application from Step 05 is stopped (Ctrl+C)
- Understanding of the existing `CarProcessingWorkflow` sequence
Part 1: Update the UI for Image Upload
Update the JavaScript
The action cell for all actionable cars in populateFleetStatusTable now includes a file input for optional image upload:
if (car.status === 'RENTED' || car.status === 'AT_CLEANING' || car.status === 'IN_MAINTENANCE') {
actionCell = `
<td>
<form onsubmit="processFeedback(event, ${car.id}, '${car.status}')">
<input type="file" id="car-image-${car.id}" accept="image/*">
<input type="text" class="feedback-input" id="feedback-${car.id}" placeholder="Enter feedback">
<button type="submit" class="return-button">Return</button>
</form>
</td>`;
}
The processFeedback function is updated to send a FormData object (multipart) instead of a simple query parameter, and now uses a single consolidated endpoint for all car returns:
const imageInput = document.getElementById(`car-image-${carId}`);
const formData = new FormData();
formData.append('feedback', feedback);
if (imageInput && imageInput.files.length > 0) {
formData.append('carImage', imageInput.files[0]);
}
fetch(`/car-management/return/${carId}`, {
method: 'POST',
body: formData
})
Key Points:
- Uses `FormData` for multipart encoding — all statuses use the same endpoint and format
- The image is only appended if the user selected a file
- No `Content-Type` header is set — the browser automatically adds `multipart/form-data` with the correct boundary
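The conditional append is easy to check outside the browser: Node 18+ ships the same `FormData` and `Blob` globals, so the logic from `processFeedback` can be exercised standalone (`buildReturnForm` is a hypothetical helper extracted here purely for illustration):

```javascript
// Mirror of the browser logic: the image part is only added when a file was chosen.
function buildReturnForm(feedback, file) {
  const formData = new FormData();
  formData.append('feedback', feedback);
  if (file) {
    formData.append('carImage', file, 'car.jpg');
  }
  return formData;
}

const noImage = buildReturnForm('Car looks fine.', null);
const withImage = buildReturnForm('Minor scratch.', new Blob(['fake-bytes'], { type: 'image/jpeg' }));

console.log(noImage.has('carImage'));   // false — the part is omitted entirely
console.log(withImage.has('carImage')); // true
```

Omitting the part entirely (rather than sending an empty one) is what lets the server-side `FileUpload` parameter arrive as `null` when no image was selected.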
Part 2: Update the REST Endpoint
Accept Multipart Form Data
Update src/main/java/com/carmanagement/resource/CarManagementResource.java to accept the image as a FileUpload and convert it to ImageContent:
package com.carmanagement.resource;
import jakarta.inject.Inject;
import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;
import jakarta.ws.rs.core.Response;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Base64;
import org.jboss.resteasy.reactive.RestForm;
import org.jboss.resteasy.reactive.multipart.FileUpload;
import dev.langchain4j.data.message.ImageContent;
import io.quarkus.logging.Log;
import io.smallrye.common.annotation.Blocking;
import io.smallrye.mutiny.Uni;
import com.carmanagement.service.CarManagementService;
/**
* REST resource for car management operations.
* Uses blocking processing for AI agent workflows.
*/
@Path("/car-management")
public class CarManagementResource {
@Inject
CarManagementService carManagementService;
/**
* Process a car return from any status (rental, cleaning, or maintenance).
* This is a blocking operation due to AI agent processing.
*
* @param carNumber The car number
* @param feedback Optional feedback about the return
* @param carImage Optional image of the car being returned (multipart form data)
* @return Uni that completes with the result
*/
@POST
@Path("/return/{carNumber}")
@Consumes(MediaType.MULTIPART_FORM_DATA)
@Blocking
public Uni<Response> processReturn(Integer carNumber, @RestForm String feedback, @RestForm FileUpload carImage) {
ImageContent imageContent = toImageContent(carImage);
return carManagementService.processCarReturn(carNumber, feedback != null ? feedback : "", imageContent)
.onItem().transform(result -> Response.ok(result).build())
.onFailure().recoverWithItem(e -> {
Log.error(e.getMessage(), e);
return Response.status(Response.Status.INTERNAL_SERVER_ERROR)
.entity("Error processing car return: " + e.getMessage())
.build();
});
}
@GET
@Path("/report")
@Produces(MediaType.TEXT_HTML)
public Response report() {
return Response.ok(carManagementService.report()).build();
}
private ImageContent toImageContent(FileUpload fileUpload) {
if (fileUpload == null || fileUpload.filePath() == null) {
return null;
}
try {
byte[] bytes = Files.readAllBytes(fileUpload.filePath());
String base64 = Base64.getEncoder().encodeToString(bytes);
String mimeType = fileUpload.contentType();
return new ImageContent(base64, mimeType);
} catch (IOException e) {
Log.error("Failed to read uploaded car image", e);
return null;
}
}
}
Let’s break it down:
@Consumes(MediaType.MULTIPART_FORM_DATA)
The consolidated return endpoint now consumes multipart form data instead of query parameters, and routes feedback based on the car’s current status:
@POST
@Path("/return/{carNumber}")
@Consumes(MediaType.MULTIPART_FORM_DATA)
@Blocking
public Uni<Response> processReturn(Integer carNumber,
@RestForm String feedback, @RestForm FileUpload carImage) {
- `@RestForm`: Extracts form fields from the multipart request
- `FileUpload`: RESTEasy Reactive’s type for handling uploaded files
- The endpoint looks up the car’s current status and routes the feedback to the appropriate processing path
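To see what the endpoint actually consumes, it helps to look at the multipart body the browser (or `curl -F`) generates. The sketch below hand-builds one; the boundary string is arbitrary here, whereas real clients generate a fresh one per request:

```shell
# Hand-build the multipart/form-data body that the browser (or curl -F) sends.
BOUNDARY="----demo-boundary"
BODY=$(printf -- '--%s\r\nContent-Disposition: form-data; name="feedback"\r\n\r\nminor scratch on the door\r\n--%s\r\nContent-Disposition: form-data; name="carImage"; filename="car.jpg"\r\nContent-Type: image/jpeg\r\n\r\n(binary image bytes)\r\n--%s--\r\n' "$BOUNDARY" "$BOUNDARY" "$BOUNDARY")
# Each form field is its own part with a Content-Disposition header:
printf '%s\n' "$BODY" | grep -c 'Content-Disposition: form-data'   # prints 2
```

Assuming the application runs on the default Quarkus port, the equivalent manual upload would be roughly `curl -F 'feedback=minor scratch' -F 'carImage=@car.jpg' http://localhost:8080/car-management/return/1` (the car number is hypothetical).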
The toImageContent Helper
private ImageContent toImageContent(FileUpload fileUpload) {
if (fileUpload == null || fileUpload.filePath() == null) {
return null;
}
try {
byte[] bytes = Files.readAllBytes(fileUpload.filePath());
String base64 = Base64.getEncoder().encodeToString(bytes);
String mimeType = fileUpload.contentType();
return new ImageContent(base64, mimeType);
} catch (IOException e) {
Log.error("Failed to read uploaded car image", e);
return null;
}
}
- Reads the uploaded file and converts it to base64-encoded data
- Creates an `ImageContent` with the base64 data and the file’s MIME type (e.g., `image/jpeg`, `image/png`)
- Falls back to `null` when no image is provided
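The encoding step is pure JDK and can be checked in isolation. In this sketch the `fake` bytes stand in for a real upload; in the actual endpoint the bytes come from `fileUpload.filePath()`:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Sketch {
    // Same transformation toImageContent performs: raw file bytes -> base64 text.
    static String toBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // Stand-in for real image bytes read from the uploaded file.
        byte[] fake = "fake-image-bytes".getBytes(StandardCharsets.UTF_8);
        String base64 = toBase64(fake);
        // Multimodal chat APIs commonly transmit the (mimeType, base64) pair as a data URL:
        System.out.println("data:image/jpeg;base64," + base64);
    }
}
```

Note that base64 inflates the payload by roughly a third, which is one reason large uploads can hit request size limits (see Troubleshooting).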
Part 3: Pass the Image Through the Service Layer
Update src/main/java/com/carmanagement/service/CarManagementService.java
Add ImageContent as a parameter and forward it to the workflow:
package com.carmanagement.service;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.transaction.Transactional;
import com.carmanagement.agentic.workflow.CarProcessingWorkflow;
import com.carmanagement.model.CarConditions;
import com.carmanagement.model.CarInfo;
import com.carmanagement.model.CarStatus;
import com.carmanagement.model.FeedbackTask;
import dev.langchain4j.data.message.ImageContent;
import io.quarkus.logging.Log;
import io.smallrye.mutiny.Uni;
import java.util.List;
import static dev.langchain4j.agentic.observability.HtmlReportGenerator.generateReport;
/**
* Service for managing car returns from various operations.
* Uses async processing to handle Human-in-the-Loop workflow pauses.
*/
@ApplicationScoped
public class CarManagementService {
@Inject
CarProcessingWorkflow carProcessingWorkflow;
/**
* Process a car return from any operation.
* This method runs asynchronously to handle workflow pauses for human approval.
*
* @param carNumber The car number
* @param feedback Optional feedback
* @param carImage Optional image of the car
* @return Uni that completes with the result of the processing
*/
public Uni<String> processCarReturn(Integer carNumber, String feedback, ImageContent carImage) {
return Uni.createFrom().item(() -> {
CarInfo carInfo = findCarInfo(carNumber);
if (carInfo == null) {
return "Car not found with number: " + carNumber;
}
// Create the list of feedback tasks for parallel analysis
List<FeedbackTask> tasks = List.of(
FeedbackTask.cleaning(),
FeedbackTask.maintenance(),
FeedbackTask.disposition()
);
// Process the car return using the workflow with supervisor
// This may PAUSE if human approval is needed
CarConditions carConditions = carProcessingWorkflow.processCarReturn(
tasks,
carInfo,
carNumber,
feedback,
carImage);
Log.info("CarConditionFeedbackAgent updating...");
// Update the car's condition with the result from CarConditionFeedbackAgent
carInfo.condition = carConditions.generalCondition();
// Update the car status based on the required action
switch (carConditions.carAssignment()) {
case DISPOSITION:
carInfo.status = CarStatus.PENDING_DISPOSITION;
Log.info("Car marked for disposition - awaiting final decision");
break;
case MAINTENANCE:
carInfo.status = CarStatus.IN_MAINTENANCE;
break;
case CLEANING:
carInfo.status = CarStatus.AT_CLEANING;
break;
case NONE:
carInfo.status = CarStatus.AVAILABLE;
break;
}
// Persist the changes to the database in a separate transaction
updateCarInfo(carInfo);
return carConditions.generalCondition();
}).runSubscriptionOn(io.smallrye.mutiny.infrastructure.Infrastructure.getDefaultWorkerPool());
}
/**
* Find car info in a read-only transaction
*/
@Transactional(Transactional.TxType.REQUIRES_NEW)
CarInfo findCarInfo(Integer carNumber) {
return CarInfo.findById(carNumber);
}
/**
* Update car info in a separate transaction after workflow completes.
* Uses merge to handle detached entity from the workflow.
*/
@Transactional(Transactional.TxType.REQUIRES_NEW)
void updateCarInfo(CarInfo carInfo) {
// Merge the detached entity back into the persistence context
CarInfo.getEntityManager().merge(carInfo);
}
public String report() {
return generateReport(carProcessingWorkflow.agentMonitor());
}
}
The image is passed straight through to the workflow alongside the feedback string:
CarConditions carConditions = carProcessingWorkflow.processCarReturn(
tasks,
carInfo,
carNumber,
feedback,
carImage);
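The `Uni.createFrom().item(...).runSubscriptionOn(...)` chain is Mutiny-specific, but its shape — run blocking work on a worker pool and recover failures into a regular response value — has a plain-JDK analogue. A rough sketch (not the workshop code; the messages are invented):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncSketch {
    // Analogous to Uni.createFrom().item(...).runSubscriptionOn(workerPool)
    // combined with onFailure().recoverWithItem(...): blocking work runs off
    // the caller thread, and failures become an ordinary result value.
    static String processReturn(int carNumber) throws Exception {
        ExecutorService workerPool = Executors.newFixedThreadPool(2);
        try {
            return CompletableFuture
                    .supplyAsync(() -> {
                        if (carNumber < 0) { // stands in for "car not found"
                            throw new IllegalArgumentException("Car not found: " + carNumber);
                        }
                        return "Processed car " + carNumber; // stands in for the agent workflow
                    }, workerPool)
                    .exceptionally(e -> "Error processing car return: " + e.getCause().getMessage())
                    .get();
        } finally {
            workerPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processReturn(1));
        System.out.println(processReturn(-1));
    }
}
```

In the real service the offloading matters because agent workflows are slow, LLM-bound operations that must not block the HTTP event loop.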
Part 4: Create the CarImageAnalysisAgent
This is the core of this step — a new agent that processes car images.
Create src/main/java/com/carmanagement/agentic/agents/CarImageAnalysisAgent.java:
package com.carmanagement.agentic.agents;
import dev.langchain4j.agentic.Agent;
import dev.langchain4j.data.message.ImageContent;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.V;
/**
* Agent that analyzes a car image and enriches the rental feedback with visual observations.
* If no image is provided, the rental feedback is returned unchanged.
*/
public interface CarImageAnalysisAgent {
@SystemMessage("""
You are a car image analyst for a car rental company.
You will receive the current rental feedback for a car being returned.
If an image of the car is provided, analyze it and rewrite the rental feedback, taking account of
your visual observations about the car's condition (e.g., visible damage, scratches, dents,
cleanliness issues, tire condition, etc.).
Avoid appending your visual observations as a separate section of the response; instead, combine
the existing rental feedback, if present, with what you can see in the image into a single response.
If no image is provided, or the image is empty or doesn't seem related to a car,
simply return the rental feedback exactly as it is, without any modification.
Your response must always include the original rental feedback text, followed by your observations, if any.
In any case, the response MUST be a single sentence.
""")
@UserMessage("""
Feedback: {feedback}
""")
@Agent(description = "Car image analyzer. Enriches rental feedback with visual observations from a car image.",
outputKey = "feedback", optional = true)
String analyzeCarImage(String feedback, @UserMessage @V("carImage") ImageContent carImage);
}
Let’s break it down:
The @SystemMessage
@SystemMessage("""
You are a car image analyst for a car rental company.
You will receive the current rental feedback for a car being returned.
If an image of the car is provided, analyze it and rewrite the rental feedback, taking account of
your visual observations about the car's condition (e.g., visible damage, scratches, dents,
cleanliness issues, tire condition, etc.).
If no image is provided, return the rental feedback exactly as it is, without any modification.
Your response must always include the original rental feedback text, followed by your observations, if any.
""")
The system message instructs the LLM to:
- Analyze the image if one is provided, looking for visible damage, cleanliness issues, etc.
- Preserve the original feedback — always include it in the response
- Be a no-op when there’s no image — return the feedback unchanged
The @UserMessage and ImageContent Parameter
@UserMessage("""
Feedback: {feedback}
""")
String analyzeCarImage(String feedback, @UserMessage @V("carImage") ImageContent carImage);
Note that the `@UserMessage` annotation on the `ImageContent` parameter tells LangChain4j to include the image as an additional content part in the user message sent to the LLM. This is a special use of the `@UserMessage` annotation, specific to multimodal content: the LLM receives both the text template and the image simultaneously, enabling multimodal reasoning. In this case we also need the `@V` annotation to specify the variable name associated with the parameter.
The outputKey and the optional flag
@Agent(description = "Car image analyzer. Enriches rental feedback with visual observations from a car image.",
outputKey = "feedback", optional = true)
The agent’s output key is `feedback`, which means its result replaces the `feedback` value in the agentic scope. All subsequent agents in the workflow (`FeedbackAnalysisWorkflow`, `FleetSupervisorAgent`, etc.) automatically receive the enriched feedback. The `optional` flag is set to `true`, which lets the framework skip an agent's invocation entirely when not all of its required parameters are provided; in this case the agent is skipped when the image is missing.
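The scope mechanics can be pictured as a shared map passed between agents: each agent reads its inputs by name, writes its result back under its output key (overwriting any existing value), and an optional agent is simply skipped when a required input is absent. A simplified illustration of that idea — not the actual LangChain4j internals:

```java
import java.util.HashMap;
import java.util.Map;

public class ScopeSketch {
    // Simplified stand-in for the agentic scope and the image agent's contract.
    static void runImageAgent(Map<String, Object> scope) {
        if (scope.get("carImage") == null) {
            return; // optional agent: skipped when a required input is missing
        }
        String feedback = (String) scope.get("feedback");
        // The output key matches an existing scope variable, so the enriched
        // value REPLACES the original and downstream agents see it transparently.
        scope.put("feedback", feedback + " Visual analysis: scratch on the front fender.");
    }

    public static void main(String[] args) {
        Map<String, Object> scope = new HashMap<>();
        scope.put("feedback", "Car looks fine.");

        runImageAgent(scope); // no image in scope -> skipped
        System.out.println(scope.get("feedback")); // unchanged

        scope.put("carImage", new byte[] {1, 2, 3});
        runImageAgent(scope); // image present -> feedback enriched in place
        System.out.println(scope.get("feedback"));
    }
}
```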
Part 5: Update the Workflow
Add the Agent to the Sequence
Update CarProcessingWorkflow.java to include CarImageAnalysisAgent as the first sub-agent and add the ImageContent parameter:
package com.carmanagement.agentic.workflow;
import com.carmanagement.agentic.agents.CarConditionFeedbackAgent;
import com.carmanagement.agentic.agents.CarImageAnalysisAgent;
import com.carmanagement.agentic.agents.FleetSupervisorAgent;
import com.carmanagement.model.CarConditions;
import com.carmanagement.model.CarInfo;
import com.carmanagement.model.FeedbackTask;
import dev.langchain4j.agentic.declarative.Output;
import dev.langchain4j.agentic.declarative.SequenceAgent;
import dev.langchain4j.agentic.observability.MonitoredAgent;
import dev.langchain4j.data.message.ImageContent;
import io.quarkus.logging.Log;
import java.util.List;
/**
* Workflow for processing car returns using a supervisor agent for complete orchestration.
* The supervisor coordinates both feedback analysis and action agents.
*/
public interface CarProcessingWorkflow extends MonitoredAgent {
/**
* Processes a car return by first analyzing feedback, then using supervisor to coordinate actions.
* CarImageAnalysisAgent analyzes the car image first.
* FeedbackAnalysisWorkflow analyzes feedback in parallel and returns FeedbackAnalysisResults via its @Output method.
* FleetSupervisorAgent uses these results to coordinate action agents.
* CarConditionFeedbackAgent determines the final car assignment and condition.
*/
@SequenceAgent(outputKey = "carProcessingAgentResult",
subAgents = { CarImageAnalysisAgent.class, FeedbackAnalysisWorkflow.class, FleetSupervisorAgent.class, CarConditionFeedbackAgent.class })
CarConditions processCarReturn(
List<FeedbackTask> tasks,
CarInfo carInfo,
Integer carNumber,
String feedback,
ImageContent carImage);
@Output
static CarConditions output(CarConditions carConditions) {
// CarConditionFeedbackAgent now handles all the logic for determining
// the final car assignment, disposition status, and condition description.
// We simply pass through its result.
Log.debug("DEBUG CarConditions output method:");
Log.debug(" generalCondition: " + carConditions.generalCondition());
Log.debug(" carAssignment: " + carConditions.carAssignment());
Log.debug(" dispositionStatus: " + carConditions.dispositionStatus());
Log.debug(" dispositionReason: " + carConditions.dispositionReason());
return carConditions;
}
}
Key Changes:
- `CarImageAnalysisAgent.class` is added as the first sub-agent in the `@SequenceAgent`
- The sequence is now: `CarImageAnalysisAgent` → `FeedbackAnalysisWorkflow` → `FleetSupervisorAgent` → `CarConditionFeedbackAgent`
- `ImageContent carImage` is added as a new parameter to `processCarReturn`
The flow is:
- `CarImageAnalysisAgent` analyzes the image and enriches `feedback` in the scope
- `FeedbackAnalysisWorkflow` receives the enriched `feedback` and runs parallel analysis
- The rest of the workflow proceeds as before
Try It Out
Start the Application
- Navigate to the step-06 directory:
- Start the application:
Test Without an Image
Find the Honda Civic (status: Rented) in the Fleet Status grid and enter feedback without uploading an image:
Click Return.
Expected Result:
- The `CarImageAnalysisAgent` receives the feedback with an empty image
- Since there’s no meaningful image, it returns the feedback unchanged
- The rest of the workflow processes the original feedback as before
Test With an Image
- Find or take a photo of a car (there is a sample image named `q4-tree.png` in the `resources` folder, but any car photo will work)
- In the Fleet Status grid, find the car and click “Choose File” in its Action column
- Select the image
- Enter some feedback:
- Click Return
Expected Result:
- The `CarImageAnalysisAgent` analyzes the image alongside the feedback
- It enriches the feedback with visual observations, e.g.: “Customer mentioned a minor scratch. Visual analysis: The image shows a visible scratch on the front left fender, approximately 15cm long. The paint is chipped in the affected area. Additionally, the front bumper shows minor scuff marks on the lower right corner.”
- The enriched feedback flows into `FeedbackAnalysisWorkflow`, which may now detect cleaning, maintenance, or disposition needs that the original text alone wouldn’t have triggered
Check the Agent Report
Click Generate Report to see the execution trace. You’ll see the CarImageAnalysisAgent as the first step in the sequence, with its input (original feedback) and output (enriched feedback).
How It All Works Together
sequenceDiagram
participant User
participant UI as Web UI
participant REST as CarManagementResource
participant Service as CarManagementService
participant Workflow as CarProcessingWorkflow
participant ImageAgent as CarImageAnalysisAgent
participant FeedbackWF as FeedbackAnalysisWorkflow
User->>UI: Enter feedback + upload image
UI->>REST: POST multipart (feedback + image)
REST->>REST: toImageContent(fileUpload)
REST->>Service: processCarReturn(..., imageContent)
Service->>Workflow: processCarReturn(..., carImage)
rect rgb(232, 180, 248)
Note over Workflow,ImageAgent: Image Analysis (Step 1)
Workflow->>ImageAgent: analyzeCarImage(feedback, carImage)
ImageAgent->>ImageAgent: LLM analyzes text + image
ImageAgent->>Workflow: enriched feedback
end
rect rgb(255, 243, 205)
Note over Workflow,FeedbackWF: Parallel Analysis (Step 2)
Workflow->>FeedbackWF: Uses enriched feedback
par Concurrent Execution
FeedbackWF->>FeedbackWF: FeedbackAnalysisAgent<br/>with FeedbackTask.cleaning()
and
FeedbackWF->>FeedbackWF: FeedbackAnalysisAgent<br/>with FeedbackTask.maintenance()
and
FeedbackWF->>FeedbackWF: FeedbackAnalysisAgent<br/>with FeedbackTask.disposition()
end
end
Note over Workflow: Steps 3-4: Supervisor + Condition (unchanged)
Key Takeaways
- Multimodal agents can process both text and images in a single interaction
- `ImageContent` is LangChain4j’s way to represent images for LLM consumption
- `@UserMessage` on `ImageContent` parameters automatically includes the image in the message to the LLM
- The enrichment pattern (an `outputKey` matching an existing scope variable) allows new agents to augment data without changing downstream code
- Optional agent: the agent can be skipped if no image is provided
- Multipart form data with `@RestForm FileUpload` makes image upload straightforward in Quarkus
- Base64 encoding is used to convert uploaded files into `ImageContent`
Experiment Further
1. Try Different Image Types
Upload various car images to see how the agent describes different conditions:
- A clean, well-maintained car
- A car with visible damage (dents, scratches)
- A dirty car (mud, stains)
- An interior shot showing wear
2. Compare With and Without Images
Return the same car with identical text feedback but with and without an image. Compare how the downstream agents (cleaning, maintenance, disposition) react differently based on the enriched feedback.
3. Adjust the System Message
Modify the CarImageAnalysisAgent’s system message to focus on specific aspects:
- Only report safety-critical damage
- Include estimated repair costs
- Rate the car’s cleanliness on a scale of 1-10
Troubleshooting
Image not being processed
Verify that:
- The file input has `accept="image/*"` to filter non-image files
- The JavaScript correctly appends the file to `FormData`
- The `toImageContent` method is reading the file and encoding it as base64
- The server logs don’t show any `IOException` messages
Agent returns feedback unchanged even with an image
This can happen if:
- The image is too small or blank (the LLM sees nothing to analyze)
- The MIME type is incorrect — verify `fileUpload.contentType()` returns a valid image type
- The model doesn’t support vision — ensure your configured LLM supports multimodal input
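A cheap way to surface the first two causes early is to validate the upload before building the `ImageContent`. The helper and the supported-type set below are hypothetical — check which formats your model actually accepts:

```java
import java.util.Set;

public class UploadGuard {
    // Hypothetical allow-list; vision model support varies by provider.
    private static final Set<String> SUPPORTED =
            Set.of("image/jpeg", "image/png", "image/webp", "image/gif");

    // Reject empty uploads and non-image (or unsupported) MIME types up front,
    // instead of letting the LLM silently ignore a bad image.
    static boolean isUsableImage(long sizeBytes, String mimeType) {
        return sizeBytes > 0 && mimeType != null && SUPPORTED.contains(mimeType);
    }

    public static void main(String[] args) {
        System.out.println(isUsableImage(1024, "image/jpeg")); // true
        System.out.println(isUsableImage(0, "image/jpeg"));    // false: empty file
        System.out.println(isUsableImage(1024, "text/plain")); // false: not an image
    }
}
```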
Request too large
Large images (>10MB) may exceed request size limits. Consider:
- Adding `accept="image/*"` to the file input (already done) to filter out non-image files
- Configuring `quarkus.http.limits.max-body-size` in `application.properties` if needed
- Compressing images client-side before upload
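Client-side compression usually means drawing the photo onto a `<canvas>` and re-exporting it. If you would rather shrink oversized uploads on the server instead, the JDK alone is enough — a sketch (not part of the workshop code) that caps the longest edge before encoding:

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

public class DownscaleSketch {
    // Downscale an image so its longest edge is at most maxEdge pixels,
    // preserving the aspect ratio. Small images pass through untouched.
    static BufferedImage downscale(BufferedImage src, int maxEdge) {
        int w = src.getWidth(), h = src.getHeight();
        if (Math.max(w, h) <= maxEdge) {
            return src; // already small enough
        }
        double scale = (double) maxEdge / Math.max(w, h);
        int nw = (int) Math.round(w * scale);
        int nh = (int) Math.round(h * scale);
        BufferedImage dst = new BufferedImage(nw, nh, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, nw, nh, null);
        g.dispose();
        return dst;
    }

    public static void main(String[] args) {
        BufferedImage big = new BufferedImage(4000, 3000, BufferedImage.TYPE_INT_RGB);
        BufferedImage small = downscale(big, 1024);
        System.out.println(small.getWidth() + "x" + small.getHeight()); // 1024x768
    }
}
```

The resized `BufferedImage` could then be written back to JPEG bytes (e.g., via `ImageIO.write`) before the base64 step; smaller images also cost fewer vision tokens on most providers.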
What’s Next?
You’ve successfully added multimodal image analysis to the car management system!
The system now:
- Accepts optional car images during rental returns
- Analyzes images using a multimodal LLM agent
- Enriches rental feedback with visual observations
- Seamlessly integrates with the existing workflow — no downstream changes needed
Key Progression:
- Step 4: Sophisticated local orchestration with the Supervisor Pattern
- Step 5: Human-in-the-Loop for safe, controlled autonomous decisions
- Step 6: Multimodal image analysis for enriched feedback
In Step 07, you’ll learn about Agent-to-Agent (A2A) communication — converting the local PricingAgent into a remote service that runs in a separate system, demonstrating how to distribute agent workloads across multiple applications!