Quarkus Insights #247: Agentic Orchestration with LangChain4j and Kubernetes
This summary was generated using AI and reviewed by humans; watch the video for the full story.
In episode 247 of Quarkus Insights, Jonathan Johnson, an independent software architect and adjunct professor at Trinity College (Connecticut), shared his innovative work on distributed agentic AI systems running on Kubernetes with Quarkus and LangChain4j.
The Journey: From Spring Boot to Quarkus
Jonathan’s journey began with a home lab project using Spring Boot and Spring AI. After attending a Connecticut Java Users Group talk by Eric Deandrea (from the Quarkus team), he was inspired to explore LangChain4j with Quarkus. Using Claude AI for code conversion, he successfully migrated his Spring Boot application to Quarkus with LangChain4j in approximately one hour.
Key migration benefits observed:
- Faster startup times
- Reduced memory footprint
- Lower CPU usage
- Native compilation support via GraalVM
The Home Lab Setup
Jonathan built an impressive home lab infrastructure to support his distributed AI experiments:
- Three custom-built rigs running Proxmox as the hypervisor
- GPU resources: NVIDIA RTX 4090 (24 GB) and RTX 5090 (32 GB) for local model inference
- Total memory: 480 GB RAM across the cluster
- Storage: 70 TB NAS running TrueNAS
- Kubernetes: vanilla Kubernetes deployed via kubeadm
- Infrastructure as Code: OpenTofu scripts for reproducible deployments
The entire setup runs Ollama on Kubernetes with GPU passthrough, enabling local LLM inference without cloud dependencies.
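As an illustration only (not Jonathan's actual manifests), a minimal Kubernetes Deployment that schedules Ollama onto a GPU node might look like the following, assuming the NVIDIA device plugin is installed on the cluster:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama/ollama:latest
          ports:
            - containerPort: 11434   # Ollama's default API port
          resources:
            limits:
              nvidia.com/gpu: 1      # exposed by the NVIDIA device plugin
```

Requesting `nvidia.com/gpu` is the standard way to pin a pod to a GPU-equipped node, which is what GPU passthrough to the Proxmox VMs makes possible here.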
Distributed Agentic Architecture: A Different Approach
Rather than using traditional LangChain4j annotations for sequential or parallel agent pipelines within a single application, Jonathan implemented a distributed round-table architecture using NATS messaging:
Traditional Approach (Monolithic)
```java
// All agents in one compiled application
@SequenceAgent
@ParallelAgent
@SupervisorAgent
```
Distributed Approach (Kubemoot)
- Agents as Kubernetes pods: each agent runs as an independent container
- NATS messaging: agents communicate via distributed messaging rather than in-process calls
- Democratic coordination: agents self-select to contribute based on their expertise
- Round-table pattern: similar to a Reddit thread where experts chime in
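The self-selection idea behind the round table can be sketched in plain Java. This is a toy illustration, not Kubemoot's code: the names (`RoundTable`, `volunteers`) are hypothetical, and the NATS broadcast is simulated by a direct method call; each specialist checks the coordinator's question against its advertised expertise and either volunteers or bows out.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Toy sketch of round-table self-selection (hypothetical names; real
// agents would receive the question as a NATS message, not a call).
public class RoundTable {

    // Each specialist advertises a set of domain keywords.
    static final Map<String, Set<String>> EXPERTISE = Map.of(
            "k8s-agent", Set.of("kubernetes", "pod", "deployment"),
            "proxmox-agent", Set.of("proxmox", "vm", "hypervisor"),
            "sql-agent", Set.of("sql", "query", "table"));

    // The coordinator broadcasts the question; each agent self-selects
    // (or "bows out") based on a simple keyword relevance check.
    static List<String> volunteers(String question) {
        String q = question.toLowerCase();
        return EXPERTISE.entrySet().stream()
                .filter(e -> e.getValue().stream().anyMatch(q::contains))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(volunteers("How do I scale a Kubernetes deployment?"));
    }
}
```

A real system would replace the keyword check with an LLM-based relevance judgment, but the shape of the protocol (broadcast, self-select, respond) is the same.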
Architecture Components
- Coordinator Agent: interprets incoming questions and broadcasts them to the agent pool
- Specialist Agents: each has domain expertise (Kubernetes, Proxmox, SQL, etc.)
- MCP Tools: agents have access to Model Context Protocol tools
- Prompts as ConfigMaps: stored as Kubernetes YAML manifests, not compiled code
- Kubemoot Operator: a custom Kubernetes operator managing the agentic system
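Storing prompts as ConfigMaps means a prompt can be edited and rolled out without recompiling any agent. A hypothetical example (the names and prompt text are illustrative, not from the episode):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: k8s-agent-prompt      # hypothetical name
data:
  system-prompt: |
    You are the Kubernetes specialist in a round table of agents.
    Answer only questions within your Kubernetes expertise;
    otherwise, bow out of the discussion.
```

The agent pod would mount this ConfigMap (as a file or environment variable) and load the prompt at startup, keeping prompt iteration on the GitOps path rather than in the build pipeline.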
Key Technical Insights
Prompt Engineering with ADL
Jonathan adopted the Architecture Definition Language (ADL) from Mark Richards and Neal Ford for more structured prompting:
- Moved from informal English prose to semi-formal ADL specifications
- Observed significant improvements in response quality and speed
- Better agent comprehension compared to unstructured prompts
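The episode does not show the exact ADL notation, so as a loose, hypothetical illustration of the shift from prose to a semi-formal specification (not actual ADL syntax), an informal prompt like "You answer Kubernetes questions and should keep answers short" might become something closer to:

```
agent: k8s-specialist
role: answer Kubernetes operations questions
constraints:
  - be concise
  - bow out when the question is outside Kubernetes
output: plain text, max 200 words
```

The claimed benefit is that models parse an explicit structure of roles, constraints, and outputs more reliably than free-form prose.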
Performance Considerations
Challenges with distributed agents:
- The initial democratic approach was slow (15+ minutes for simple queries)
- Each agent loading context independently created overhead
- Trade-off between flexibility and performance
Optimizations:
- Selective agent participation based on question relevance
- Agents can "bow out" if a query falls outside their expertise
- Running entirely on local GPUs (no cloud API costs)
Development Workflow
Jonathan emphasized modern development practices:
- Vibe coding with Claude: rapid prototyping and code generation
- Kanban boards in Obsidian: task tracking for AI-assisted development
- GitOps: all infrastructure changes via OpenTofu and Git commits
- CI/CD: GitHub Actions + Flux for automated deployments
- No imperative kubectl commands: everything through infrastructure as code
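In a Flux-based GitOps setup like this, the cluster reconciles itself from Git rather than from `kubectl` commands. A minimal sketch of a Flux Kustomization (the names `kubemoot` and `homelab` are hypothetical):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: kubemoot              # hypothetical
  namespace: flux-system
spec:
  interval: 5m                # how often to reconcile from Git
  sourceRef:
    kind: GitRepository
    name: homelab             # hypothetical repo name
  path: ./clusters/homelab
  prune: true                 # delete resources removed from Git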
Lessons Learned
AI-Assisted Development
- Voice-to-text tools (like Hex) reduce typing fatigue during long AI sessions
- Maintain discipline: avoid letting AI make imperative cluster changes
- Use Kanban boards to track AI-generated tasks and prevent context loss
- Set clear boundaries in prompts (e.g., "always use GitOps")
Future Directions
Jonathan plans to:
- Open source the Kubemoot project (summer 2026)
- Add hybrid support for both local and cloud-based LLMs
- Explore integration with LangChain4j's planner interface
- Present workshops at the dev2next conference (October 2026)
Resources
- O'Reilly Training: Jonathan offers live training on Kubernetes and agentic systems
- Connecticut Java Users Group: regular meetups on Java and cloud-native topics
- Upcoming talks: dev2next conference, Longmont, Colorado (October 2026)
Takeaways for Developers
- Quarkus + LangChain4j provides excellent performance for AI applications, with fast startup and low resource usage
- Distributed agentic patterns offer flexibility but require careful design for performance
- Local LLM inference is viable for privacy-sensitive applications with the proper hardware
- Infrastructure as Code is essential, especially when working with AI-assisted development
- Prompt engineering benefits from structured approaches like ADL
This episode demonstrates that cloud-native patterns and distributed systems thinking can be successfully applied to agentic AI architectures, opening new possibilities beyond traditional monolithic agent pipelines.