Architecture
AgentCrew is composed of several interconnected systems that work together to orchestrate multi-agent AI teams. This page describes each component, how they communicate, and how the runtime environments are structured.
System Overview
The high-level architecture follows a message-driven design where the frontend, API, and agent containers communicate through NATS:
┌─────────────────────────────────────────────────────────────────┐
│ Host Machine │
│ │
│ ┌──────────────┐ ┌──────────────────┐ ┌───────────────┐ │
│ │ Frontend │ │ API Server │ │ NATS │ │
│ │ (React SPA) │───▶│ (Go / Fiber) │───▶│ (Messaging) │ │
│ │ :8080 │ │ :3000 │ │ :4222 │ │
│ └──────────────┘ └──────────────────┘ └───────┬───────┘ │
│ │ │
│ ┌────────────────────────┘ │
│ ▼ │
│ ┌───────────────────┐ │
│ │ Agent Container │ │
│ │ ┌─────────────┐ │ │
│ │ │ Sidecar │ │ │
│ │ │ (NATS ↔ │ │ │
│ │ │ stdin/out) │ │ │
│ │ └──────┬──────┘ │ │
│ │ │ │ │
│ │ ┌──────▼──────┐ │ │
│ │ │ AI Provider │ │ │
│ │ │ CLI │ │ │
│ │ └─────────────┘ │ │
│ │ │ │
│ │ /workspace ──────┼─── Host directory │
│ └───────────────────┘ or Docker volume │
│ │
└─────────────────────────────────────────────────────────────────┘ Components
Frontend (React SPA)
The frontend is a single-page application built with React. It provides the user interface for managing teams, configuring agents, installing skills, and chatting with teams in real time. It communicates with the API server via HTTP endpoints and WebSocket connections for live message streaming.
API Server (Go / Fiber)
The API server is built with Go using the Fiber framework. It handles:
- Authentication and authorization: Built-in auth layer with JWT tokens, user management, organizations, and role-based access control (admin/member). See Authentication.
- REST API endpoints for CRUD operations on teams, agents, and skills.
- Application settings management (API keys, configuration).
- Docker runtime management, including creating networks, volumes, and containers for each team.
- NATS message routing between the frontend and agent containers.
- Multi-tenant data isolation: all queries are scoped to the user's organization.
- SQLite database access for persistent storage.
NATS (Messaging)
NATS provides the real-time messaging layer. Each team gets its own NATS instance running in a container. Messages flow bidirectionally:
- User → Agent: Chat messages are published to NATS by the API server and received by the agent's sidecar process.
- Agent → User: Agent responses are published to NATS by the sidecar and forwarded to the frontend through the API server.
NATS authentication is handled via the NATS_AUTH_TOKEN
environment variable, ensuring only authorized components can connect.
Agent Container Internals
Each team's leader runs inside a Docker container based on a
provider-specific image (agent_crew_agent for Claude Code,
agent_crew_opencode_agent for OpenCode). The container
includes several components:
Sidecar Process
The sidecar is a lightweight process that bridges NATS messages to the AI provider's interface. It:
- Subscribes to the team's NATS subject for incoming messages.
- Forwards incoming messages to the AI provider (stdin for Claude Code, HTTP API for OpenCode).
- Reads the provider's output and publishes responses back to NATS.
This design keeps the AI provider unaware of the messaging infrastructure. The sidecar handles all network communication, abstracting the differences between providers behind a unified interface.
Skills CLI
Before the AI provider starts, the container runs the skills installation
step. It iterates over the configured skills for the team and installs
each one using the npx skills add command. Skills are
placed in .agents/skills/ with symlinks created in
.claude/skills/.
Workspace
The /workspace directory inside the container is either:
- A bind mount from a host directory (when a workspace path is configured), or
- A Docker volume (when no workspace path is specified).
This is where agents read and write project files. The
.claude/ directory within the workspace contains the leader
instructions, agent definitions, and skills.
.claude/ Directory Structure
/workspace/.claude/
CLAUDE.md # Leader agent instructions and team context
agents/
frontend-dev.md # Worker: Frontend Developer definition
backend-dev.md # Worker: Backend Developer definition
devops-engineer.md # Worker: DevOps Engineer definition
skills/
skill-name → ../../.agents/skills/skill-name # Symlinks to installed skills Leader vs Workers
Leader
The leader is the only agent that runs inside a container. It receives
messages from the user, interprets the request, and coordinates work by
delegating tasks to workers. The leader's instructions are defined in
/workspace/.claude/CLAUDE.md.
Workers
Workers do not run in separate containers. Instead, they
are defined as Markdown files in /workspace/.claude/agents/.
Claude Code reads these files and spawns workers as sub-agents within the
same process. Each worker's .md file contains its name,
role, and detailed instructions.
This design keeps resource usage efficient. Only one container runs per team, regardless of how many workers are defined.
Docker Runtime
When a team is created, the API server provisions the following Docker resources:
| Resource | Description |
|---|---|
| Network | An isolated Docker network for the team, connecting the NATS container and the agent container. |
| NATS Container | A NATS server instance dedicated to the team, configured with the shared auth token. |
| Workspace Volume | A Docker volume (or bind mount) for the /workspace directory. |
| Leader Container | The agent container running the sidecar process and the AI provider CLI. |
All resources are namespaced by team ID to avoid conflicts between multiple running teams.
Kubernetes Runtime (Coming Soon)
A Kubernetes runtime is planned for production deployments. The design follows the same logical architecture with Kubernetes-native resources:
- Namespace per team: Isolation between teams using Kubernetes namespaces.
- PersistentVolumeClaims: For workspace storage and database persistence.
- Pods: NATS and agent containers running as pods with appropriate resource limits.
- Services: Internal networking between NATS and agent pods.
The Kubernetes runtime will support horizontal scaling, better resource management, and integration with existing cluster infrastructure.
Task Processing Model
Each agent team runs a single AI agent process inside its container. The agent handles one request at a time, in the order received.
FIFO Queue
When multiple messages arrive concurrently (e.g., two scheduled tasks firing at the same time, or a chat message while a schedule is running), the sidecar queues messages and sends them one at a time. The agent processes them in FIFO order: the first message in is the first message answered.
The sidecar maintains an internal correlation queue to match each response back to the correct request. This ensures that scheduled task A receives the response meant for task A, even if task B was sent moments later.
Concurrent requests Sequential processing
┌──────────────────┐ ┌─────────────────────┐
│ Schedule A ──────┼──▶ stdin ──▶ │ Claude processes A │
│ Schedule B ──────┼──▶ (queued) │ Claude processes B │
│ Chat message ────┼──▶ (queued) │ Claude processes C │
└──────────────────┘ └─────────────────────┘
Correlation queue: [A, B, Chat]
Response A → matched to Schedule A
Response B → matched to Schedule B
Response C → matched to Chat Queue Implementation per Provider
Both providers achieve the same FIFO behavior, but the serialization mechanism differs due to how each CLI accepts input:
| Aspect | Claude Code | OpenCode |
|---|---|---|
| Interface | stdin/stdout | HTTP REST + SSE |
| SendInput behavior | Blocking — the call writes to stdin and waits for the full response on stdout before returning. | Non-blocking — the HTTP POST to /prompt_async returns immediately. |
| Serialization | Natural: because the call blocks, the next message cannot be sent until the current one finishes. | Explicit queue: the sidecar maintains a busy flag and an in-memory pending queue. When a prompt is in flight, new messages are queued. When the SSE stream emits a result event, the next queued message is drained and sent. |
In both cases, the bridge layer maintains a shared correlation queue
(scheduledRunIDs) that maps each response back to the
correct scheduled run, regardless of provider.
Why One Process per Team?
The AI provider maintains conversational context across messages. Running a single process per team means the agent retains awareness of previous interactions within the same session — a schedule can build on context from earlier messages. One process, one conversation thread.
Note: The scheduler engine itself can launch many
executions in parallel (controlled by
SCHEDULER_MAX_CONCURRENT), but within each team, messages
are processed sequentially. If two schedules target the same team, the
second waits for the first to finish.
Data Flow: Sending a Message
Here is the complete flow when a user sends a message to a team:
- The user types a message in the frontend chat interface.
- The frontend sends the message to the API server via HTTP/WebSocket.
- The API server publishes the message to the team's NATS subject.
- The sidecar process inside the agent container receives the NATS message.
- The sidecar forwards the message to the AI provider.
- The AI agent processes the request, potentially delegating to worker sub-agents.
- The agent produces response output.
- The sidecar reads the output and publishes response chunks to NATS.
- The API server receives NATS messages and forwards them to the frontend.
- The frontend renders the response in real time as chunks arrive.
Next Steps
- Skills: Understand how skills extend agent capabilities and integrate into the container.
- Configuration: Review all configuration options for customizing your deployment.
- Quick Start: Get AgentCrew running locally in under 5 minutes.