Architecture

AgentCrew is composed of several interconnected systems that work together to orchestrate multi-agent AI teams. This page describes each component, how they communicate, and how the runtime environments are structured.

System Overview

The high-level architecture follows a message-driven design where the frontend, API, and agent containers communicate through NATS:

┌─────────────────────────────────────────────────────────────────┐
│                        Host Machine                             │
│                                                                 │
│  ┌──────────────┐    ┌──────────────────┐    ┌───────────────┐  │
│  │   Frontend    │    │    API Server     │    │     NATS      │  │
│  │  (React SPA)  │───▶│   (Go / Fiber)   │───▶│   (Messaging) │  │
│  │  :8080        │    │   :3000           │    │   :4222       │  │
│  └──────────────┘    └──────────────────┘    └───────┬───────┘  │
│                                                       │         │
│                              ┌────────────────────────┘         │
│                              ▼                                  │
│                    ┌───────────────────┐                        │
│                    │  Agent Container   │                        │
│                    │  ┌─────────────┐  │                        │
│                    │  │   Sidecar    │  │                        │
│                    │  │  (NATS ↔    │  │                        │
│                    │  │  stdin/out)  │  │                        │
│                    │  └──────┬──────┘  │                        │
│                    │         │         │                        │
│                    │  ┌──────▼──────┐  │                        │
│                    │  │ AI Provider  │  │                        │
│                    │  │    CLI       │  │                        │
│                    │  └─────────────┘  │                        │
│                    │                   │                        │
│                    │  /workspace ──────┼─── Host directory      │
│                    └───────────────────┘    or Docker volume    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Components

Frontend (React SPA)

The frontend is a single-page application built with React. It provides the user interface for managing teams, configuring agents, installing skills, and chatting with teams in real time. It communicates with the API server via HTTP endpoints and WebSocket connections for live message streaming.

API Server (Go / Fiber)

The API server is built with Go using the Fiber framework. It handles:

Authentication and authorization: Built-in auth layer with JWT tokens, user management, organizations, and role-based access control (admin/member). See Authentication.
REST API endpoints for CRUD operations on teams, agents, and skills.
Application settings management (API keys, configuration).
Docker runtime management, including creating networks, volumes, and containers for each team.
NATS message routing between the frontend and agent containers.
Multi-tenant data isolation: all queries are scoped to the user's organization.
SQLite database access for persistent storage.

NATS (Messaging)

NATS provides the real-time messaging layer. Each team gets its own NATS instance running in a container. Messages flow bidirectionally:

User → Agent: Chat messages are published to NATS by the API server and received by the agent's sidecar process.
Agent → User: Agent responses are published to NATS by the sidecar and forwarded to the frontend through the API server.

NATS authentication is handled via the NATS_AUTH_TOKEN environment variable, ensuring only authorized components can connect.

Agent Container Internals

Each team's leader runs inside a Docker container based on a provider-specific image (agent_crew_agent for Claude Code, agent_crew_opencode_agent for OpenCode). The container includes several components:

Sidecar Process

The sidecar is a lightweight process that bridges NATS messages to the AI provider's interface. It:

Subscribes to the team's NATS subject for incoming messages.
Forwards incoming messages to the AI provider (stdin for Claude Code, HTTP API for OpenCode).
Reads the provider's output and publishes responses back to NATS.

This design keeps the AI provider unaware of the messaging infrastructure. The sidecar handles all network communication, abstracting the differences between providers behind a unified interface.

Skills CLI

Before the AI provider starts, the container runs the skills installation step. It iterates over the configured skills for the team and installs each one using the npx skills add command. Skills are placed in .agents/skills/ with symlinks created in .claude/skills/.

Workspace

The /workspace directory inside the container is either:

A bind mount from a host directory (when a workspace path is configured), or
A Docker volume (when no workspace path is specified).

This is where agents read and write project files. The .claude/ directory within the workspace contains the leader instructions, agent definitions, and skills.

.claude/ Directory Structure

/workspace/.claude/
  CLAUDE.md              # Leader agent instructions and team context
  agents/
    frontend-dev.md      # Worker: Frontend Developer definition
    backend-dev.md       # Worker: Backend Developer definition
    devops-engineer.md   # Worker: DevOps Engineer definition
  skills/
    skill-name → ../../.agents/skills/skill-name   # Symlinks to installed skills

Leader vs Workers

Leader

The leader is the only agent that runs inside a container. It receives messages from the user, interprets the request, and coordinates work by delegating tasks to workers. The leader's instructions are defined in /workspace/.claude/CLAUDE.md.

Workers

Workers do not run in separate containers. Instead, they are defined as Markdown files in /workspace/.claude/agents/. Claude Code reads these files and spawns workers as sub-agents within the same process. Each worker's .md file contains its name, role, and detailed instructions.

This design keeps resource usage efficient. Only one container runs per team, regardless of how many workers are defined.

Docker Runtime

When a team is created, the API server provisions the following Docker resources:

Resource	Description
Network	An isolated Docker network for the team, connecting the NATS container and the agent container.
NATS Container	A NATS server instance dedicated to the team, configured with the shared auth token.
Workspace Volume	A Docker volume (or bind mount) for the `/workspace` directory.
Leader Container	The agent container running the sidecar process and the AI provider CLI.

All resources are namespaced by team ID to avoid conflicts between multiple running teams.

Kubernetes Runtime (Coming Soon)

A Kubernetes runtime is planned for production deployments. The design follows the same logical architecture with Kubernetes-native resources:

Namespace per team: Isolation between teams using Kubernetes namespaces.
PersistentVolumeClaims: For workspace storage and database persistence.
Pods: NATS and agent containers running as pods with appropriate resource limits.
Services: Internal networking between NATS and agent pods.

The Kubernetes runtime will support horizontal scaling, better resource management, and integration with existing cluster infrastructure.

Task Processing Model

Each agent team runs a single AI agent process inside its container. The agent handles one request at a time, in the order received.

FIFO Queue

When multiple messages arrive concurrently (e.g., two scheduled tasks firing at the same time, or a chat message while a schedule is running), the sidecar queues messages and sends them one at a time. The agent processes them in FIFO order: the first message in is the first message answered.

The sidecar maintains an internal correlation queue to match each response back to the correct request. This ensures that scheduled task A receives the response meant for task A, even if task B was sent moments later.

  Concurrent requests                    Sequential processing
  ┌──────────────────┐                   ┌─────────────────────┐
  │ Schedule A ──────┼──▶ stdin ──▶      │ Claude processes A  │
  │ Schedule B ──────┼──▶ (queued)       │ Claude processes B  │
  │ Chat message ────┼──▶ (queued)       │ Claude processes C  │
  └──────────────────┘                   └─────────────────────┘

  Correlation queue: [A, B, Chat]
  Response A → matched to Schedule A
  Response B → matched to Schedule B
  Response C → matched to Chat

Queue Implementation per Provider

Both providers achieve the same FIFO behavior, but the serialization mechanism differs due to how each CLI accepts input:

Aspect	Claude Code	OpenCode
Interface	stdin/stdout	HTTP REST + SSE
SendInput behavior	Blocking — the call writes to stdin and waits for the full response on stdout before returning.	Non-blocking — the HTTP POST to `/prompt_async` returns immediately.
Serialization	Natural: because the call blocks, the next message cannot be sent until the current one finishes.	Explicit queue: the sidecar maintains a `busy` flag and an in-memory pending queue. When a prompt is in flight, new messages are queued. When the SSE stream emits a `result` event, the next queued message is drained and sent.

In both cases, the bridge layer maintains a shared correlation queue (scheduledRunIDs) that maps each response back to the correct scheduled run, regardless of provider.

Why One Process per Team?

The AI provider maintains conversational context across messages. Running a single process per team means the agent retains awareness of previous interactions within the same session — a schedule can build on context from earlier messages. One process, one conversation thread.

Note: The scheduler engine itself can launch many executions in parallel (controlled by SCHEDULER_MAX_CONCURRENT), but within each team, messages are processed sequentially. If two schedules target the same team, the second waits for the first to finish.

Data Flow: Sending a Message

Here is the complete flow when a user sends a message to a team:

The user types a message in the frontend chat interface.
The frontend sends the message to the API server via HTTP/WebSocket.
The API server publishes the message to the team's NATS subject.
The sidecar process inside the agent container receives the NATS message.
The sidecar forwards the message to the AI provider.
The AI agent processes the request, potentially delegating to worker sub-agents.
The agent produces response output.
The sidecar reads the output and publishes response chunks to NATS.
The API server receives NATS messages and forwards them to the frontend.
The frontend renders the response in real time as chunks arrive.

Next Steps

Skills: Understand how skills extend agent capabilities and integrate into the container.
Configuration: Review all configuration options for customizing your deployment.
Quick Start: Get AgentCrew running locally in under 5 minutes.