Every complex system in the real world delegates. A CEO does not write code, file taxes, and design the logo. The moment a system tries to do everything itself, it becomes the bottleneck for everything.
AI agents hit the same wall. A single agent trying to write code, review it, check types, run tests, analyze security, and optimize SEO in one context window is not a powerful agent. It is a confused one. Context gets polluted. Errors compound. Quality drops with every additional responsibility.
Subagents are how capable AI systems scale beyond the limits of a single context window.
If you are new to AI agents, start with our foundational guide, "What Are AI Agents?" For a broader look at how agents are categorized, see "Types of AI Agents."
What Are AI Subagents?
An AI subagent is a specialized agent spawned by a parent agent to perform a specific, scoped task, then return its results to the parent. The parent agent acts as an orchestrator -- it decomposes work, dispatches subagents, collects their outputs, and synthesizes the final result. Subagents do not operate independently. They exist within the lifecycle of a parent session and report exclusively to the agent that spawned them.
This is fundamentally different from a standalone agent, which owns its entire workflow end to end. A subagent owns one piece of a larger workflow that someone else is coordinating. The distinction is structural, not just semantic -- it determines how context is managed, how failures are handled, and how work gets parallelized.
Think of it as the difference between a freelancer and a specialist on a project team. The freelancer sets their own goals and delivers a complete product. The specialist receives a defined brief from a project lead, executes their portion, and hands the result back. Both are capable, but they operate in different organizational structures.
The Parent-Child Relationship
The relationship between a parent agent and its subagents follows a clear hierarchy.
The parent agent is responsible for planning. It reads the overall task, breaks it into discrete units of work, decides which subagent is best suited for each unit, dispatches them, and then integrates their results. The parent holds the big picture. It never loses sight of the overall goal, even while individual subagents are deep in the details of their assigned piece.
The subagent is responsible for execution within a defined scope. It receives a task description, a set of allowed tools, and a model assignment. It does its work, produces a result, and returns it to the parent. A subagent does not communicate with other subagents directly. It does not make decisions about the broader project. It focuses entirely on its assigned task.
This one-way communication pattern -- parent to child, child back to parent, never child to child -- is a deliberate design choice. It prevents the coordination overhead that explodes when every agent can talk to every other agent. In AI agent swarms, peer-to-peer communication is sometimes used for emergent behavior. Subagent architectures trade that flexibility for predictability and control.
The result is a tree structure. The parent sits at the root. Subagents are leaf nodes. Work flows down as task assignments and flows back up as results. This is simple to reason about, simple to debug, and simple to scale.
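The hierarchy above can be sketched in a few lines of Python. This is a minimal illustration, not any particular framework's API: `SubagentTask`, `SubagentResult`, `orchestrate`, and `fake_runner` are hypothetical names, and the runner is a stand-in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SubagentTask:
    """What the parent hands down: a brief, a tool allowlist, and a model."""
    description: str
    allowed_tools: tuple[str, ...]
    model: str


@dataclass(frozen=True)
class SubagentResult:
    """What flows back up to the parent."""
    task: SubagentTask
    output: str


# Hypothetical stand-in for a real model call: takes a task, returns text.
Runner = Callable[[SubagentTask], str]


def orchestrate(goal: str, tasks: list[SubagentTask], run: Runner) -> str:
    """Dispatch each task, collect the results, and synthesize one report.

    Each subagent sees only its own task. Only the parent sees every result.
    """
    results = [SubagentResult(task, run(task)) for task in tasks]
    lines = [f"Goal: {goal}"]
    lines += [f"- {r.task.description}: {r.output}" for r in results]
    return "\n".join(lines)


def fake_runner(task: SubagentTask) -> str:
    # Placeholder: a real runner would invoke task.model with task.allowed_tools.
    return f"done by {task.model}"


report = orchestrate(
    "ship the login fix",
    [
        SubagentTask("check types", ("read",), "haiku"),
        SubagentTask("review security", ("read", "grep"), "opus"),
    ],
    fake_runner,
)
print(report)
```

Because the runner receives nothing but its own task, there is no path for one subagent to read or alter another's output, which is the tree structure described above.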
Scope Isolation: Why It Matters
The most important property of a well-designed subagent system is scope isolation. Each subagent operates in its own bounded context, separated from the work of other subagents running in parallel.
Without isolation, parallel agents working on the same codebase will step on each other. Agent A modifies a file. Agent B modifies the same file based on its original state. When both finish, one agent's work overwrites the other's. This is not a theoretical risk -- it is a guaranteed outcome when parallel work touches shared state without coordination.
The solution is workspace isolation. In software development, this typically means git worktrees -- lightweight, independent working copies of a repository that share the same underlying git objects but have completely separate working directories. Each subagent gets its own worktree. It can read, write, create, and delete files without affecting any other subagent's workspace. When the work is done, the parent merges results back using sequential rebasing.
This is the same principle behind process isolation in operating systems. Each process gets its own memory space so it cannot accidentally corrupt another's state. Subagent worktree isolation provides the same guarantee at the project level. Four subagents work simultaneously on different parts of a codebase without any of them knowing the others exist.
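A rough sketch of per-subagent worktrees, assuming git is installed and the repository has a `main` branch. The directory layout (worktrees placed beside the main checkout) and the `subagent/<name>` branch naming are illustrative assumptions, not a description of any specific system. The helpers build the commands separately from running them so the commands can be inspected first.

```python
import subprocess
from pathlib import Path


def worktree_path(repo: Path, name: str) -> Path:
    """Place each worktree beside the main checkout (a layout assumption)."""
    return repo.parent / f"{repo.name}-{name}"


def add_command(repo: Path, name: str, base: str = "main") -> list[str]:
    """Build `git worktree add` for a new branch in its own directory."""
    return [
        "git", "-C", str(repo), "worktree", "add",
        "-b", f"subagent/{name}", str(worktree_path(repo, name)), base,
    ]


def remove_command(repo: Path, name: str) -> list[str]:
    """Build `git worktree remove` to clean up once results are merged."""
    return ["git", "-C", str(repo), "worktree", "remove", str(worktree_path(repo, name))]


def create_worktree(repo: Path, name: str, base: str = "main") -> Path:
    """Run the add command; raises CalledProcessError if git refuses."""
    subprocess.run(add_command(repo, name, base), check=True)
    return worktree_path(repo, name)
```

Each subagent would then be started with its working directory set to the path returned by `create_worktree`, so its file edits never touch another subagent's checkout.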
How Subagents Differ from Standalone Agents
The key differences between subagents and standalone agents come down to lifecycle, authority, context, and failure handling.
Lifecycle. A standalone agent persists across sessions with its own memory and operational continuity. A subagent is ephemeral -- spawned for a task and terminated when complete.
Authority. A standalone agent decides what to work on and when to escalate. A subagent receives its authority from its parent. Its scope, tools, and model are defined by the orchestrator.
Context. A standalone agent manages its own context window. A subagent receives focused context from its parent -- just enough to do its job. This is an advantage: a subagent reviewing code for security vulnerabilities does not need to know about the deployment pipeline. Narrower context, sharper focus.
Failure handling. When a standalone agent fails, it handles its own recovery. When a subagent fails, the parent decides what happens -- retry, skip, substitute, or escalate. The failure does not propagate to other subagents. One crash does not take down the others.
Real-World Subagent Architecture: How Nevo Does It
Nevo is a self-improving AI agent system that uses subagent orchestration as its primary execution model. The main session -- powered by Opus -- acts as the orchestrator. It never writes code directly except for trivial one-liners. Instead, it delegates to 20 specialized subagents.
Quality pipeline subagents run in sequence after every code change:
- typechecker (Haiku) -- Type checking. Fast, cheap, catches errors early.
- test-runner (Sonnet) -- Writes and runs tests. Mid-tier model for balance.
- linter (Haiku) -- Style checks. Does not need a large model.
- code-critic (Opus) -- Reviews code against a rigorous rubric.
- fresh-reviewer (Opus) -- Independent review with no prior context.
- quality-arbiter (Opus) -- Final approve or deny. Last gate before code ships.
Other subagents handle specialized work outside that pipeline:
- shopify-designer (Opus) -- Shopify themes, Liquid templating, web design.
- seo-specialist (Sonnet) -- Technical SEO, structured data, search optimization.
- content-writer (Opus) -- Website copy, blog posts, marketing materials.
- security-reviewer (Opus) -- Security-focused code review, vulnerability detection.
- ai-research-specialist (Sonnet) -- Ecosystem research and competitive analysis.
- incident-monitor (Sonnet) -- Detects error patterns from tool failures.
- incident-analyst (Opus) -- Root cause analysis. Automatically generates new rules.
- skill-writer (Opus) -- Creates reusable skills from successful patterns.
Each subagent is defined as a Markdown file with YAML frontmatter specifying its name, allowed tools, and model. The parent dispatches a task description and receives the result.
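To make the file format concrete, here is a hypothetical definition and a small parser for it. The text above only says the frontmatter specifies a name, allowed tools, and a model; the exact keys (`name`, `tools`, `model`), the tool names, and the prompt body below are assumptions for illustration. The parser handles only flat `key: value` pairs, a small subset of YAML, to stay dependency-free.

```python
from dataclasses import dataclass

# Hypothetical subagent file: YAML frontmatter, then a Markdown prompt body.
EXAMPLE_DEFINITION = """\
---
name: typechecker
tools: Read, Bash
model: haiku
---
Run the project's type checker and report every error with file and line.
"""


@dataclass(frozen=True)
class SubagentDefinition:
    name: str
    tools: tuple[str, ...]
    model: str
    prompt: str


def parse_definition(text: str) -> SubagentDefinition:
    """Split '---' delimited frontmatter from the Markdown body.

    Handles only flat `key: value` lines, not nested or multi-line YAML.
    """
    _, frontmatter, body = text.split("---", 2)
    fields = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    tools = tuple(t.strip() for t in fields["tools"].split(",") if t.strip())
    return SubagentDefinition(fields["name"], tools, fields["model"], body.strip())


definition = parse_definition(EXAMPLE_DEFINITION)
print(definition.name, definition.tools, definition.model)
```

A production loader would more likely use a full YAML library so that lists and quoted values parse correctly.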
Model Routing
Not every task needs the most powerful model. Nevo routes subagents to different models based on task complexity:
- Haiku for fast, mechanical tasks (type checking, linting) -- low cost, high speed.
- Sonnet for balanced tasks (testing, SEO analysis, research) -- good reasoning at moderate cost.
- Opus for tasks requiring deep reasoning (code critique, security review, incident analysis) -- highest quality, higher cost.
This model routing means the system is not paying Opus prices for work that Haiku can handle perfectly well. The orchestrator makes the routing decision once when defining the subagent, and every invocation uses the appropriate model automatically.
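As a sketch, routing of this kind reduces to a lookup table fixed at definition time. The subagent names and tiers come from the lists above; the `Tier` enum, the `ROUTES` table, and the lowercase model strings are illustrative placeholders rather than real API model identifiers.

```python
from enum import Enum


class Tier(Enum):
    MECHANICAL = "haiku"   # fast, low-cost work
    BALANCED = "sonnet"    # moderate reasoning at moderate cost
    DEEP = "opus"          # deep reasoning, highest cost


# Decided once per subagent, not once per invocation.
ROUTES = {
    "typechecker": Tier.MECHANICAL,
    "linter": Tier.MECHANICAL,
    "test-runner": Tier.BALANCED,
    "seo-specialist": Tier.BALANCED,
    "code-critic": Tier.DEEP,
    "security-reviewer": Tier.DEEP,
}


def model_for(subagent: str) -> str:
    """Return the model assigned to a subagent; unknown names fail loudly."""
    try:
        return ROUTES[subagent].value
    except KeyError:
        raise ValueError(f"no model route defined for subagent {subagent!r}") from None


print(model_for("linter"), model_for("code-critic"))
```

Failing on an unknown name, rather than silently defaulting to the most capable model, keeps an unrouted subagent from quietly running at the highest price.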
Parallel Dispatch
Nevo dispatches up to four concurrent subagents for independent work. The orchestrator checks for file scope overlap before dispatching -- if two stories touch the same files, they run sequentially. If they touch different files, they run in parallel with worktree isolation.
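One way to implement this kind of overlap check is a greedy batch planner. This is an assumed approach for illustration, not Nevo's actual scheduler; `Story` and `plan_batches` are hypothetical names. A story is placed in the earliest batch that comes after every batch containing a conflicting story, subject to the cap of four.

```python
from dataclasses import dataclass

MAX_PARALLEL = 4  # concurrency cap stated in the text


@dataclass(frozen=True)
class Story:
    name: str
    files: frozenset[str]


def plan_batches(stories: list[Story], max_parallel: int = MAX_PARALLEL) -> list[list[Story]]:
    """Group stories into batches that can run in parallel.

    Stories sharing a file never land in the same batch, and the later one
    always lands in a later batch, so overlapping work stays sequential.
    """
    batches: list[list[Story]] = []
    for story in stories:
        # Index of the latest batch holding a story that shares a file.
        last_conflict = -1
        for i, batch in enumerate(batches):
            if any(story.files & other.files for other in batch):
                last_conflict = i
        # Use the first later batch with room; otherwise open a new batch.
        for batch in batches[last_conflict + 1:]:
            if len(batch) < max_parallel:
                batch.append(story)
                break
        else:
            batches.append([story])
    return batches


plan = plan_batches([
    Story("A", frozenset({"a.py"})),
    Story("B", frozenset({"b.py"})),
    Story("C", frozenset({"a.py", "c.py"})),  # overlaps A, so it must wait
    Story("D", frozenset({"d.py"})),
])
print([[s.name for s in batch] for batch in plan])
```

Batches then run one after another, with the stories inside each batch dispatched concurrently.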
After a parallel batch completes, results merge using sequential rebasing: Story A merges first, Story B rebases onto A, Story C rebases onto A+B. Clean, linear history regardless of how many agents worked in parallel.
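The merge order can be sketched as a sequence of git commands. The branch names are hypothetical, and the sketch assumes each story branch is no longer checked out in a worktree when the commands run, since git refuses to rebase a branch that is checked out elsewhere. The function only builds the commands; running them is left to the caller.

```python
def sequential_rebase_commands(branches: list[str], target: str = "main") -> list[list[str]]:
    """Build git commands that land branches one at a time with linear history."""
    commands: list[list[str]] = []
    for branch in branches:
        # Replay this branch on top of everything merged so far.
        commands.append(["git", "rebase", target, branch])
        # The rebased branch now descends directly from target,
        # so a fast-forward-only merge is enough.
        commands.append(["git", "switch", target])
        commands.append(["git", "merge", "--ff-only", branch])
    return commands


for command in sequential_rebase_commands(["story-a", "story-b", "story-c"]):
    print(" ".join(command))
```

Using `--ff-only` makes the merge fail rather than create a merge commit if a rebase was skipped, which protects the linear history.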
Benefits of Subagent Architecture
Focused Context
A single agent holding an entire project in its context window will lose track of details. A subagent dedicated to one component can use its full context window to understand that component deeply. Narrower scope produces better results.
Parallel Execution
Sequential processing is the default: do step one, wait, do step two, wait. Subagent parallelism breaks this. Four subagents working on independent tasks of similar size finish in roughly the time one agent takes for a single task, plus the time needed to merge their results.
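A small `asyncio` sketch of capped parallel dispatch. The sleep is a stand-in for real subagent work, and `run_all` and the story names are hypothetical. A semaphore enforces the limit, and the function reports the peak concurrency it observed so the cap can be checked.

```python
import asyncio

MAX_PARALLEL = 4


async def run_all(tasks: dict[str, float], max_parallel: int = MAX_PARALLEL):
    """Run simulated subagents concurrently, never more than max_parallel at once.

    Returns (results by name, peak number of subagents running together).
    """
    gate = asyncio.Semaphore(max_parallel)
    running = 0
    peak = 0

    async def run_one(name: str, seconds: float) -> tuple[str, str]:
        nonlocal running, peak
        async with gate:
            running += 1
            peak = max(peak, running)
            try:
                await asyncio.sleep(seconds)  # stand-in for real subagent work
                return name, f"{name}: ok"
            finally:
                running -= 1

    pairs = await asyncio.gather(*(run_one(n, s) for n, s in tasks.items()))
    return dict(pairs), peak


results, peak = asyncio.run(run_all({f"story-{i}": 0.01 for i in range(6)}))
print(peak, sorted(results))
```

With six tasks and a cap of four, the last two wait for a free slot, so total time is about two task durations rather than six.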
Failure Isolation
When a subagent fails, the failure is contained. The parent decides how to handle it. The other subagents continue undisturbed. In a monolithic agent, one failure can corrupt the context for everything that follows.
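A minimal sketch of parent-side failure handling, covering retry, skip, and escalate. The `OnFailure` enum and `run_isolated` are hypothetical names, and the jobs are plain callables standing in for subagent runs. Each job's exception is caught inside its own iteration, so one failure never stops the others.

```python
from enum import Enum
from typing import Callable


class OnFailure(Enum):
    RETRY = "retry"
    SKIP = "skip"
    ESCALATE = "escalate"


def run_isolated(
    jobs: dict[str, Callable[[], str]],
    policies: dict[str, OnFailure],
    max_attempts: int = 2,
) -> dict[str, tuple[str, str]]:
    """Run every job and record (status, detail) for each one by name."""
    outcomes: dict[str, tuple[str, str]] = {}
    for name, work in jobs.items():
        policy = policies.get(name, OnFailure.ESCALATE)
        attempts = max_attempts if policy is OnFailure.RETRY else 1
        for _ in range(attempts):
            try:
                outcomes[name] = ("ok", work())
                break
            except Exception as exc:  # contain the failure to this job
                last_error = exc
        else:
            # Every attempt failed: skip quietly or hand the problem upward.
            status = "skipped" if policy is OnFailure.SKIP else "escalated"
            outcomes[name] = (status, str(last_error))
    return outcomes


def broken() -> str:
    raise RuntimeError("boom")


outcomes = run_isolated(
    {"linter": broken, "typechecker": lambda: "types ok"},
    {"linter": OnFailure.SKIP},
)
print(outcomes)
```

Here a retry that exhausts its attempts is escalated; a fuller version would also support substituting a different subagent.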
Specialization
A subagent that only does security review gets better at security review. Its prompt, tools, and model are all optimized for that task. Specialization produces higher quality than a generalist agent trying to be adequate at everything.
Cost Efficiency
Model routing means you are not burning premium tokens on tasks that do not need them. A type-checking pass does not need a model that costs ten times more than the minimum. Expensive models are reserved for tasks that genuinely benefit from deep reasoning.
When to Use Subagents vs. Other Architectures
Subagents are not the only multi-agent pattern. Here is when they are the right choice and when to consider alternatives.
Use subagents when you have a clear hierarchy of work, where one agent can decompose tasks and others can execute them independently. Code review pipelines, content production workflows, and quality assurance chains are natural fits.
Consider agent swarms when you need emergent behavior from peer-to-peer interaction, where no single agent has the full picture and the collective behavior of many agents produces better results than top-down orchestration.
Consider standalone agents when the task is self-contained and does not benefit from decomposition. Not every problem needs to be broken into subtasks. Sometimes one capable agent with the right tools is the simplest and best solution.
FAQ
What is the difference between a subagent and a regular AI agent?
A regular AI agent operates independently with its own goals, memory, and lifecycle. A subagent is spawned by a parent agent for a specific task within a larger workflow. It receives its scope, tools, and model from the parent and reports results back exclusively to the agent that created it. The structural difference: subagents exist within the lifecycle of a parent session rather than operating on their own.
How many subagents can run at the same time?
It depends on the orchestration system. Nevo dispatches up to four concurrent subagents with worktree isolation, though the runtime supports up to seven. The constraint is merge complexity and file scope isolation, not compute. Running four concurrent subagents with clean separation is more reliable than running seven with potential conflicts.
Do subagents share context with each other?
No. Each subagent operates in its own isolated context. They do not communicate directly -- all coordination flows through the parent agent. This prevents context pollution, avoids race conditions, and ensures one subagent's failure does not affect another's work.
How does model routing work for subagents?
Model routing assigns different AI models to subagents based on task complexity. Mechanical tasks like type checking use a fast, inexpensive model. Balanced tasks like testing use a mid-tier model. Deep reasoning tasks like security review use the most capable model. Routing is defined in the subagent's configuration and applies automatically on every invocation.
Nevo is a self-improving AI agent system that uses subagent orchestration to deliver professional-grade work through specialized, parallel, quality-gated execution. Learn more about how Nevo works at nevo.systems.