What Are Claude Code Subagents? (And Why Specialists Beat Generalists)

Claude Code subagents are specialized AI agents that the main Claude Code session delegates work to. Each subagent runs in its own context window with its own tool permissions and system prompt. The main session never sees the scratch work — just the conclusion. This isolation is what makes specialists reliably outperform a single generalist agent on complex, multi-step tasks.

With Anthropic's Dynamic Workflows research preview (shipped May 2026) supporting up to 1,000 concurrent subagents, the architecture is no longer experimental. Here is how it works, when to use it, and what ClaudeKit ships out of the box.

Why does one generalist agent hit a ceiling?

A single Claude session doing everything accumulates everything: half-finished plans, file reads, dead ends, and the entire conversation history from the moment you opened the terminal. By the second hour of a complex task, the context is polluted and output quality drops in step with it.

Three specific failure modes emerge:

- Context pollution. Research notes, intermediate plans, and discarded drafts crowd out the actual task. The model spends attention budget re-reading its own mess.
- Role drift. One agent reviewing code it wrote grades generously. It has an unconscious stake in the verdict being "looks fine."
- Sequential bottleneck. Research, tests, and documentation queue behind each other when they could run in parallel.

Subagents solve all three at once. The delegated task runs in a fresh context, returns only its conclusion, and the main session never pays for the investigation — only the answer. Anthropic's own research confirms that context contamination is the leading cause of degraded output in long agentic sessions. The fix is not a bigger context window; it is fewer things inside the window you are using.
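The isolation and parallelism described above can be modeled in a few lines of plain Python. This is a toy sketch, not a Claude Code API: `run_isolated` is a hypothetical stand-in for a delegated task, and the `scratch` list stands in for the subagent's private context.

```python
from concurrent.futures import ThreadPoolExecutor


def run_isolated(task: str) -> str:
    """Hypothetical stand-in for one delegated task.

    The scratch list models the subagent's private context: it is built
    and discarded inside this call, and only the conclusion is returned.
    """
    scratch = [f"{task}: note {i}" for i in range(40)]  # private working notes
    return f"{task}: done after reviewing {len(scratch)} notes"


tasks = ["research", "tests", "documentation"]

# The three tasks are independent, so they can run side by side
# instead of queueing behind each other.
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    conclusions = list(pool.map(run_isolated, tasks))

# The "main session" only ever holds the three short conclusions.
main_context = conclusions
for line in main_context:
    print(line)
```

The point of the sketch is the shape of the data flow: forty notes per task are created, but the caller's state grows by one line per task.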

How is a subagent defined in Claude Code?

Mechanically, a subagent is a Markdown file in .claude/agents/ (project-scoped) or ~/.claude/agents/ (global). The frontmatter declares its identity; the body is its system prompt:

---
name: seo-auditor
description: Audits a page for SEO issues. Invoke after any content
  or metadata change to a public-facing page.
tools: Read, Grep, Glob, WebFetch
model: claude-sonnet-4-5
---
 
You are a senior SEO auditor. You did not write this page.
Check title tag, meta description, heading structure, internal links,
and schema markup. Return findings as a prioritized list with file:line
references for every issue. Do not rewrite the page — audit only.

Four parts do the real work:

| Field | What it controls | Why it matters |
| --- | --- | --- |
| `description` | When the main session delegates | Vague description causes wrong delegations |
| `tools` | Allowed tool calls (allowlist) | A read-only auditor cannot accidentally rewrite the file it judges |
| `model` | Which Claude model runs this agent | Lets you tier cost: haiku for mechanical scans, sonnet for judgment calls |
| Body (system prompt) | Entire personality and instructions | The subagent inherits nothing from your session; it knows only what is written here |

The description field is the one most people underinvest in. The main session reads that description to decide whether to delegate. Write it like a job posting for a very specific role: what does this agent do, what triggers it, what does it refuse to do.
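To make the file layout from this section concrete, here is a minimal sketch that splits an agent file into its frontmatter fields and its system prompt body. It is an illustration only, not how Claude Code itself reads these files: `AgentDefinition` and `parse_agent_file` are hypothetical names, and the parser handles only flat `key: value` pairs with indented continuation lines.

```python
from dataclasses import dataclass


@dataclass
class AgentDefinition:
    name: str
    description: str
    tools: list[str]
    model: str
    system_prompt: str


def parse_agent_file(text: str) -> AgentDefinition:
    """Split an agent file into frontmatter fields and the system prompt body.

    Simplified on purpose: a real YAML parser would be the safer choice
    for anything beyond flat keys and indented continuation lines.
    """
    lines = [ln.rstrip() for ln in text.strip().splitlines()]
    if not lines or lines[0] != "---":
        raise ValueError("agent file must start with a '---' frontmatter line")
    try:
        end = lines.index("---", 1)  # closing frontmatter delimiter
    except ValueError:
        raise ValueError("frontmatter is missing its closing '---' line")

    fields: dict[str, str] = {}
    key = None
    for line in lines[1:end]:
        if line.startswith((" ", "\t")) and key is not None:
            fields[key] += " " + line.strip()  # continuation of the previous value
        elif ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            fields[key] = value.strip()

    return AgentDefinition(
        name=fields["name"],
        description=fields["description"],
        tools=[t.strip() for t in fields.get("tools", "").split(",") if t.strip()],
        model=fields.get("model", ""),
        system_prompt="\n".join(lines[end + 1:]).strip(),
    )


EXAMPLE = """\
---
name: seo-auditor
description: Audits a page for SEO issues. Invoke after any content
  or metadata change to a public-facing page.
tools: Read, Grep, Glob, WebFetch
model: claude-sonnet-4-5
---

You are a senior SEO auditor. You did not write this page.
"""

agent = parse_agent_file(EXAMPLE)
print(agent.name, agent.tools)
print(agent.description)
```

Seeing the description as a single joined string is a quick way to review it the way a job posting would be reviewed: role, trigger, and refusals in one place.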

What does the isolation actually give you?

The counterintuitive claim: a subagent's amnesia is its primary feature, not a limitation.

Because the subagent cannot see your session, it cannot be biased by it. A reviewer agent that never wrote the code has no ego in the verdict. Because its scratch work disappears when it returns, a subagent can read forty files to answer one question and your main context grows by a single paragraph. You get the conclusion without paying for the investigation in attention tokens.

We measured this in practice. A /seo audit run on a 12-page site typically reads 40-60 files, runs 8-12 Grep calls, and produces a 300-word findings report. If that work happened inline, it would consume roughly 18,000-25,000 tokens of main context. Through a subagent, the main session receives the report — around 600-800 tokens. The reduction is not marginal; it is the difference between a session that stays sharp through a full workday and one that degrades by mid-morning.
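The size of that reduction follows directly from the figures quoted above. A short worked calculation, using only those ranges (the variable names are illustrative):

```python
# Figures quoted in the paragraph above (the article's own estimates).
inline_tokens = (18_000, 25_000)   # cost if the audit ran in the main session
delegated_tokens = (600, 800)      # size of the report returned by the subagent

# Most conservative case: cheapest inline run vs. largest report.
worst_ratio = inline_tokens[0] / delegated_tokens[1]
# Most favorable case: most expensive inline run vs. smallest report.
best_ratio = inline_tokens[1] / delegated_tokens[0]

worst_saving = 1 - delegated_tokens[1] / inline_tokens[0]
best_saving = 1 - delegated_tokens[0] / inline_tokens[1]

print(f"main context is {worst_ratio:.1f}x to {best_ratio:.1f}x smaller")
print(f"saving: {worst_saving:.1%} to {best_saving:.1%}")
# main context is 22.5x to 41.7x smaller
# saving: 95.6% to 97.6%
```

Even in the conservative case, the main session keeps under 5% of the tokens the inline run would have cost.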

The cost of isolation is briefing overhead. The subagent needs the task spelled out precisely, because it cannot infer anything from a conversation it never saw. Delegation pays off when the work is deep and the answer is small. It is overhead when the task requires constant back-and-forth with you.

How do read-only agents differ from orchestrator agents?

This is where the v1 and v2 ClaudeKit architectures diverge sharply, and where a lot of community advice is now outdated.

The v1 pattern was: a main orchestrator delegates to worker agents, which pass output to a reviewer/quality-gate agent, which either approves or blocks. It sounds rigorous. In practice it created fragile chains: the gate could block on criteria that did not matter, intermediate context leaked across agents, and the whole pipeline was slow.

ClaudeKit v2 uses read-only specialist agents exclusively. Every agent in the 13-agent bench is an auditor, reviewer, or researcher — never an orchestrator, never a blocking gate. The pattern is:

1. A slash command does the work and produces an artifact (report, diff, verified file).
2. An optional read-only agent reviews the artifact if the workflow calls for a second opinion.
3. The command ends with EVIDENCE, not a reviewer verdict.

The key distinction: the agent reviews but never blocks. The human decides what to do with the finding. We wrote up the full reasoning in why we killed the reviewer gate.

This is also why Claude's Dynamic Workflows preview (up to 1,000 subagents) is interesting for research tasks but less relevant to daily coding and content workflows. At that scale, you need orchestrators. For the work most teams actually do — ship code, write content, audit sites, build videos — 13 focused read-only specialists cover the surface area with far less complexity.

What subagents does ClaudeKit ship?

ClaudeKit v2 ships 13 read-only agents across 5 kits. Here is the full breakdown:

| Kit | Agents | Agent roles |
| --- | --- | --- |
| EngineerKit (/eng) | 4 | Code reviewer, security auditor, test coverage auditor, architecture reviewer |
| MarketingKit (/mkt) | 2 | Copy reviewer (voice rules), content auditor |
| VideoKit (/video) | 3 | Script reviewer, scene validator, asset auditor |
| SEOKit (/seo) | 2 | SEO auditor, citation researcher |
| EcomKit (/ecom) | 2 | Store triage auditor, listing quality reviewer |

All 13 are read-only by design. None can write files, run deploys, or send requests. The tool allowlist for every agent is either [Read, Grep, Glob] or [Read, Grep, Glob, WebFetch] — nothing that causes side effects.
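A check like the following could confirm that an agent's `tools` value stays inside that allowlist. It is a sketch under assumptions: `side_effect_tools` is a hypothetical helper, not part of ClaudeKit or Claude Code, and `Edit` and `Bash` appear only as examples of tools that can cause side effects.

```python
# Allowlist named in the paragraph above.
READ_ONLY_TOOLS = {"Read", "Grep", "Glob", "WebFetch"}


def side_effect_tools(tools_field: str) -> list[str]:
    """Return the tools in a frontmatter `tools:` value that fall outside the allowlist."""
    declared = [t.strip() for t in tools_field.split(",") if t.strip()]
    return [t for t in declared if t not in READ_ONLY_TOOLS]


print(side_effect_tools("Read, Grep, Glob"))        # []
print(side_effect_tools("Read, Grep, Edit, Bash"))  # ['Edit', 'Bash']
```

An empty result means the agent can read and search but cannot change anything it is asked to judge.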

The flagship pattern in EngineerKit is worth examining. The /eng review command runs a diff, then delegates to the code reviewer agent with the diff as input. The agent reads the affected files, checks for correctness bugs and risky patterns, and returns findings with file:line references. The command appends those findings to the PR summary as EVIDENCE. No gate, no block — a second opinion baked into the workflow at near-zero overhead.

Compare that to a naive approach: asking the main session to "review this diff" at the end of a long coding session. The model has read every file you touched, watched every decision you made, and is primed to agree with you. The subagent reviewed none of it. It just reads the diff cold.
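The shape of that review step can be sketched in Python. This is not ClaudeKit's implementation: `Finding`, `build_pr_summary`, and `canned_reviewer` are hypothetical names, and the canned finding is made-up sample data. The reviewer is modeled as a function that sees only the diff, and its output is appended as evidence rather than used to block anything.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    path: str
    line: int
    message: str


# A reviewer is any callable that takes a diff and returns findings.
# In the workflow described above, a read-only subagent plays this role.
Reviewer = Callable[[str], list[Finding]]


def build_pr_summary(summary: str, diff: str, reviewer: Reviewer) -> str:
    """Append reviewer findings to a PR summary as evidence; never block."""
    findings = reviewer(diff)  # the reviewer receives the diff and nothing else
    lines = [summary, "", "EVIDENCE"]
    if not findings:
        lines.append("- reviewer reported no findings")
    for f in findings:
        lines.append(f"- {f.path}:{f.line} {f.message}")
    return "\n".join(lines)


def canned_reviewer(diff: str) -> list[Finding]:
    """Hypothetical stand-in that returns one fixed finding to drive the example."""
    return [Finding("src/cart.py", 42, "total is not rounded before comparison")]


result = build_pr_summary("Adds discount codes to checkout.", "(diff text)", canned_reviewer)
print(result)
```

Note that `build_pr_summary` has no failure path tied to the findings: whatever the reviewer returns, the summary is produced and the decision stays with the person reading it.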

How do subagents relate to skills and slash commands?

Three terms that frequently get conflated:

| Concept | Who invokes it | Lives where | Has own context? |
| --- | --- | --- | --- |
| Slash command | You type `/eng debug` | `.claude/commands/` | No, runs in main session |
| Skill | Model loads automatically | `.claude/skills/` | No, loads into main session |
| Subagent | Main session delegates | `.claude/agents/` | Yes, isolated context window |

They compose well. A slash command can orchestrate multiple subagents. A subagent can use skills loaded into its own context. The pattern ClaudeKit uses most often: a command runs a workflow, hits a judgment-call step, delegates to a read-only specialist, and incorporates the finding into its final output.

The full comparison with examples from every kit lives in agents vs skills vs slash commands.

When should you build your own subagent vs use a kit's?

Write your own subagent when:

- You have a recurring judgment task that is specific to your stack or domain.
- The task involves reading many files to produce a short verdict.
- You want a second opinion that cannot be biased by the session that did the work.
- You need cost tiering: a fast cheap model for a mechanical scan, a stronger model only when the fast one flags something.

Use a kit agent when:

- The task is domain-standard: code review, SEO audit, copy scoring against a voice guide.
- You want a starting point you can fork rather than write from scratch.
- You want the agent already wired into a command workflow (the kit commands call their agents automatically).

The install path for any ClaudeKit agent is: ck auth <key> then ck install <kit>. Agents land in ~/.claude/agents/ globally (or use --local for project scope). You can inspect any agent file directly, fork it, and customize it without touching anything else in the kit. The token ledger prints on every install; recount anytime with ck tokens <kit>.

How does model tiering work across subagents?

One underused capability: subagents do not have to run the same model as the main session. ClaudeKit's agents use two tiers:

- claude-haiku-4-5 for mechanical read-only scans (check heading structure, count token usage, flag missing schema fields). Fast, cheap, runs in under 3 seconds on typical inputs.
- claude-sonnet-4-6 for judgment calls (does this diff introduce a correctness bug? does this copy match the brand voice?).

With Opus 4.8's May 2026 release at $5/$25 per million tokens and a fast mode roughly 3x cheaper, there is now a third tier available for research-heavy agents — citation analysis, competitor benchmarking — where depth matters and latency does not.

The model field in the agent frontmatter is a one-line change. We recommend starting with sonnet everywhere, then downtiering to haiku for any agent where you can verify the output quality holds. Most mechanical audits pass the quality bar on haiku; most copy-scoring tasks do not.
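The downtiering rule in that recommendation can be written as a small decision function. This is a sketch: the two model names are the tiers quoted above, while `pick_model`, the task categories, and the `haiku_quality_verified` flag are hypothetical.

```python
# Model names are the two tiers quoted above; the task categories are illustrative.
MECHANICAL_TIER = "claude-haiku-4-5"
JUDGMENT_TIER = "claude-sonnet-4-6"


def pick_model(task_kind: str, haiku_quality_verified: bool = False) -> str:
    """Start on the judgment tier; downtier only when quality has been verified."""
    if task_kind == "mechanical" and haiku_quality_verified:
        return MECHANICAL_TIER
    return JUDGMENT_TIER


print(pick_model("mechanical"))                               # claude-sonnet-4-6
print(pick_model("mechanical", haiku_quality_verified=True))  # claude-haiku-4-5
print(pick_model("judgment", haiku_quality_verified=True))    # claude-sonnet-4-6
```

The default is deliberately the more capable tier, so an agent only moves to the cheaper model after someone has checked its output there.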

What are the token costs for ClaudeKit's agent setup?

We measured total token footprint for all 5 kits at install time:

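| Kit | Commands | Skills | Agents | Measured tokens |
| --- | --- | --- | --- | --- |
| EngineerKit | 25 | 4 | 4 | 20,413 |
| MarketingKit | 20 | 3 | 2 | 16,714 |
| SEOKit | 19 | 4 | 2 | 16,004 |
| EcomKit | 20 | 3 | 2 | 16,464 |
| VideoKit | 17 | 5 | 3 | 12,602 |
| Total (all 5) | 101 | 19 | 13 | 82,197 |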

These are CLAUDE.md-loaded tokens, not per-run costs. The agents themselves add token overhead only when invoked. A read-only reviewer on a typical diff costs 800-2,400 tokens depending on diff size. The context isolation is why that cost stays low — the agent does not carry your session history.

Full methodology for how we count is in measuring Claude Code context token costs.
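As a quick cross-check, the totals row can be recomputed from the per-kit rows. The numbers below are copied from the table in this section; the per-kit share calculation is an added illustration.

```python
# (commands, skills, agents, measured tokens) copied from the table in this section.
kits = {
    "EngineerKit": (25, 4, 4, 20_413),
    "MarketingKit": (20, 3, 2, 16_714),
    "SEOKit": (19, 4, 2, 16_004),
    "EcomKit": (20, 3, 2, 16_464),
    "VideoKit": (17, 5, 3, 12_602),
}

# Sum each column across all five kits.
totals = tuple(sum(column) for column in zip(*kits.values()))
print(totals)  # (101, 19, 13, 82197)

# Share of the all-kits token footprint attributable to each kit.
for name, row in kits.items():
    print(f"{name}: {row[3] / totals[3]:.1%}")
```

The recomputed totals match the table's final row.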

FAQ

Are Claude Code subagents the same as Claude's Dynamic Workflows?

No. Claude Code subagents are agents you define in .claude/agents/ that the main session can delegate to — available in any Claude Code session today. Dynamic Workflows is a separate research preview from Anthropic (shipped May 2026) that supports up to 1,000 concurrent subagents for large-scale agentic research tasks. For daily development and content workflows, the agents/ system is the right tool. Dynamic Workflows targets research pipelines and data tasks that genuinely need massive parallelism.

Can a subagent call other subagents?

In principle, yes: an agent running in its own context can delegate further. In practice, chains of agents compound briefing overhead and make debugging difficult. ClaudeKit's v2 design avoids chains entirely. Every agent is a leaf node: it receives input, reads files, and returns a verdict. It never delegates further. If you find yourself wanting agent chains, that is usually a signal to restructure the work as a multi-step slash command instead.

Do subagents share memory with the main session?

No. A subagent's context starts fresh every invocation. It does not see your conversation history, your previous commands, or any files the main session has read unless you explicitly pass that content to it. This is the isolation property, and it is why the description field matters so much. The main session decides whether to delegate based only on the agent's description and the current task. Get that description wrong and the agent never gets called, or gets called at the wrong moment.

What is the difference between a subagent and a skill in ClaudeKit?

A skill is passive knowledge loaded into the main session's context — it informs the model's behavior without running as a separate agent. A subagent is an active worker with its own isolated context. Skills are better for domain knowledge you always want available (e.g., your brand voice guide, your codebase conventions). Agents are better for tasks that involve heavy reading or produce a judgment the main session should not be biased toward. Most ClaudeKit commands use both: a skill provides the domain context, and an optional agent provides the independent review.

How do I debug a subagent that is not being called?

Three things to check in order: (1) the description field — is it specific enough about what should trigger this agent? (2) the tools field — does the agent have the tools it needs to complete the task? (3) whether you are running in a session that has agents enabled (check ck doctor for a full diagnostics readout). The most common issue is a vague description. Compare your description to the example above: it names the trigger condition, the scope of work, and what the agent explicitly refuses to do.

Is there a limit on how many subagents a project can define?

Claude Code imposes no hard limit. Practical limits come from the CLAUDE.md context budget, because each agent's definition loads a small amount of metadata into the main session. ClaudeKit's 13-agent bench across all 5 kits adds roughly 6,000-8,000 tokens of agent metadata to the context (rough estimate; actual footprint depends on description length). For most projects, 5-10 focused agents with tight descriptions is the right number. More agents with vague descriptions create delegation confusion, not capability.

If you want a working bench of read-only specialist agents without building from scratch, EngineerKit ships 4 immediately useful agents wired into a daily-driver command set of 25 workflows. It installs in under two minutes — ck auth <key> then ck install eng — and the agents start showing up in your Claude Code sessions immediately. See pricing for single-kit and All-Access options, or check the getting-started guide for a full walkthrough of your first install.
