All posts
Comparisons

Claude Code Plugins vs Kits: What the Marketplace Misses

Claude Code plugins give you broad, free, composable parts. Curated vertical kits trade breadth for orchestrated workflows, a quality floor, and measured token cost.

Updated 12 min read
Claude Code Plugins vs Kits: What the Marketplace Misses

The short answer: plugins and marketplace repos give you broad, free, composable components — install one agent here, one skill there. Curated vertical kits make the opposite trade: narrower scope, pre-wired workflows, published token budgets, and read-only specialist agents that ship evidence instead of blocking on a reviewer gate. Neither is universally better. Most serious Claude Code setups end up using both — marketplace plugins for engineering work, a kit for the one business function that needs to ship reliably every time.

Disclosure: we make ClaudeKit — a paid set of vertical kits. We've tried to keep this focused on structural trade-offs and to say clearly where the free marketplace wins.


What Are Claude Code Plugins and Marketplaces, Exactly?

A plugin is a self-contained bundle: agents, slash commands, and skills auto-discovered from a directory, installable individually into Claude Code. A marketplace is a catalog of those plugins. The biggest free option — wshobson/agents, which we reviewed in detail — ships around 192 agents across 84 plugins, covering Python, security, infra, frontend, and more.

The design's core virtue is isolation. Installing one plugin loads only its components, so you can compose a toolchain from many sources without dragging an entire catalog into your context window. That's the right architecture for breadth. It scales to hundreds of agents because the plugin boundary keeps context cost proportional to what you actually install.

For an engineer assembling a coding toolchain, the free marketplace is often genuinely enough. We'll say that plainly throughout this post.

What Does the Agent Skills Open Standard Change?

In December 2025, the Agent Skills open standard launched and was adopted by 32+ tools within 20 days. The skills ecosystem grew 18.5x in that window; there are now roughly 90,000 skills on skills.sh alone. This matters for the plugins-vs-kits conversation because it has dramatically lowered the floor for composable AI tooling: more free components, more formats, more places to find them.

The abundance cuts both ways. More components mean more options — and more variance. A catalog of 90,000 skills from thousands of contributors will have a wide quality distribution. Some are polished and load-bearing. Others are thin wrappers. There is no published quality floor, no measured token budget, and no guarantee that two skills from different authors share any context convention. You are the integrator and the quality reviewer. For experienced engineers comfortable in that role, that's fine. For someone trying to ship a business workflow reliably, it's a meaningful cost.

How Does the Install Model Compare?

The mechanics differ in ways that affect daily use.

DimensionPlugins / MarketplaceClaudeKit Vertical Kit
Unit of installIndividual plugin (agents + skills)One kit (full workflow for one job)
BreadthVery wide, many domainsDeep in one vertical
Install commandVaries by repock install <kit> or /plugin marketplace add Madni-Aghadi/claudekit-<kit>
Context costProportional to plugins installedPublished on install; ck tokens <kit> to recount
OrchestrationYou wire itCommands sequence and fan out
Shared contextRarely; no common conventionBootstrap skills write durable files all commands read
Agent roleRunnable agents per pluginRead-only specialist agents (reviewer / auditor / researcher)
Quality floorNone published; you reviewCommands end with EVIDENCE (diff / report / verified file)
DiagnosticsNoneck doctor for health, ck list for entitlements
CostFreeFrom $14.99/mo — see pricing

Both install cleanly into Claude Code. The difference is what you get inside the install: loose parts versus an assembled, evidence-based workflow.

What Does a Kit Actually Give You That a Plugin Doesn't?

Four structural differences matter.

1. Pre-wired orchestration vs. DIY integration

A marketplace gives you components — individual agents that each improvise from a single persona. What it generally does not give you is an orchestrated workflow: a command that sequences multiple steps, fans them out in parallel, carries shared context across phases, and ends with a concrete deliverable you can verify.

A vertical kit is built around exactly that. Take EngineerKit: /eng debug doesn't just summon a "debug agent." It runs root-cause analysis first, maps call-site impact, proposes a fix, and emits a verified diff — the EVIDENCE the command is designed to produce. The orchestration patterns guide details the five patterns that make this work, and also shows exactly how much effort it takes to build them from scratch. That effort is what you're buying when you buy a kit.

2. Shared context vs. isolated components

In a marketplace toolchain, each plugin tends to operate in isolation. Your audit agent, your copy agent, and your email agent don't share a single source of truth about your brand, store, or codebase — and output drifts between them.

Kits are built on a bootstrap-context spine. Skills like eng-context, mkt-context, ecom-context, and seo-context write durable files once — COMPANY.md, STORE.md, VOICE.md, SITE.md — and every command reads them. So the technical spec, the changelog post, and the social repurpose all reference the same facts. You could replicate this across marketplace plugins if they happen to share a file convention. Usually they don't, because independent plugins from different authors weren't designed to read a common context file.

3. Evidence-based output vs. you-are-the-reviewer

This is the structural difference most worth understanding. Earlier versions of ClaudeKit (v1) shipped a blocking reviewer gate — a quality agent that scored deliverables 0–100 and looped until they cleared 90. We killed that architecture. It was slow, it added tokens to every run, and it was solving the wrong problem: a reviewer agent reviewing an agent's own output is not a real quality check.

The v2 design is different: every command ends with EVIDENCE. /eng verify emits a test report. /seo audit emits a ranked finding list with fix priority. /ecom no-sales emits a triage report benchmarked against AOV-band data. /mkt humanize strips 14 measurable AI tells and shows you what changed. The evidence is the quality floor — you can see it, verify it, and decide. No blocking gate, no extra latency, no agent reviewing itself. The read-only specialist agents in v2 — reviewer, auditor, researcher — are non-blocking: they inform, they don't gate.

4. Published token budgets vs. discovered costs

Marketplaces generally don't publish the token footprint of each component — you find out by installing and loading. ClaudeKit publishes the measured context cost of every kit before you commit:

  • EngineerKit: 20,413 tokens (25 commands, 4 skills, 4 agents)
  • MarketingKit: 16,714 tokens (20 commands, 3 skills, 2 agents)
  • SEOKit: 16,004 tokens (19 commands, 4 skills, 2 agents)
  • EcomKit: 16,464 tokens (20 commands, 3 skills, 2 agents)
  • VideoKit: 12,602 tokens (17 commands, 5 skills, 3 agents)

Total across all 5 kits: 82,197 measured tokens, 101 commands, 19 skills, 13 read-only agents. The token ledger prints on every ck install. Run ck tokens <kit> to recount anytime. For cost-aware teams tracking Claude API spend, this matters more than it might look.

When Is the Marketplace Genuinely Enough?

We'll be direct, because the useful version of this post tells you when not to pay us.

The marketplace alone is the right call when:

  1. You are a software engineer assembling a coding toolchain. The free repos' development, infra, and security coverage is strong, and you are a capable reviewer of your own output.
  2. Your needs are horizontal — a good agent here, a good skill there — rather than one deep business workflow.
  3. You work across multiple AI harnesses and value plugins that export to Cursor, Codex, and Gemini from one source.
  4. You enjoy (or don't mind) being your own integrator, and context isolation is sufficient cost control.
  5. Your budget is zero and your output doesn't go to external stakeholders who'll hold you to it.

A curated kit earns its price when:

  1. Your need is a business function — engineering operations, SEO, marketing, ecommerce, or video production — that you want to ship reliably end to end, not assemble from parts.
  2. You want token costs budgeted before you run, not discovered after.
  3. You'd rather buy the orchestration than build and maintain it as your workflow evolves.
  4. You want read-only specialist agents that audit and report rather than block and loop.

Often the answer is both. Marketplace plugins for engineering work, a vertical kit for the one business function you actually need to be dependable. They coexist in the same Claude Code setup without conflict.

What Are the Real v2 Kit Workflows Worth Knowing About?

Each kit has a flagship command worth naming specifically, because the marketing copy alone doesn't convey what "orchestrated workflow" actually means in practice.

EngineerKit (/eng): The daily-8 commands cover the full dev loop — catchup, plan, tdd, debug, verify, review, commit, handoff. The flagship is /eng debug, which runs root-cause analysis before surfacing symptoms. It doesn't just find the error; it maps call-site impact and proposes a minimal fix. Combined with /eng verify (which emits a test report as evidence), you get a tight debug-verify loop that most marketplace debug agents don't match for depth.

MarketingKit (/mkt): Two flagships. /mkt voice builds a voice file from your real posts — actual examples, not a style description you typed — so every subsequent command writes in your voice. /mkt humanize strips 14 measurable AI tells from a draft. Also worth noting: mkt-repurpose turns one piece of content into 5 formats, and mkt-calendar builds a content plan from your existing assets.

SEOKit (/seo): Flagship is /seo quick-wins — it targets positions 8–20 and low-CTR pages, where a small content update often yields disproportionate traffic. The other one worth knowing: /seo citations runs N-iteration AI-citation measurement with confidence intervals, which is relevant given that AI Overviews now appear on 48% of Google queries (March 2026, up from 34.5% in December 2025).

EcomKit (/ecom): Flagship is /ecom no-sales — a store triage command benchmarked against AOV-band data. It doesn't just list problems; it ranks them by revenue impact. Other workhorses: ecom-cart-recovery, ecom-amazon, ecom-margin, ecom-bfcm.

VideoKit (/video): Flagship is /video clone — it recreates a reference video's style in Remotion and verifies the match. Built for teams producing programmatic video at scale. The video-data and video-social commands handle the distribution side.

How Does Pricing Work?

ClaudeKit pricing in 2026:

  • Single kit: $14.99/month
  • Pro (any 3 kits, swap 1 per cycle): $29.99/month
  • All-Access (all 5 kits): $49.99/month, or $399/year
  • Lifetime (one-time, per kit, as shipped — no future updates): $99

Annual plans for single and Pro are $119/year and $239/year respectively. 14-day refund window. 3 devices per license. Install via ck auth <key> then ck install <kit>, or through the Claude Code plugin marketplace with /plugin marketplace add Madni-Aghadi/claudekit-<kit>. Full details at /pricing.

The CLI is claudekits v0.1.3 on npm. ck doctor diagnoses install issues. ck list shows your entitlements. One thing to note: lifetime licenses are for the kit as shipped — if you want updates, the subscription is the right choice.

What Should I Actually Do?

If you're reading this to make a decision, here is the direct version:

Start with the free marketplace. Browse wshobson/agents, install the plugins that cover your engineering work, and use Claude Code for a few weeks. Get a feel for where you're acting as integrator and reviewer and where that's fine.

Then look at whether you have one business function — SEO, marketing, ecommerce, video — that you're running manually or patching together from components. That's the gap a kit fills. Not your entire Claude Code setup; just that one vertical. The getting-started guide shows how to install alongside any existing plugins you already have.

If SEO is the function you're trying to systematize, SEOKit is the right starting point — especially given where AI search is heading in 2026. If you're running a DTC store, EcomKit is worth a look before you build your own triage workflow from scratch.


FAQ

Are Claude Code plugins free?

Yes. The major plugin marketplaces are free and open — wshobson/agents alone ships 192 agents at no cost. What's free is the components; what you supply yourself is the orchestration, shared context, and quality review. Curated vertical kits charge (from $14.99/month) for packaging those into an assembled workflow where commands end with verifiable evidence rather than requiring you to review agent output yourself.

What does a kit give me that a plugin doesn't?

Four things the marketplace model structurally tends to miss: pre-wired orchestration (commands that sequence multiple steps and end with evidence), shared context (bootstrap skills write durable files every command reads, so output stays consistent), published token budgets you can plan against, and read-only specialist agents that audit and report rather than block. A plugin gives you a competent component; a kit gives you the assembled workflow around it.

Can I use plugins and kits together?

Yes, they coexist in the same Claude Code setup without conflict. A common setup is free marketplace plugins for engineering work — where you're a capable reviewer of your own code — plus a vertical kit for the one business function you need to ship reliably to external stakeholders. They're not mutually exclusive, and no kit replaces your entire toolchain; only the vertical it's built for.

What happened to the blocking reviewer gate in v2?

We removed it. V1 shipped a quality-gate agent that scored deliverables 0–100 and looped until they cleared 90. The problem: an agent reviewing its own output is not a real quality check, and it added latency and tokens to every run. V2 takes a different approach — every command ends with EVIDENCE (a diff, a report, a verified file) that you can inspect directly. Read-only specialist agents in v2 inform and audit; they don't block.

How do I know how many tokens a kit will use before installing?

Run ck tokens <kit> after installing, or check the published numbers: EngineerKit is 20,413 tokens, MarketingKit 16,714, SEOKit 16,004, EcomKit 16,464, VideoKit 12,602. The token ledger also prints automatically on every ck install. These are measured counts (tiktoken-compatible, approximately 4 chars/token), not estimates. For context, loading all 5 kits simultaneously would cost 82,197 tokens — in practice most users load one or two.

Is the marketplace ever enough on its own?

Often, yes. If you're an engineer with horizontal needs, comfortable acting as your own integrator, and your work is primarily code rather than customer-facing deliverables, the free marketplace's breadth and context isolation are genuinely sufficient. The honest dividing line is whether you need a deep, pre-wired, end-to-end business workflow — that's where a kit's orchestration and evidence-based output earn the price. The wshobson/agents review gives an honest assessment of the biggest free option available today.

Give Claude Code a real team

Five kits, 101 commands, every token measured. Pick the team that matches your work and install it in five minutes.

See the kits

Keep reading