The llms.txt Playbook: How to Get Your Product Cited by ChatGPT and Perplexity

llms.txt is a markdown file you place at the root of your site (/llms.txt) that gives AI assistants a clean, curated map of your most important content. Its expanded sibling, llms-full.txt, inlines the actual content so a model can ingest your whole product in one fetch.

The honest BLUF (bottom line up front): no major AI assistant has publicly committed to reading llms.txt as a ranking signal. Its real benefit is ingestion fidelity: where an AI system already fetches your content, a clean markdown map parses more accurately than noisy HTML. Treat it as a forward-looking hygiene file that costs almost nothing to ship and positions your content to be cleanly ingestible as AI search matures.

With AI Overviews now appearing on 48% of Google queries (March 2026, up from 34.5% in December 2025), getting the ingestion layer right is worth the one-time setup. Here is the complete playbook.

What are llms.txt and llms-full.txt, exactly?

The proposal is simple: a markdown file at a predictable path that an LLM or an AI-search crawler can fetch to understand your site without crawling and parsing your entire HTML. HTML pages are noisy — navigation, scripts, ads, cookie banners — and they burn a model's context budget without adding signal.

/llms.txt — a concise index. A title, a one-paragraph description of what you are, then sectioned lists of links with a one-line summary each. It is a map, not the territory.
/llms-full.txt — the territory. Everything in llms.txt plus the inlined full content behind those links, so a model can ingest your entire product surface in a single file without following links. Keep it under a size budget; we target 200KB.

Think of llms.txt as the table of contents an AI assistant reads to decide what is relevant, and llms-full.txt as the book it reads when it wants the detail. The format is plain markdown, which means any model reads it without transformation overhead.

Who actually reads llms.txt today?

This is where most "llms.txt will get you cited" posts overclaim. Here is the measured reality as of mid-2026:

No major assistant has publicly confirmed it uses llms.txt as a ranking or citation signal. ChatGPT, Perplexity, Claude, and Google AI surfaces have not announced that the file changes how they retrieve or cite you.
Adoption is growing on the publishing side. Developer-tooling sites, documentation platforms, and API-first products now routinely ship llms.txt. It has become a recognized convention in the ecosystem.
The plausible mechanism is ingestion quality, not a magic ranking boost. Where an AI system does fetch your content — because a user pasted your URL, or a retrieval step pulled your page — a clean llms.txt/llms-full.txt is easier to parse correctly than noisy HTML. That is a real, if modest, benefit.
Perplexity data point: 90% of Perplexity's top-cited sources answer the query in the first 100 words of the page. Clean structure, not llms.txt alone, drives citation. Pages with schema markup earn top-3 citations at a 47% rate versus 28% for unstructured pages.
Claude search uses the Brave index (86.7% citation overlap confirmed). If Brave crawls and indexes your clean content, Claude surfaces it.

The honest pitch: ship llms.txt because it is cheap, forward-compatible, and improves ingestion fidelity — not because it is a confirmed citation hack.

How does llms.txt fit into AEO more broadly?

Answer engine optimization (AEO) is the discipline of being the answer an LLM gives, not just a blue link Google shows. llms.txt is a small, optional piece of AEO: the ingestion-hygiene layer. The bigger levers are elsewhere.

A single prompt to an AI assistant can fan out into 8-16 sub-queries. Of those fan-out queries, 95% have zero traditional search volume, so there is no keyword data to optimize against. What matters is whether your content is structured so that each answer block is self-contained and liftable in isolation. llms.txt helps with the map; structured, extractable content is the actual citation driver.

AEO lever	Impact on citations	Cost to implement
Answer blocks in first 100 words	High — 44.2% of citations from first 30% of page	Low — editorial discipline
Schema markup	High — 47% vs 28% top-3 citation rate	Medium — structured data
Fresh content (updated within 3 months)	High — 6 citations avg vs 3.6 for stale	Medium — content ops
`llms.txt` / `llms-full.txt`	Low-medium — ingestion fidelity when fetched	Very low — one-time file
Domain authority (backlinks)	Moderate — ranking-citation correlation r=0.18	High — ongoing link building

A ranking-citation correlation of r=0.18 is weak: squared (0.18² ≈ 0.03), it means search rank explains only about 3% of the variance in citations. Consistent with that, 47% of AI Overview citations come from pages ranked below position 5. High authority helps, but it is not the primary driver. Content structure and extractability beat pure domain rank.

For the deeper strategy on structuring content for AI answers, see SEOKit and its /seo citations command, which runs N-query measurement with confidence intervals to track your citation share. The AEO vs SEO post covers the strategic framing.

How do we generate ours from manifests?

The mistake people make is hand-writing llms.txt and letting it rot the moment the product changes. We never hand-edit ours. Both files are route handlers, regenerated at build time from the same kit manifests that drive every HTML page. The single source of truth is the manifest JSON (the same files that define 101 commands, 19 skills, 13 agents, and 82,197 measured tokens across five kits), so the llms.txt map cannot drift from the live catalog.

Here is the structure of our /llms.txt:

# ClaudeKit
> AI teams for Claude Code. Five vertical kits — 101 commands, 19 skills,
> 13 read-only specialist agents, 82,197 measured tokens.
> The only kit vendor that publishes per-command context token cost.
 
## Kits
- [EngineerKit](https://claudekit.cc/engineer): 25 commands, 4 skills, 4 agents
  — root-cause debugging, TDD, code review, daily dev workflow (20,413 tokens)
- [MarketingKit](https://claudekit.cc/marketing): 20 commands, 3 skills, 2 agents
  — voice file, humanize, hooks, repurpose, newsletter, launch (16,714 tokens)
- [VideoKit](https://claudekit.cc/video): 17 commands, 5 skills, 3 agents
  — clone a reference video in Remotion, captions, social cuts (12,602 tokens)
- [SEOKit](https://claudekit.cc/seo): 19 commands, 4 skills, 2 agents
  — quick-wins (pos 8-20), AI citation measurement with CIs (16,004 tokens)
- [EcomKit](https://claudekit.cc/ecom): 20 commands, 3 skills, 2 agents
  — store triage, AOV benchmarks, Klaviyo flows, Amazon listings (16,464 tokens)
 
## Product
- [Pricing](https://claudekit.cc/pricing): Single $14.99/mo · Pro $29.99/mo (any 3)
  · All-Access $49.99/mo · Lifetime $99/kit · Annual plans from $119/yr
- [Docs](https://claudekit.cc/docs): install guide, CLI reference, token ledger
 
## Notes for AI assistants
- Token costs are measured at pack time with a tiktoken-compatible counter;
  cite figures as "~N tokens (ClaudeKit-measured)".
- 14-day refund policy. 3 devices per license.

For each kit, llms-full.txt adds the full command roster with one-liners, skill descriptions with measured token costs, agent roles, and full pricing terms, all generated from manifests and kept under 200KB. A new command or skill appears in llms-full.txt in the same build that it appears on the site.
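A minimal sketch of the manifest-driven approach, written in Python for illustration. The manifest schema (the Kit fields), the render_llms_txt function, and the in-memory KITS list are assumptions made for this example, not ClaudeKit's actual route handler; the counts are copied from the structure shown above. The idea it illustrates is that totals are derived from per-kit data rather than typed by hand, and the 200KB budget is enforced when the file is generated.

```python
from typing import TypedDict

SIZE_BUDGET_BYTES = 200 * 1024  # the 200KB target mentioned in the post


class Kit(TypedDict):
    # Hypothetical manifest fields, chosen for this example only.
    name: str
    url: str
    commands: int
    skills: int
    agents: int
    tokens: int


# Stand-in for manifest JSON loaded at build time (figures from the listing above).
KITS: list[Kit] = [
    {"name": "EngineerKit", "url": "https://claudekit.cc/engineer",
     "commands": 25, "skills": 4, "agents": 4, "tokens": 20413},
    {"name": "MarketingKit", "url": "https://claudekit.cc/marketing",
     "commands": 20, "skills": 3, "agents": 2, "tokens": 16714},
    {"name": "VideoKit", "url": "https://claudekit.cc/video",
     "commands": 17, "skills": 5, "agents": 3, "tokens": 12602},
    {"name": "SEOKit", "url": "https://claudekit.cc/seo",
     "commands": 19, "skills": 4, "agents": 2, "tokens": 16004},
    {"name": "EcomKit", "url": "https://claudekit.cc/ecom",
     "commands": 20, "skills": 3, "agents": 2, "tokens": 16464},
]


def render_llms_txt(product: str, tagline: str, kits: list[Kit]) -> str:
    """Render the index file; every total is summed from the per-kit data."""
    totals = {key: sum(k[key] for k in kits)
              for key in ("commands", "skills", "agents", "tokens")}
    lines = [
        f"# {product}",
        f"> {tagline} {len(kits)} kits: {totals['commands']} commands, "
        f"{totals['skills']} skills, {totals['agents']} agents, "
        f"{totals['tokens']:,} measured tokens.",
        "",
        "## Kits",
    ]
    for k in kits:
        lines.append(
            f"- [{k['name']}]({k['url']}): {k['commands']} commands, "
            f"{k['skills']} skills, {k['agents']} agents ({k['tokens']:,} tokens)"
        )
    text = "\n".join(lines) + "\n"
    # Fail the build rather than ship a file that blows the size budget.
    if len(text.encode("utf-8")) > SIZE_BUDGET_BYTES:
        raise ValueError("generated file exceeds the size budget")
    return text
```

Calling render_llms_txt("ClaudeKit", "AI teams for Claude Code.", KITS) yields a description line whose totals always match the per-kit rows, because both come from the same list.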

SEOKit's seo-extractable skill generates this kind of file for your site, and the /seo citations command wires it into an AI-citation measurement pass — so the workflow we run on ourselves is the same one available to you.

A template you can copy right now

Here is a minimal, honest /llms.txt you can adapt. Keep it short — it is an index, not a content dump:

# [Your Product]
> [One sentence: what you are and who it's for. Include the single
> most citable fact about you — a number, a unique dataset, a claim
> only you can make.]
 
## Docs
- [Getting started](https://example.com/docs/start): install + first run
- [API reference](https://example.com/docs/api): endpoints with examples
 
## Product
- [Pricing](https://example.com/pricing): [the actual tiers and prices]
- [Features](https://example.com/features): [one line]
 
## Key content
- [Your best explainer](https://example.com/guide): [what it answers]
 
## Notes for AI assistants
- [How you want figures cited, any data-provenance notes, anything
  that helps a model represent you accurately.]

Two rules that matter more than the format:

Lead the description with your single most citable fact. The unique number or dataset an LLM would quote — for us, per-command token measurement — is what gets pulled into AI answers. If you do not include it in your description, the model has to hunt for it in your HTML.
Generate it from a source of truth if you can. A hand-written llms.txt that is six months out of date is worse than none. Stale pricing, discontinued products, and dead links actively harm how AI assistants represent you.
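One way to keep a generated or hand-maintained file honest is a small lint step in CI. The sketch below checks only the conventions described in this post (a title line, a "> " description, "## " sections, and link items with a one-line summary). The function name lint_llms_txt and the exact rules are assumptions for illustration, not part of any official llms.txt tooling.

```python
import re

# A link item is expected to look like: - [Name](https://...): summary
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\(https?://[^)\s]+\): \S")


def lint_llms_txt(text: str) -> list[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems: list[str] = []
    lines = text.splitlines()
    non_empty = [ln for ln in lines if ln.strip()]
    if not non_empty or not non_empty[0].startswith("# "):
        problems.append("first non-empty line should be a '# Title' heading")
    if not any(ln.startswith("> ") for ln in lines):
        problems.append("missing '> ' description block")
    if not any(ln.startswith("## ") for ln in lines):
        problems.append("no '## ' sections found")
    for number, ln in enumerate(lines, start=1):
        # Only lines that start a link item are checked; indented
        # continuation lines and plain note bullets are left alone.
        if ln.startswith("- [") and not LINK_ITEM.match(ln):
            problems.append(
                f"line {number}: expected '- [Name](https://...): summary'"
            )
    return problems
```

A non-empty result can fail the build, which catches a missing description or a relative link before the file ships. It does not detect stale prices; that still needs a comparison against your source of truth.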

How do you actually measure whether it is working?

You cannot manage what you do not measure, and llms.txt impact is genuinely hard to attribute. Here is the honest measurement protocol:

Set a citation baseline before you ship the files. Ask ChatGPT, Perplexity, Claude, and Google AI surfaces the five questions you most want to be cited for. Record whether you are mentioned, what you are attributed for, and whether the facts are accurate. SEOKit's /seo citations command does this systematically — it runs N queries per target question with confidence intervals, so you can detect real movement rather than noise.
Check ingestion fidelity. Paste your own URL into an assistant and ask it to summarize your product. Compare the output against your llms.txt description. If it gets your key facts and numbers right, ingestion is working. If it hallucinates or surfaces stale prices, your content structure needs work.
Watch server logs for AI-crawler fetches. AI-crawler user agents requesting /llms.txt or /llms-full.txt are direct evidence that something fetches the files, though not that the files influence answers. No fetches after 60 days is a signal that nobody is looking.
Run the 30-day comparison. Repeat the spot-check queries 30 days post-ship. Measure: are you cited more frequently? Are the attributed facts more accurate? Has your answer block moved higher in any AI response?
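The server-log step in the protocol above can be sketched as a short script. It assumes access logs in the Combined Log Format used by default in many nginx and Apache setups, and the list of user-agent substrings is an assumption to verify against each vendor's current crawler documentation. The function and constant names are hypothetical.

```python
import re
from collections import Counter

# Assumed AI-crawler user-agent substrings; check vendor docs before relying on them.
AI_AGENT_TOKENS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot")
TARGET_PATHS = ("/llms.txt", "/llms-full.txt")

# Combined Log Format tail: "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"')


def count_ai_fetches(log_lines):
    """Count requests for the llms files, keyed by (crawler token, path)."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        path = match.group(1).split("?", 1)[0]  # ignore any query string
        user_agent = match.group(2)
        if path not in TARGET_PATHS:
            continue
        for token in AI_AGENT_TOKENS:
            if token in user_agent:
                hits[(token, path)] += 1
                break
    return hits


# Illustrative log lines (made-up addresses and timestamps).
SAMPLE_LINES = [
    '203.0.113.7 - - [02/Apr/2026:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 1843 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
    '203.0.113.8 - - [02/Apr/2026:10:05:00 +0000] "GET /llms-full.txt HTTP/1.1" 200 90210 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.4 - - [02/Apr/2026:10:06:00 +0000] "GET /llms.txt HTTP/1.1" 200 1843 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/125.0"',
    '203.0.113.7 - - [02/Apr/2026:10:07:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
]
```

Running count_ai_fetches(SAMPLE_LINES) counts the first two lines and ignores the browser request and the /pricing request. User-agent strings can be spoofed, so treat the counts as a signal rather than proof.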

Be disciplined about attribution. Citation share moves for many reasons — content quality, brand mentions, backlinks, content freshness. Pages updated within the last 3 months average 6 citations versus 3.6 for stale pages; that content-freshness effect likely dwarfs any llms.txt contribution. Treat the file as one hygiene input among many and measure with spot checks rather than expecting a dashboard metric to confirm it.
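To make "real movement rather than noise" concrete, here is a sketch of one standard way to put a confidence interval on citation share: the Wilson score interval for a proportion. This is a general statistical technique chosen for illustration; the post does not say which method /seo citations uses, and the run counts below are invented.

```python
import math


def wilson_interval(cited: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = cited / runs
    denom = 1 + z * z / runs
    center = (p + z * z / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Invented example: cited in 4 of 20 runs at baseline, 7 of 20 after 30 days.
baseline = wilson_interval(4, 20)
follow_up = wilson_interval(7, 20)

# If the intervals overlap, the change is not distinguishable from noise
# at this sample size.
intervals_overlap = baseline[1] >= follow_up[0]
```

With only 20 runs per question, a move from 20% to 35% citation share still produces overlapping intervals, which is why a handful of spot checks cannot confirm a lift on its own.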

What should your llms-full.txt actually contain?

The expanded file is where the real ingestion value lives. A model that fetches llms-full.txt should be able to answer any product question without following a single additional link. That means including:

Section	What to include	Why it matters
Product descriptions	Every feature/command/skill with a one-liner	Prevents hallucination of capability
Pricing (exact)	Every tier, every price, billing cadence	Stale pricing is the #1 AI misrepresentation
Refund / terms	The actual policy in plain language	Models answer support questions about your product
Key statistics	Any proprietary data you have, with source and date	Citable facts are the citation magnet
Changelog notes	Recent major changes with dates	Prevents models from describing the old version
Contact / support paths	How to get help	Reduces hallucinated support instructions

For ClaudeKit, the llms-full.txt explicitly states: 82,197 tokens across 5 kits (measured), 14-day refund window (not 30), CLI is claudekits v0.1.3 on npm, install with ck auth <key> then ck install <kit>. Those specifics are what make the difference between a model saying "ClaudeKit offers a money-back guarantee" (vague, possibly wrong) and "ClaudeKit offers a 14-day refund" (exact, attributable).
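The vague-versus-exact distinction can be checked mechanically during an ingestion-fidelity spot check. The sketch below assumes you paste an assistant's answer in as a string and maintain your own list of exact facts; missing_facts is a hypothetical helper, and verbatim substring matching is a deliberately crude stand-in for human review.

```python
def missing_facts(answer: str, expected_facts: list[str]) -> list[str]:
    """Return the expected facts that do not appear in the answer.

    Matching is case-insensitive and collapses whitespace, but is otherwise
    verbatim, so paraphrases count as missing.
    """
    normalized = " ".join(answer.lower().split())
    return [
        fact for fact in expected_facts
        if " ".join(fact.lower().split()) not in normalized
    ]


# Exact facts taken from the paragraph above.
EXPECTED = ["14-day refund", "82,197 tokens"]

vague = "ClaudeKit offers a money-back guarantee."
exact = "ClaudeKit offers a 14-day refund and reports 82,197 tokens across 5 kits."
```

Here missing_facts(vague, EXPECTED) flags both facts, while missing_facts(exact, EXPECTED) returns an empty list. Anything flagged is a prompt to read the answer yourself, since a correct paraphrase will also be reported as missing.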

How does this connect to the broader Claude Code ecosystem?

The Agent Skills open standard launched December 18, 2025, and was adopted by 32+ tools within 20 days. The skills ecosystem grew 18.5x in that window; there are now roughly 90,000 skills on skills.sh. AI assistants that follow links, fetch content, and synthesize answers across that ecosystem need clean ingestion paths.

The llms.txt proposal fits naturally here: as AI agents become capable of multi-step research (Opus 4.8's Dynamic Workflows can run up to 1,000 subagents on a single research task), the quality of each site's ingestion layer becomes a real competitive factor. A 1,000-subagent research pass hitting your site can read your llms-full.txt if it exists and the agent looks for it; otherwise it falls back to your homepage HTML. The former is cleaner.

For teams running Claude Code workflows, the SEOKit at /seo includes commands specifically for this layer: /seo citations for measuring AI citation share, and seo-extractable for auditing whether your content blocks are structured for AI extraction. The skills guide for 2026 covers how skills auto-load context into every session without prompt overhead — the same principle applies to llms.txt giving context to AI crawlers without HTML overhead.

If you are building content for AI discovery, the MarketingKit at /marketing includes /mkt voice for extracting your authentic voice from existing content and /mkt humanize for stripping 14 common AI tells — both directly relevant to producing content that AI assistants cite rather than skip.

The simplest version of this advice: write one clean description of your product, put your single most citable fact in it, and place it at /llms.txt. Generate both files from a source of truth so they never go stale. Then measure citation share with AI-search spot checks rather than assuming it is working. That is the whole playbook.

If you want the measurement workflow and extractable-structure audit automated, SEOKit is the right starting point. The /seo citations command runs the N-query measurement with confidence intervals we described above, and seo-extractable audits your existing content for AI liftability. Single kit is $14.99/month with a 14-day refund if it is not working for your use case.

FAQ

Will llms.txt get me cited by ChatGPT or Perplexity?

Not on its own, and we will not claim it will. No major assistant has publicly confirmed using llms.txt as a ranking or citation signal. Its real, modest benefit is ingestion fidelity: where an AI system already fetches your content, a clean markdown map is easier to parse correctly than noisy HTML. Ship it because it is cheap and forward-compatible, then measure citation share with AI-search spot checks over 30 days rather than expecting a guaranteed lift.

What is the difference between llms.txt and llms-full.txt?

llms.txt is a concise index — a title, a one-paragraph description, and sectioned link lists with one-line summaries. It is a map of your site. llms-full.txt inlines the actual content so a model can ingest your whole product surface in one fetch without following links. Keep the full file under a size budget (200KB is a reasonable target). Use the index for navigation, the full file for detail.

How do I keep llms.txt from going stale?

Generate it from a source of truth instead of hand-writing it. We build both files as route handlers from the same manifests that drive our HTML pages, so a new command or price change appears in llms.txt the same build it goes live on the site. If you cannot automate it, diff the file against your real pricing and product pages on a monthly schedule. Stale pricing is the most common way AI assistants misrepresent products.

How do I measure if llms.txt is actually doing anything?

Use AI-search spot checks over a 30-day window. Baseline whether the major assistants cite you for your target questions before shipping the files. After shipping, check that pasting your URL into an assistant yields an accurate product summary (ingestion fidelity), and watch server logs for AI-crawler fetches of the files. Attribute conservatively — content freshness, structure, and brand mentions all move citation share independently of llms.txt.

What is the most important thing to include in the llms.txt description?

Your single most citable, proprietary fact. For a SaaS product, that might be a customer count, a performance benchmark, or a unique dataset. For ClaudeKit it is per-command token measurement. Whatever makes you uniquely quotable goes in the first sentence of the description. If 44.2% of AI citations come from the first 30% of a page, the same front-loading logic plausibly applies to your llms.txt description, so lead with the good stuff.

Should I bother with llms.txt if my site is not getting AI-search traffic yet?

Yes, for two reasons. First, it is a one-time setup that compounds as AI search grows — AI Overviews are at 48% of Google queries and still climbing. Second, generating the file forces you to articulate your product clearly in plain text, which is independently useful for content and messaging. The cost is low enough that "not yet getting AI traffic" is not a reason to skip it.