What is an llms.txt file?

An llms.txt file is a plain-text file placed at your domain root (yourdomain.com/llms.txt) that tells AI language models and crawlers how to understand and describe your brand. It typically includes your brand description, preferred citation text, key URLs, pricing information, and entity verification links. It's analogous to robots.txt for traditional web crawlers, but specifically designed for AI systems.

How is llms.txt different from robots.txt?

robots.txt tells crawlers what they can and cannot access; it is about access permissions. llms.txt tells AI systems how to understand and describe you; it is about interpretation, not access. You need both: robots.txt to ensure AI crawlers can reach your site (allow GPTBot, PerplexityBot, ClaudeBot), and llms.txt to give agents accurate, preferred information about your brand.
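One way to confirm that the three crawlers named above are not blocked is to parse your robots.txt with Python's standard-library `urllib.robotparser`. This is a minimal sketch: the sample robots.txt content, the `blocked_bots` helper, and the yourdomain.com URL are illustrative assumptions, not part of any official tooling.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the text above.
AI_BOTS = ["GPTBot", "PerplexityBot", "ClaudeBot"]

# Hypothetical robots.txt content; replace with the text of your own file.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /admin/
"""


def blocked_bots(robots_txt: str, url: str = "https://yourdomain.com/") -> list[str]:
    """Return the AI bots that this robots.txt text would block from `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


if __name__ == "__main__":
    # An empty list means none of the listed bots is blocked for the URL.
    print(blocked_bots(SAMPLE_ROBOTS_TXT))
```

The check only reflects the robots.txt rules as this parser interprets them; it does not prove that a given crawler actually visits your site.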

Does creating an llms.txt file guarantee AI engines will cite you?

No — llms.txt is an agent-readiness signal, not a citation guarantee. It helps autonomous agents and orchestration tools understand your brand. AI search citations depend on entity authority, reviews, structured data, and content quality — scored separately in Surfedo.

llms.txt: What It Is, Why It Matters, and How to Create One

⚡ TL;DR

llms.txt is a plain-text file at your domain root that documents your brand for AI agents and agent orchestration tools — Anthropic's Claude agents, the OpenAI Agents SDK, Lighthouse's Agentic Browsing audit, and the growing agentic-commerce stack. It is not used by Google AI Overviews, Google AI Mode, ChatGPT web search, or Perplexity citation ranking. For those, focus on entity authority, structured data, and content quality — the other parts of your Surfedo audit.

What is llms.txt?

llms.txt is an emerging convention for agent-readable site documentation. Hosted at yourdomain.com/llms.txt, it is a structured plain-text file that describes your brand, product, pricing, target audience, and key URLs, so autonomous agents do not have to infer that context from scattered pages.

The concept is analogous to robots.txt, but the audience is different: robots.txt speaks to crawl permissions; llms.txt speaks to interpretation for agents acting on a user's behalf (researching, booking, purchasing, comparing vendors).

Who uses llms.txt — and who does not

Yes — Anthropic Claude (agents): documented in Anthropic's agent guidance; useful for Claude-powered workflows.
Yes — OpenAI Agents SDK / ACP: used for agent discovery and orchestration (including agentic commerce integrations).
Likely — Perplexity agents/research tooling: announced support; distinct from Perplexity's live citation engine.
Yes — Lighthouse 13.3+: Agentic Browsing audit checks for llms.txt presence.
No — Google AI Overviews / AI Mode: Google confirmed it does not use llms.txt (John Mueller, May 2026 — compared it to the deprecated keywords meta tag).
No — ChatGPT web search citations: no confirmed usage for ranking or citation selection.

What about Google?

Customers often hear conflicting advice because early answer engine optimization (AEO) blogs lumped every "AI file" together. Google does not read llms.txt for AI Overviews or AI Mode. If your goal is visibility in Google-powered AI answers, invest in Organization and FAQ schema, entity consistency, reviews, and crawl access rather than llms.txt. Surfedo's site audit scores those citation signals separately from agent readiness.

What it will not do vs. what it will do

It will not: improve your rankings in Google AI Overviews, AI Mode, ChatGPT web search, or Perplexity citations. Citation visibility there is driven by entity authority, structured data, review presence, and content quality.

It will: make your site legible to AI agents acting for users (booking, purchasing, researching) and to agent orchestration platforms. As agentic commerce grows (OpenAI's Agentic Commerce Protocol, ACP, is live with 1M+ Shopify merchants), llms.txt positions you for the next wave even when it does not move today's citation rankings.

Track llms.txt alongside schema and product markup in the Agent Readiness section of your Surfedo site audit, or generate a draft with the llms.txt tool in your dashboard.

The anatomy of a strong llms.txt

A good llms.txt has six sections:

1. Brand headline (1 sentence)

Write exactly how you want an agent to describe you when summarizing your category. Be specific — not "a SaaS tool" but "the AI search visibility platform that tracks exact brand rankings on ChatGPT, Perplexity, Gemini, Claude, and Google AI Overviews."

2. What you do (bullet list)

3–6 bullets covering core capabilities. Use customer language; avoid empty adjectives.

3. Who you serve

Describe your ideal customer profile (ICP) precisely so agents routing buyer questions know when to surface you.

4. Pricing

Include actual tiers and numbers. Agents answering "how much does X cost?" should not guess from stale blog posts.

5. How you compare

Brief, factual comparisons to 3–5 alternatives. Useful when an agent is constructing a shortlist for a user.

6. Pages index

List canonical URLs (pricing, signup, docs, comparisons) with one-line descriptions.
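The six parts listed above can be checked mechanically before you publish. The sketch below is one possible approach, not an official validator: the heading names in `REQUIRED_SECTIONS` and the `missing_parts` helper are assumptions that follow this article's template, so adjust them if your file uses different headings.

```python
# Assumed "## " heading names, following this article's template.
REQUIRED_SECTIONS = [
    "What We Do",
    "Who We Serve",
    "Pricing",
    "How We Compare",
    "Key Pages",
]


def missing_parts(text: str) -> list[str]:
    """Return the recommended parts that are absent from an llms.txt body."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    missing = []

    # Brand headline: an H1 line directly followed by a non-heading sentence.
    has_headline = any(
        line.startswith("# ")
        and index + 1 < len(lines)
        and not lines[index + 1].startswith("#")
        for index, line in enumerate(lines)
    )
    if not has_headline:
        missing.append("Brand headline")

    # Remaining parts: compare "## " headings case-insensitively.
    headings = {line[3:].strip().lower() for line in lines if line.startswith("## ")}
    for name in REQUIRED_SECTIONS:
        if name.lower() not in headings:
            missing.append(name)
    return missing


if __name__ == "__main__":
    draft = "# Example Co\n\nExample Co is a hypothetical brand.\n\n## Pricing\n- Starter: placeholder\n"
    print(missing_parts(draft))
```

The check only looks for headings; it cannot judge whether the content under each heading is specific or current.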

Full template

llms.txt template — adapt to your brand

# [Brand Name]

[Brand] is [one-sentence description of what you do and who for].

## What We Do
- [Core capability 1]
- [Core capability 2]
- [Core capability 3]

## Who We Serve
- [ICP description 1]
- [ICP description 2]

## Pricing
- [Plan name]: $X/month — [what's included]
- Free trial: [yes/no, details]

## How We Compare
- vs [Competitor A]: [factual 1-sentence comparison]
- vs [Competitor B]: [factual 1-sentence comparison]

## Key Pages
- Homepage: https://yourdomain.com
- Pricing: https://yourdomain.com/pricing
- Free trial: https://yourdomain.com/signup

## Contact
Email: hello@yourdomain.com
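If your pricing and page URLs already live in structured data, you could render the file from that data rather than editing it by hand, which makes same-day pricing updates easier. The following is a minimal sketch under stated assumptions: `BrandProfile`, `render_llms_txt`, and every value in the example are hypothetical placeholders that mirror the template above.

```python
from dataclasses import dataclass


@dataclass
class BrandProfile:
    """Hypothetical container for the fields used in the template."""
    name: str
    headline: str
    capabilities: list[str]
    audiences: list[str]
    pricing: list[str]
    comparisons: list[str]
    pages: dict[str, str]  # label -> canonical URL
    email: str


def _section(title: str, items: list[str]) -> str:
    """Render one '## Title' heading followed by a bullet list."""
    return f"## {title}\n" + "\n".join(f"- {item}" for item in items)


def render_llms_txt(profile: BrandProfile) -> str:
    """Render the profile in the same section order as the template."""
    parts = [
        f"# {profile.name}",
        profile.headline,
        _section("What We Do", profile.capabilities),
        _section("Who We Serve", profile.audiences),
        _section("Pricing", profile.pricing),
        _section("How We Compare", profile.comparisons),
        _section("Key Pages", [f"{label}: {url}" for label, url in profile.pages.items()]),
        f"## Contact\nEmail: {profile.email}",
    ]
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    # Placeholder values only; none of these describe a real product.
    example = BrandProfile(
        name="Example Co",
        headline="Example Co is a hypothetical scheduling tool for small clinics.",
        capabilities=["Online booking", "Automated reminders"],
        audiences=["Independent clinics with 1 to 10 staff"],
        pricing=["Starter: $X/month, [what's included]", "Free trial: [yes/no, details]"],
        comparisons=["vs [Competitor A]: [factual 1-sentence comparison]"],
        pages={"Homepage": "https://yourdomain.com", "Pricing": "https://yourdomain.com/pricing"},
        email="hello@yourdomain.com",
    )
    print(render_llms_txt(example))
```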

What to avoid

Marketing fluff: "Industry-leading" and "best-in-class" add no signal for agents.
Outdated pricing: Update the file the same day pricing changes.
Expecting citation lifts: Do not treat llms.txt as a substitute for FAQ schema, reviews, or entity work.
Blocking crawlers: Agents still need crawl access via robots.txt — allow the bots your audit flags.
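The first item in the list above lends itself to a simple automated check. The sketch below flags marketing phrases line by line; only "industry-leading" and "best-in-class" come from the text, while the other phrases and the `find_fluff` helper are assumptions added for illustration.

```python
# The first two phrases are named in the text; the rest are assumed examples.
FLUFF_PHRASES = ["industry-leading", "best-in-class", "world-class", "cutting-edge"]


def find_fluff(text: str) -> list[tuple[int, str]]:
    """Return (line_number, phrase) pairs for each fluff phrase found."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for phrase in FLUFF_PHRASES:
            if phrase in lowered:
                hits.append((number, phrase))
    return hits


if __name__ == "__main__":
    draft = "# Example Co\nExample Co is the industry-leading tool.\n"
    for number, phrase in find_fluff(draft):
        print(f"line {number}: consider replacing '{phrase}' with a concrete fact")
```

A phrase list like this catches only the obvious cases, so a human read-through is still worthwhile.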

How to verify agents can use your file

Publish the file, confirm https://yourdomain.com/llms.txt returns plain text, and run Lighthouse's Agentic Browsing audit if you use it. For citation rankings on ChatGPT, Perplexity, Gemini, Claude, and Google AI Overviews, use Surfedo's position tracking — that measures a different signal stack. The free scan at surfedo.com/scan shows citation gaps; your site audit covers agent readiness including llms.txt.
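The "returns plain text" check can be scripted with the Python standard library. This is a rough sketch, assuming that a healthy file means HTTP 200, a `text/plain` content type, and a non-empty body that is not an HTML page; the function names and the `llms-txt-check/0.1` user-agent string are hypothetical.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def check_response(status: int, content_type: str, body: str) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "text/plain":
        problems.append(f"expected text/plain, got '{media_type or 'no content type'}'")
    if not body.strip():
        problems.append("file is empty")
    elif body.lstrip().lower().startswith(("<!doctype", "<html")):
        problems.append("body looks like HTML, not plain text")
    return problems


def check_llms_txt(base_url: str, timeout: float = 10.0) -> list[str]:
    """Fetch <base_url>/llms.txt over the network and run the basic checks."""
    url = base_url.rstrip("/") + "/llms.txt"
    request = Request(url, headers={"User-Agent": "llms-txt-check/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            content_type = response.headers.get("Content-Type", "")
            return check_response(response.status, content_type, body)
    except HTTPError as err:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        return [f"expected HTTP 200, got {err.code}"]
    except URLError as err:
        return [f"could not reach {url}: {err.reason}"]
```

Calling `check_llms_txt("https://yourdomain.com")` performs a live request, so run it after the file is published. It checks delivery only and says nothing about whether any particular agent reads the file.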

Audit agent readiness and citations separately

Free scan — sign up with your email, no card needed for the scan. See citation rankings plus technical agent-readiness checks in one dashboard.

Frequently asked questions

Is llms.txt an official standard?

It is an emerging community convention (proposed by Answer.AI in 2024), not a W3C or IETF spec. Adoption is real among agent platforms (Anthropic, OpenAI's agent tooling, Lighthouse), but support varies by product surface. Treat it as low-cost agent infrastructure, not a guaranteed citation lever.

Does llms.txt replace robots.txt?

No. robots.txt controls whether crawlers and bots may access URLs. llms.txt describes your brand for agents that already have access. You need both: permissive robots.txt for the AI bots you care about, plus llms.txt if you want agent-readable documentation.

Does Google use llms.txt?

No. Google has stated llms.txt is not used for AI Overviews or AI Mode (May 2026). Do not expect Google ranking or AI Overview lifts from adding the file. Structured data and entity signals remain the levers there.

How often should I update llms.txt?

Whenever pricing, positioning, or key URLs change, and at minimum quarterly. Stale pricing is the most common failure mode because agents and buyers treat the file as authoritative when present.