What is llms.txt and How Does It Work? | Cited

Quick Answers

llms.txt is a proposed web standard: a plain-text file placed at your domain root (/llms.txt) that gives AI language models structured information about your site's content, purpose, and key pages. Think of it as the AI-era counterpart to robots.txt, but for providing context rather than controlling access. Proposed by developer Jeremy Howard in 2024, the file was found on approximately 10.13% of domains in a study of nearly 300,000 domains (SERanking, 2026). However, AI crawlers including GPTBot, ClaudeBot, and PerplexityBot rarely fetch the file in practice; they overwhelmingly crawl HTML directly. No major LLM provider has publicly committed to using llms.txt as a citation signal. It is worth implementing (it takes minutes) but should not be treated as a primary generative engine optimisation (GEO) strategy.

You may have heard that adding an llms.txt file to your website helps AI tools understand your content. This article explains what llms.txt actually is, how it works in practice, who's already using it — and whether it's genuinely worth the effort for your site.

What is llms.txt?

llms.txt is a proposed open web standard that allows website owners to place a plain-text file at their domain root — accessible at /llms.txt — containing structured information about their site's content, purpose, and key pages. The format is designed for AI language models rather than traditional search crawlers. Where robots.txt controls which parts of a website automated systems can access, llms.txt provides contextual information to help AI tools understand what a site is about and which pages matter most.

The concept draws on the same logic that made robots.txt and XML sitemaps valuable: giving automated systems a reliable, low-cost way to interpret a website without having to crawl every page. In theory, an AI tool that reads your llms.txt can form a more accurate understanding of your business, your content, and how it should be represented in generated answers.

Where Did llms.txt Come From?

llms.txt was proposed in September 2024 by Jeremy Howard, the developer and AI researcher known for co-founding fast.ai. Howard published a draft specification outlining the format and intended use case, and the proposal quickly attracted attention from the web and AI communities. The standard is not affiliated with any single AI company and has not been formally adopted by any standards body — it remains a community-driven proposal rather than an official requirement.

What Goes in an llms.txt File?

The format is intentionally simple — plain markdown that any developer or content manager can write without specialist tools. A standard llms.txt file typically includes a brief description of the organisation, the primary topics the site covers, links to the most important pages or documentation sections, and any guidance about how content should be attributed or used.

More advanced implementations may include a companion llms-full.txt file containing the complete text of the site's most important pages, pre-formatted for LLM consumption. This allows AI tools to ingest core content without needing to crawl individual URLs. Neither file requires validation against a schema — the format is flexible by design.
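
The basic llms.txt layout described above can be sketched in Python. This is a minimal illustration, assuming the structure set out in the proposal (an H1 title, a blockquote summary, then H2 sections containing markdown link lists). The function name render_llms_txt, the "Example Co" details, and the example.com URLs are all hypothetical.

```python
def render_llms_txt(name, summary, sections):
    """Render an llms.txt body as markdown.

    `sections` maps a section heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site details, for illustration only.
example = render_llms_txt(
    name="Example Co",
    summary="Example Co publishes guides on home network security.",
    sections={
        "Key pages": [
            ("Getting started", "https://example.com/start", "Overview for new readers"),
            ("Pricing", "https://example.com/pricing", "Plans and costs"),
        ],
    },
)
print(example)
```

The printed text could be saved as llms.txt and uploaded to the domain root. Because the format is not schema-validated, additional sections can be added under their own headings.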

How Widely Has llms.txt Been Adopted?

Adoption has grown steadily since the proposal was published. A comprehensive analysis of nearly 300,000 domains by SERanking found that approximately 10.13% had an llms.txt file in place at the time of measurement. Adoption rates were remarkably consistent across website traffic tiers: 9.88% for low-traffic sites, 10.54% for mid-traffic sites, and 8.27% for high-traffic sites, suggesting the format appeals broadly rather than skewing towards any particular scale of website.

Early adopters were concentrated in technically sophisticated sectors — cybersecurity, developer-facing SaaS, and blockchain — where teams were already comfortable working at the intersection of web infrastructure and AI. By early 2026, adoption had expanded into mainstream publishing and broader SaaS categories. The growth in llms.txt adoption runs alongside a dramatic increase in AI crawler traffic overall: AI-related bot activity increased by over 300% between January 2025 and March 2026 (Digital Applied, 2026).

Does llms.txt Actually Improve AI Citations?

The honest answer, based on current evidence, is: probably not yet. The SERanking study of 300,000 domains found that having an llms.txt file does not measurably improve AI citations. More significantly, analysis of actual AI crawler behaviour found that GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, and Google-Extended overwhelmingly skip the file and crawl HTML pages directly. The file is present on those sites — the crawlers simply are not reading it.
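
Site owners can check this against their own server logs. The sketch below is one possible approach, assuming access logs in the common "combined" format; it counts how often each AI crawler requested /llms.txt compared with any other path. The user-agent substrings come from the crawler names above, minus Google-Extended, which is left out on the assumption that it does not appear as a distinct user-agent string in logs. The sample log lines are invented for illustration.

```python
import re
from collections import Counter

# Assumed user-agent substrings; adjust to the crawlers you care about.
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot")

# Matches the request and user-agent fields of a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_bot_requests(log_lines):
    """Return two Counters: per-bot hits on /llms.txt, and per-bot hits elsewhere."""
    llms_hits, other_hits = Counter(), Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        bot = next((b for b in AI_BOTS if b in match["agent"]), None)
        if bot is None:
            continue  # not one of the crawlers being tracked
        path = match["path"].split("?", 1)[0]  # ignore any query string
        if path == "/llms.txt":
            llms_hits[bot] += 1
        else:
            other_hits[bot] += 1
    return llms_hits, other_hits


# Invented sample lines, for illustration only.
sample = [
    '203.0.113.5 - - [10/Mar/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [10/Mar/2026:10:00:02 +0000] "GET /llms.txt HTTP/1.1" 200 812 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.7 - - [10/Mar/2026:10:00:03 +0000] "GET /blog HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]
llms_hits, other_hits = count_bot_requests(sample)
print(dict(llms_hits), dict(other_hits))
```

To run it on real data, pass an open log file instead of the sample list, since the function accepts any iterable of lines.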

None of the major LLM providers (OpenAI, Anthropic, Google, Meta, or Mistral) has publicly committed to using llms.txt as a signal in their production search or answer surfaces. Until that changes, llms.txt cannot reliably be expected to improve how AI tools represent your website. This does not make it useless: it is low-effort to implement, it signals technical awareness, and early implementers stand to benefit if major providers do adopt the standard. It should, however, be kept in perspective relative to higher-impact GEO activities.

How Do I Create an llms.txt File?

Creating an llms.txt file is straightforward and takes less than 30 minutes for most websites.

1. Create a plain-text file named llms.txt, written in markdown, and place it at the root of your domain (e.g. https://yoursite.com/llms.txt).
2. Write a brief description of your site and organisation: two to three sentences in plain language.
3. List the URLs of your five to ten most important pages, with a one-line description of each.
4. Optionally note your primary topics, your organisation's name, and your preferred citation format.
5. If your site has detailed documentation or a knowledge base, consider creating a companion llms-full.txt with the full text of those pages pre-formatted in markdown.

There is no official validation requirement. Several community-built tools are available to check whether a file follows the proposed specification — a search for "llms.txt validator" will surface current options.
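
Short of a full validator, a few structural checks are easy to script. The sketch below is not one of the community tools and not an official check; it assumes, on one reading of the proposal, that only the H1 title is required and treats a missing summary or link list as a warning. The function name check_llms_txt and the sample text are hypothetical.

```python
import re

# A markdown list item that starts with a link, e.g. "- [Title](https://...)".
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\((https?://[^)\s]+)\)")


def check_llms_txt(text):
    """Return a list of findings; an empty list means the basic checks passed."""
    findings = []
    non_empty = [line.rstrip() for line in text.splitlines() if line.strip()]
    if not non_empty or not non_empty[0].startswith("# "):
        findings.append("error: first non-empty line should be an H1 title ('# Name')")
    if not any(line.startswith("> ") for line in non_empty):
        findings.append("warning: no blockquote summary ('> ...') found")
    if not any(LINK_ITEM.match(line) for line in non_empty):
        findings.append("warning: no markdown link list items found")
    return findings


# Hypothetical file content, for illustration only.
sample_text = (
    "# Example Co\n\n"
    "> Example Co publishes guides on home network security.\n\n"
    "## Key pages\n\n"
    "- [Getting started](https://example.com/start): Overview for new readers\n"
)
print(check_llms_txt(sample_text))
```

A check like this only catches formatting slips; it says nothing about whether the descriptions are accurate or the linked pages are the right ones to highlight.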

How Does llms.txt Differ from robots.txt?

robots.txt and llms.txt serve different purposes and should not be confused. robots.txt is a long-established, widely honoured standard that controls crawler access: it tells automated systems which pages they are allowed to crawl. Major search engines and AI crawlers actively read and respect robots.txt. If you want to block AI crawlers from fetching your content, robots.txt (with the relevant user-agent rules) is the correct tool.

llms.txt, by contrast, is not about access — it is about context. You are not instructing AI tools where they can or cannot go; you are providing background information to help them understand your content more accurately. The two files are complementary and can coexist on the same domain. Choose the right one based on your goal: access control belongs in robots.txt, contextual guidance belongs in llms.txt.
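
The access-control side of that split can be illustrated with Python's standard-library robots.txt parser. The rules below are a hypothetical example that blocks GPTBot from the whole site while leaving other crawlers unrestricted; the example.com URL and the "SomeOtherBot" name are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everything else.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("GPTBot", "https://example.com/pricing"))        # False
print(parser.can_fetch("SomeOtherBot", "https://example.com/pricing"))  # True
```

Nothing in llms.txt has an equivalent effect: it contains no allow or disallow rules, so listing or omitting a page there does not change what a crawler may fetch.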
