What Is llms.txt?

llms.txt is a proposed web standard that lets website owners place a plain-text file at the root of their domain (/llms.txt) containing structured information about the site's content and purpose, and about how they would like AI language models to handle it. It is often described as the AI-age counterpart to robots.txt, but the comparison only goes so far. robots.txt tells crawlers where they can and cannot go, whereas llms.txt is designed to give large language models (LLMs) useful context about who you are and what your content represents.

Where the Idea Came From

The llms.txt concept was proposed in 2024 by developer Jeremy Howard as a lightweight, open standard for helping AI tools understand websites more efficiently. The problem it was trying to solve is a real one: modern AI crawlers often face enormous websites with thousands of pages, and without guidance, they have no easy way to know which pages are authoritative, which are outdated, or what the site is fundamentally about.

A well-constructed llms.txt file can answer those questions in seconds, without requiring the crawler to traverse the entire site. It can include a brief description of the organisation, links to key documentation pages, a summary of the site's primary topics, and instructions about how the content should be interpreted.

What Goes in an llms.txt File?

The format is intentionally simple: plain Markdown that any developer or content manager can write without specialist tools. A basic llms.txt file might include:

  • A short description of the website and organisation.
  • The primary topics or categories the site covers.
  • Links to the most important pages or documentation sections.
  • Any guidance about how content should be attributed or used.

More advanced implementations may also include a companion llms-full.txt file containing the full text of the site's most important pages, pre-formatted for LLM consumption. This allows AI tools to ingest your core content without crawling hundreds of individual URLs.
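To make the ingredients above concrete, here is a minimal Python sketch that assembles a small llms.txt file. The layout it produces (an H1 title, a blockquote summary, then H2 sections containing Markdown link lists) reflects the general shape of the proposal as commonly described; treat it as an illustration rather than a definitive template. The organisation name, URLs, and the render_llms_txt helper are all invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One entry in a link list: a title, a URL, and an optional note."""
    title: str
    url: str
    note: str = ""


def render_llms_txt(name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render a small llms.txt document as a Markdown string (hypothetical helper)."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            # The note after the colon is optional.
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content; replace with your own site details.
sections = {
    "Docs": [
        Link("Getting Started", "https://example.com/docs/start", "Installation and first steps"),
        Link("API Reference", "https://example.com/docs/api"),
    ],
    "Company": [
        Link("About", "https://example.com/about", "Who we are and how to cite us"),
    ],
}
llms_txt = render_llms_txt(
    "Example Co",
    "Example Co makes a hypothetical invoicing tool for small teams.",
    sections,
)
print(llms_txt)
```

The printed text can be saved as llms.txt as-is. Writing the file by hand works just as well; the helper only shows how few moving parts the format has.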

How Widely Is It Used?

Adoption has grown steadily since the concept was proposed. One large study of nearly 300,000 domains found that approximately 10.13% had an llms.txt file in place when the data was collected. Early adopters were concentrated in technically sophisticated sectors such as cybersecurity, developer-facing SaaS, and blockchain, where teams were already comfortable working at the intersection of web infrastructure and AI.

By early 2026, adoption had expanded into mainstream publishing, broader SaaS, and some consumer sectors. However, sectors with significant legal or compliance sensitivities — financial services, healthcare, and legal — have remained notably slow to adopt the standard.

This growth sits alongside a dramatic increase in AI crawler traffic. AI-related bot activity increased by over 300% between January 2025 and March 2026. Major AI crawlers including GPTBot (OpenAI), ClaudeBot (Anthropic), and PerplexityBot are now regular visitors to most large websites. Google-Extended is often listed alongside them, but it is a robots.txt control token rather than a separate crawler: Google's existing crawlers do the fetching.

Does It Actually Work?

Here is where honest reporting matters: the evidence for llms.txt's practical effectiveness is currently limited. A study by SE Ranking covering 300,000 domains found that having an llms.txt file does not measurably improve AI citations. More significantly, the same research found that AI search crawlers almost never fetch /llms.txt directly. They overwhelmingly crawl HTML pages rather than reading the guidance file.
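You can run a rough version of this check against your own server logs. The Python sketch below counts how often known AI crawlers request /llms.txt compared with other paths. It assumes logs in the common "combined" access-log format and identifies bots by a substring of the user-agent string; the sample log lines, IP addresses, and user-agent strings are made up for illustration, and real ones will vary.

```python
import re
from collections import Counter

# Substrings used to spot AI crawlers in the user-agent field (an assumption;
# check each vendor's documentation for the exact strings they send).
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Matches the request, status, size, referrer, and user-agent fields of a
# combined-format access log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_ai_fetches(log_lines):
    """Return a Counter keyed by (bot, 'llms.txt' or 'other')."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        agent = match["agent"].lower()
        bot = next((b for b in AI_BOTS if b.lower() in agent), None)
        if bot is None:
            continue  # not one of the crawlers we are tracking
        path = match["path"].split("?")[0]
        kind = "llms.txt" if path == "/llms.txt" else "other"
        counts[(bot, kind)] += 1
    return counts


# Fabricated sample lines, for illustration only.
sample_log = [
    '203.0.113.5 - - [10/Mar/2026:10:00:00 +0000] "GET /blog/post-1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '203.0.113.5 - - [10/Mar/2026:10:00:02 +0000] "GET /pricing HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '198.51.100.7 - - [10/Mar/2026:10:01:00 +0000] "GET /llms.txt HTTP/1.1" 200 640 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.9 - - [10/Mar/2026:10:02:00 +0000] "GET / HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"',
]
fetches = count_ai_fetches(sample_log)
for (bot, kind), total in sorted(fetches.items()):
    print(f"{bot:15} {kind:10} {total}")
```

If the "llms.txt" rows stay at or near zero over a few weeks of real traffic, that is consistent with the study's finding for your site.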

None of the major LLM providers — OpenAI, Anthropic, Google, Meta, or Mistral — has publicly committed to using llms.txt as a signal in their production search or answer surfaces. This means that while the file may have future value as the standard develops, it should not currently be treated as a reliable lever for improving AI visibility.

This does not mean implementing llms.txt is a waste of time. It takes very little effort, it signals technical awareness to the web community, and if major AI providers do adopt it as a standard in the future, early adopters will have a head start. But it should be treated as a low-priority housekeeping task rather than a core GEO strategy.

How to Create an llms.txt File

If you want to add an llms.txt file to your website, the process is straightforward:

  • Create a plain text file called llms.txt and place it at the root of your domain (e.g., https://yoursite.com/llms.txt).
  • Write a brief description of your site (two to three sentences) in plain language.
  • List the URLs of your five to ten most important pages, with a short description of each.
  • Optionally, note any topics your site covers, your organisation's name, and how you prefer to be cited.
  • If your site has detailed documentation or a knowledge base, consider creating a companion llms-full.txt with the full text of those pages.
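For the optional companion file in the last step, one possible approach is to concatenate the text of your key pages into a single document. The Python sketch below does this for pages already available as Markdown or plain text. The layout (one H2 heading and a source URL per page) is an assumption made for this example, not something the proposal prescribes, and the page titles, URLs, and build_llms_full helper are invented.

```python
def build_llms_full(site_name, pages):
    """Join pages into one document.

    `pages` is a list of (title, url, body) tuples; the layout is an
    assumed convention for illustration only.
    """
    parts = [f"# {site_name}", ""]
    for title, url, body in pages:
        parts += [f"## {title}", "", f"Source: {url}", "", body.strip(), ""]
    return "\n".join(parts).rstrip() + "\n"


# Placeholder pages; in practice these would be read from your CMS or docs folder.
pages = [
    ("Getting Started", "https://example.com/docs/start", "Install the package.\nThen create an account.\n"),
    ("Billing FAQ", "https://example.com/docs/billing", "Invoices are issued monthly."),
]
llms_full = build_llms_full("Example Co", pages)
print(llms_full)
```

Keeping the source URL next to each page gives a reader, human or machine, a way to trace a passage back to where it was published.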

There is no official validation tool for llms.txt, but several community-built tools have emerged to check whether a file follows the proposed specification. A simple search for "llms.txt validator" will surface current options.
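If you would rather run a quick sanity check yourself, the Python sketch below looks for a few structural features: an H1 title as the first non-empty line, only one H1, and list items under H2 sections written as Markdown links. These checks reflect the proposal's general shape as commonly described and are deliberately incomplete, so this is not a substitute for a community validator. The check_llms_txt function and the sample inputs are invented for this example.

```python
import re

# A list item of the form "- [title](url)" with an optional ": note" suffix.
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\([^)\s]+\)(: .+)?$")


def check_llms_txt(text):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    lines = [line.rstrip() for line in text.splitlines()]
    non_empty = [line for line in lines if line]

    if not non_empty or not non_empty[0].startswith("# "):
        problems.append("first non-empty line should be an H1 title ('# Name')")
    if sum(1 for line in lines if line.startswith("# ")) > 1:
        problems.append("more than one H1 heading")

    in_section = False
    for number, line in enumerate(lines, start=1):
        if line.startswith("## "):
            in_section = True
        elif in_section and line.startswith("- ") and not LINK_ITEM.match(line):
            problems.append(f"line {number}: list item is not '- [title](url): note'")

    if not any(LINK_ITEM.match(line) for line in lines):
        problems.append("no link list items found")
    return problems


good = (
    "# Example Co\n\n"
    "> A hypothetical invoicing tool.\n\n"
    "## Docs\n\n"
    "- [Guide](https://example.com/guide): Getting started\n"
)
bad = "Welcome to our site\n- see the docs\n"

print(check_llms_txt(good))  # no problems expected for the well-formed sample
print(check_llms_txt(bad))
```

A check like this fits well in a deployment pipeline, where it can flag a malformed file before it goes live.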

llms.txt Versus robots.txt: What's the Difference?

It is easy to confuse llms.txt with robots.txt, but they serve different purposes. robots.txt is a long-established, widely respected standard that tells crawlers which pages they are allowed to access. Major search engines and AI crawlers actively read and respect robots.txt rules.

llms.txt, by contrast, is not about access control — it is about context. You are not telling AI tools where they can or cannot go; you are giving them background information to help them understand your content better. The two files complement each other and can exist on the same domain simultaneously.

If your goal is to keep AI crawlers away from your content entirely, robots.txt (with the relevant bot user-agent rules) is the tool to use, bearing in mind that it is a request that well-behaved crawlers honour rather than a technical barrier. If your goal is to help AI tools understand and correctly represent your content, llms.txt is the right approach, even if its practical impact remains to be proven at scale.
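To see the access-control side in action, the Python sketch below parses a small robots.txt with the standard library's urllib.robotparser and asks which crawlers may fetch which paths. The rules and the example.com URLs are invented for illustration; the user-agent tokens are the ones named earlier in this article, and you should confirm the exact tokens in each vendor's documentation before relying on them.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: block GPTBot everywhere, keep ClaudeBot out of
# /private/, and allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, f"https://example.com{path}")


for agent, path in [
    ("GPTBot", "/blog/post"),
    ("ClaudeBot", "/blog/post"),
    ("ClaudeBot", "/private/report"),
    ("Googlebot", "/blog/post"),
]:
    print(f"{agent:10} {path:18} allowed={allowed(agent, path)}")
```

Nothing in llms.txt changes these answers. The two files are read independently, which is why they can sit side by side on the same domain.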

The Bigger Picture

llms.txt is best understood as part of a broader effort by the web community to develop standards for the AI era. Just as robots.txt and sitemap.xml evolved in response to the rise of search engines, new conventions are emerging to address the needs of AI crawlers and generative answer engines.

Whether llms.txt becomes a dominant standard, gets replaced by something else, or quietly fades, the underlying principle is sound: websites benefit from being legible to automated systems. The more clearly and efficiently your site communicates what it is about, the more likely AI tools are to represent it accurately. That goal is worth pursuing regardless of which specific file format ultimately wins.
