llms.txt
Imagine my shock and surprise to discover that when I submit resumes now, I quickly see LLM research agents in my access logs poring over my portfolio page. As time moves ever forward, I suspect I’ll be dealing with AI agents more frequently, especially in the job market.
I’m not thrilled about this, but I can’t do anything about it - so, if you can’t beat them, join them.
My website is already fairly straightforward HTML and CSS, but in my own testing, models still seem to get confused about content on my site - there’s a lot to parse, even on a simple site.
Luckily, a new standard is being formed for organizing web content for LLMs into simple “.txt” files. If you haven’t seen them before, they’re essentially markdown versions of page content: no styling, no JavaScript.
This article explains how to generate these files auto-magically from existing content, for Hugo SSG sites.
The Flow
We want to be able to:
- Provide clean, plain-text versions of all content
- Allow ourselves to write custom versions when needed
- Auto-generate LLM-optimized content in all other cases
- Create a master index of all LLM-ready content
- Do as little manually as possible
If you’re not familiar with Hugo’s template system, it’s incredibly powerful, and can do a lot more than just lay out HTML files. We’re going to use it today to generate llms.txt files automatically for every page that doesn’t already have a hand-written one.
Important pages like the work portfolio benefit from custom llms.txt files containing hand-crafted, LLM-optimized content. These files live alongside the main content:
content/
  info/
    work/
      index.en.md   # Human-readable page
      llms.txt      # Custom LLM version
Custom files provide complete control over LLM-visible content, making it possible to:
- Remove visual elements that don’t translate to text
- Restructure information for better LLM comprehension
- Include additional context or explanations
- Format data in LLM-friendly structures
Pages without custom LLM files automatically generate clean text versions using a custom output format. The implementation works as follows:
Hugo Configuration (config.toml):
The custom content generation system requires three key configuration sections:
# Custom output formats
[mediaTypes]
[mediaTypes."text/plain"]
suffixes = ["txt"]
[mediaTypes."text/markdown"]
suffixes = ["md"]
[outputFormats]
[outputFormats.llmsfull]
mediaType = "text/plain"
baseName = "llms-full"
isPlainText = true
notAlternative = true
[outputFormats.llms]
mediaType = "text/plain"
baseName = "llms"
isPlainText = true
notAlternative = true
[outputs]
home = ["HTML", "RSS", "llmsfull"]
page = ["HTML", "llms"]
Configuration Breakdown:
- mediaTypes: Defines the MIME types Hugo recognizes. The text/plain type with the txt suffix enables plain text file generation.
- outputFormats: Creates two custom output formats:
  - llms: Generates individual llms.txt files for each page
  - llmsfull: Creates the comprehensive site index as llms-full.txt
  The isPlainText = true flag makes Hugo render these templates with Go’s text/template package instead of html/template, so the output is not HTML-escaped, while notAlternative = true keeps these formats out of each page’s list of alternative output formats.
- outputs: Specifies which formats to generate:
  - home: The homepage generates HTML, RSS, and the full LLM index
  - page: Individual pages generate HTML and LLM-friendly versions
Template (layouts/_default/single.llms.txt):
{{- if .File -}}
{{- $customLlmsFile := printf "%sllms.txt" .File.Dir -}}
{{- if not (fileExists (printf "content/%s" $customLlmsFile)) -}}
# {{ .Title }}
{{ if .Params.description }}{{ .Params.description }}
{{ end }}{{ if .Date }}Published: {{ .Date.Format "January 2, 2006" }}
{{ end }}{{ .RawContent }}
{{- end -}}
{{- end -}}
This template only generates content when no custom llms.txt file exists, ensuring the custom versions always take precedence.
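To see ahead of a build which pages will take which path, the same decision can be reproduced outside Hugo. This is a rough sketch under my own assumptions: classify_pages is a hypothetical helper, it treats any directory containing a Markdown file as a page directory, and the posts/hello page in the demo is invented for illustration. Only the info/work layout comes from the article.

```python
import tempfile
from pathlib import Path


def classify_pages(content_dir: Path) -> dict[str, str]:
    """Label each directory holding a Markdown page as 'custom' or 'generated'."""
    labels: dict[str, str] = {}
    for md in sorted(content_dir.rglob("*.md")):
        page_dir = md.parent
        key = page_dir.relative_to(content_dir).as_posix()
        # Same test as the template: is there an llms.txt beside the page source?
        labels[key] = "custom" if (page_dir / "llms.txt").is_file() else "generated"
    return labels


# Demo on a throwaway content tree.
with tempfile.TemporaryDirectory() as tmp:
    content = Path(tmp) / "content"
    work = content / "info" / "work"     # layout from the article
    hello = content / "posts" / "hello"  # hypothetical page with no custom file
    for directory in (work, hello):
        directory.mkdir(parents=True)
    (work / "index.en.md").write_text("# Work\n")
    (work / "llms.txt").write_text("Hand-written summary\n")
    (hello / "index.md").write_text("# Hello\n")
    print(classify_pages(content))
```

Like the template, the check is per directory, so every page that shares a directory with an llms.txt file is treated as having a custom version.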
The crown jewel is llms-full.txt, a comprehensive index of all LLM-friendly content on the site, organized by content type:
Template (layouts/index.llmsfull.txt):
# LLM-Friendly Content Index
# Info
{{- range (where .Site.RegularPages "Type" "info") }}
- [{{ .Title }}]({{ .Permalink }}llms.txt)
{{- end }}
# Posts
{{- range (where .Site.RegularPages "Type" "posts") }}
- [{{ .Title }}]({{ .Permalink }}llms.txt)
{{- end }}
# Projects
{{- range (where .Site.RegularPages "Type" "projects") }}
- [{{ .Title }}]({{ .Permalink }}llms.txt)
{{- end }}
# IndieWeb Notes
{{- range (where .Site.RegularPages "Type" "indieweb") }}
{{- if eq .Params.kind "note" }}
- [{{ .Title }}]({{ .Permalink }}llms.txt)
{{- end }}
{{- end }}
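Since the template builds each link by appending llms.txt to the permalink, it is worth checking the generated index once in a while. Below is a small sketch that parses an index in the shape this template produces and flags links that do not end in /llms.txt, which could happen if a permalink lacked a trailing slash. The helper names and the example.com sample are assumptions for illustration, not output from the real site.

```python
import re

# Matches one index entry of the form "- [Title](URL)".
LINK = re.compile(r"^- \[(?P<title>.*)\]\((?P<url>\S+)\)$")


def parse_index(text: str) -> dict[str, list[tuple[str, str]]]:
    """Group (title, url) pairs under the '# ' heading that precedes them."""
    sections: dict[str, list[tuple[str, str]]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# "):
            current = line[2:]
            sections[current] = []
        elif current is not None and (match := LINK.match(line)):
            sections[current].append((match["title"], match["url"]))
    return sections


def suspicious_links(sections: dict[str, list[tuple[str, str]]]) -> list[str]:
    """Return URLs that do not end in '/llms.txt'."""
    return [
        url
        for links in sections.values()
        for _, url in links
        if not url.endswith("/llms.txt")
    ]


# Hypothetical sample in the shape the template produces.
SAMPLE = """\
# LLM-Friendly Content Index
# Info
- [Work](https://example.com/info/work/llms.txt)
# Posts
- [Hello](https://example.com/posts/hello/llms.txt)
- [Ugly](https://example.com/posts/ugly.htmlllms.txt)
"""

index = parse_index(SAMPLE)
print(index)
print(suspicious_links(index))
```

The title heading ends up as a section with no links, which is harmless here and easy to filter out if it gets in the way.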
Smart Content Detection
Every page backed by a content file has an LLM version, either custom or auto-generated, so the page header template only needs to check for .File before showing the link:
{{/* Always show llms.txt link - either custom or auto-generated */}}
{{ if .File }}
<a href="{{ .Permalink }}llms.txt" class="llms-link">
<span class="llms-letter">l</span><span class="llms-letter">l</span>...
</a>
{{ end }}
The Result
Every page now has an LLM-friendly version accessible at /page-url/llms.txt, with a comprehensive index at /llms-full.txt. The system automatically maintains itself as new content gets added, while providing full control over the most important pages.
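If you want to confirm that claim after a build, something like the following can scan the output directory. It is a sketch under a few assumptions: the site is built into Hugo's default public/ directory with pretty URLs (each page as a directory containing index.html), and missing_llms is a hypothetical helper name. Section and taxonomy list pages will be reported too, since the config above only enables the llms format for regular pages, so filter the result as needed.

```python
import tempfile
from pathlib import Path


def missing_llms(public_dir: Path) -> list[str]:
    """List built page directories that have index.html but no llms.txt."""
    missing = []
    for html in sorted(public_dir.rglob("index.html")):
        page_dir = html.parent
        if page_dir == public_dir:
            continue  # the home page emits llms-full.txt instead
        if not (page_dir / "llms.txt").is_file():
            missing.append(page_dir.relative_to(public_dir).as_posix())
    return missing


# Demo on a throwaway directory standing in for public/.
with tempfile.TemporaryDirectory() as tmp:
    public = Path(tmp)
    (public / "index.html").write_text("<html></html>")
    for name, has_llms in (("posts/a", True), ("posts/b", False)):  # hypothetical pages
        page = public / name
        page.mkdir(parents=True)
        (page / "index.html").write_text("<html></html>")
        if has_llms:
            (page / "llms.txt").write_text("# A\n")
    print(missing_llms(public))
```

Running it against the real public/ directory after a build gives a quick list of pages to look at.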
The complete implementation is available in the site’s source code, and the system generates this very post in LLM-friendly format automatically. The LLM version of this post and complete site index demonstrate the system in action.