Machine-Readable Infrastructure

X-Robots-Tag Patterns

Add as an HTTP response header via server config, CDN rules, or edge worker. This is the only option for non-HTML resources — meta robots cannot appear in a PDF or image file.
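Where server config and CDN rules are out of reach, the same header can be attached in application code. The following is a minimal sketch, assuming a Python WSGI application; the RULES table, robots_value, and XRobotsTagMiddleware are hypothetical names chosen for illustration, and the two rules simply mirror the directory and PDF examples in this section.

```python
# Hypothetical rule table: (predicate on the request path, header value).
# The first matching rule wins.
RULES = [
    (lambda p: p.startswith("/private/"), "noindex, nofollow"),
    (lambda p: p.lower().endswith(".pdf"), "nosnippet"),
]


def robots_value(path):
    """Return the X-Robots-Tag value for the first matching rule, or None."""
    for matches, value in RULES:
        if matches(path):
            return value
    return None


class XRobotsTagMiddleware:
    """WSGI middleware that appends X-Robots-Tag to matching responses."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        value = robots_value(environ.get("PATH_INFO", ""))

        def patched_start_response(status, headers, exc_info=None):
            # Copy the header list so the wrapped app's list is not mutated.
            if value is not None:
                headers = list(headers) + [("X-Robots-Tag", value)]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)
```

Wrapping is one line, for example app = XRobotsTagMiddleware(app). Because the first matching rule wins, a PDF under /private/ receives "noindex, nofollow" rather than "nosnippet"; reorder or combine the rules if both are wanted.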

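# ── APACHE ───────────────────────────────────────────────────────────────────
# Requires mod_headers. <Directory> is valid only in server or virtual-host
# config (e.g. httpd.conf), not in .htaccess; <FilesMatch> works in both.

# Block indexing on a directory (server config)
<Directory "/var/www/html/private">
  Header set X-Robots-Tag "noindex, nofollow"
</Directory>
# .htaccess equivalent: put the Header line on its own in
# /var/www/html/private/.htaccess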

# Block snippets on PDF files (Google also applies nosnippet to AI Overviews)
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "nosnippet"
</FilesMatch>

# Limit snippet length to 150 characters
<FilesMatch "\.(html|htm)$">
  Header set X-Robots-Tag "max-snippet:150"
</FilesMatch>

# ── NGINX ────────────────────────────────────────────────────────────────────

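# Block indexing on a location block
location /private/ {
  add_header X-Robots-Tag "noindex, nofollow";
}

# Block snippet extraction on PDFs
# Regex locations take precedence over plain prefix locations, so a PDF under
# /private/ is handled here and gets only "nosnippet".
location ~* \.pdf$ {
  add_header X-Robots-Tag "nosnippet";
}

# add_header applies only to certain success and redirect status codes unless
# the "always" parameter is appended. An add_header inside a location also
# replaces, rather than extends, any add_header set at server or http level.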

# ── AVAILABLE DIRECTIVES ─────────────────────────────────────────────────────
# noindex               -- exclude from search index
# nofollow              -- do not follow links on the page
# noarchive             -- no cached copy shown in results
# nosnippet             -- no text snippet or video preview
# max-snippet:N         -- limit snippet to N characters; 0 = nosnippet, -1 = no limit
# max-image-preview:[none|standard|large] -- limit image preview size
# max-video-preview:N   -- limit video preview to N seconds; 0 = static image only, -1 = no limit
# noimageindex          -- do not index images on this page
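When auditing responses, it can help to turn a header value into structured data. The sketch below is one possible approach, assuming the plain comma-separated form shown above. It deliberately does not handle values prefixed with a crawler name (such as "googlebot: noindex") or the unavailable_after directive, whose date argument contains colons. The function names are hypothetical.

```python
def parse_x_robots_tag(value):
    """Parse 'noindex, max-snippet:150' into {'noindex': None, 'max-snippet': '150'}.

    Assumes the simple comma-separated form only; crawler-name prefixes and
    unavailable_after dates are out of scope for this sketch.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip().lower()
        if not part:
            continue
        # Split 'name:arg' once; directives without an argument map to None.
        name, sep, arg = part.partition(":")
        directives[name.strip()] = arg.strip() if sep else None
    return directives


def blocks_snippets(directives):
    """True if snippets are disabled via nosnippet or max-snippet:0."""
    return "nosnippet" in directives or directives.get("max-snippet") == "0"
```

For example, blocks_snippets(parse_x_robots_tag("max-snippet:0")) is True, matching the note above that a value of 0 is equivalent to nosnippet.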

Field notes

  • nosnippet also suppresses AI Overview text extraction, not just traditional snippet display.
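  • The header must be on the final 200 response for the URL to be honored. A header set only on a redirect (3xx) or error (4xx/5xx) response may not carry over to the page the crawler ends up indexing.
  • Verify with curl -I https://example.com/page and look for an x-robots-tag line in the response headers; add -L to follow redirects and see the headers at each hop.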
  • CDN edge caches may strip or modify headers before they reach the crawler — test from origin and from edge.
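The redirect and edge checks in these notes can be scripted. The following is a minimal sketch using only the Python standard library: it issues HEAD requests, follows redirects by hand, and records the X-Robots-Tag values seen at each hop. The names fetch_hop and trace are hypothetical, and some servers answer HEAD differently from GET, so treat the output as a first check rather than proof.

```python
import urllib.error
import urllib.parse
import urllib.request

REDIRECT_CODES = (301, 302, 303, 307, 308)


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None stops urllib from following the redirect;
        # the 3xx response then surfaces as an HTTPError.
        return None


def fetch_hop(url):
    """Send one HEAD request; return (status, X-Robots-Tag values, Location)."""
    opener = urllib.request.build_opener(_NoRedirect)
    req = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(req, timeout=10) as resp:
            status, headers = resp.status, resp.headers
    except urllib.error.HTTPError as err:
        status, headers = err.code, err.headers
    return status, headers.get_all("X-Robots-Tag") or [], headers.get("Location")


def trace(url, fetch=fetch_hop, max_hops=5):
    """Follow redirects manually and record the header at every hop."""
    hops = []
    for _ in range(max_hops):
        status, tags, location = fetch(url)
        hops.append((url, status, tags))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urllib.parse.urljoin(url, location)
    return hops
```

Calling trace("https://example.com/page") returns one (url, status, tags) tuple per hop; the last tuple is the one that matters for indexing. Running the same call against the origin host and against the public edge hostname shows whether the CDN is stripping or changing the header.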