llms.txt, Markdown Twins, and Cite-As Headers: Which Machine Surfaces Actually Matter
An honest ranking of the machine-readable surfaces brands are shipping for AI visibility — including the one that gets almost no traffic but is still worth it.
There is now a small zoo of machine-readable surfaces a brand can ship for AI engines: llms.txt, Markdown page variants, JSON-LD, citation headers, IndexNow. Some matter a lot. One is nearly pointless by traffic numbers and still worth shipping. Here is the honest ranking.
1. Structured data (JSON-LD) — ship it everywhere
Still the workhorse. QAPage on answer pages, DefinedTerm on glossary and entity pages, ItemList on listicles, Dataset and Claim graphs on data artifacts. Engines and their retrieval pipelines parse it reliably, and it disambiguates what kind of thing a page is — which decides which questions the page can be cited for.
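As a minimal sketch of what "QAPage on answer pages" can look like in practice, the Python below builds a schema.org QAPage object and wraps it in a script tag. The function names, the sample question, and the example.com URL are all hypothetical; adapt the fields to whatever your answer pages actually contain.

```python
import json


def qa_page_jsonld(question: str, answer: str, url: str) -> dict:
    """Build a schema.org QAPage object for a single-answer page."""
    return {
        "@context": "https://schema.org",
        "@type": "QAPage",
        "mainEntity": {
            "@type": "Question",
            "name": question,
            "answerCount": 1,
            "acceptedAnswer": {
                "@type": "Answer",
                "text": answer,
                "url": url,
            },
        },
    }


def as_script_tag(data: dict) -> str:
    """Serialize JSON-LD for embedding in an HTML page."""
    # Escape "</" so a string value cannot close the script element early.
    payload = json.dumps(data, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'


if __name__ == "__main__":
    page = qa_page_jsonld(
        "What is llms.txt?",
        "A Markdown index file for agents.",
        "https://example.com/answers/llms-txt",  # hypothetical URL
    )
    print(as_script_tag(page))
```

Building the object from the same record that renders the visible answer keeps the markup and the page text from disagreeing.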
2. Markdown twins of citable artifacts
If your product produces an artifact people cite (a report, a result, a dataset), serve a Markdown version at the same URL plus .md. Advertise it on the HTML page with a Link header carrying rel="alternate" and type="text/markdown", plus a Cite-As header (in standard terms, a Link with rel="cite-as", the link relation defined in RFC 8574) that names the URL to cite. Agents fetching your page to quote it get a clean, token-cheap version with no chrome. Pancake does this for every strategy result: the canonical page, plus a Markdown twin one request away, both carrying the same reproducibility hashes. Its methodology page explains what those hashes pin.
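The header side of this can be sketched in a few lines of Python. This assumes the citation header is expressed as a Link header with rel="cite-as" (the link relation from RFC 8574) and that the twin lives at the canonical URL plus .md, as described above. The function name and the example.com URL are hypothetical, not Pancake's actual implementation.

```python
def twin_headers(canonical_url: str) -> dict[str, str]:
    """Response headers for the HTML page of a citable artifact.

    Advertises the Markdown twin and the URL that should be cited.
    """
    base = canonical_url.rstrip("/")
    md_url = base + ".md"  # assumption: twin is served at "<url>.md"
    link_values = [
        f'<{md_url}>; rel="alternate"; type="text/markdown"',
        f'<{canonical_url}>; rel="cite-as"',
    ]
    # Multiple link values may share one Link header, comma-separated.
    return {"Link": ", ".join(link_values)}


if __name__ == "__main__":
    for name, value in twin_headers("https://example.com/results/abc123").items():
        print(f"{name}: {value}")
```

Attach the returned headers in whatever framework serves the HTML page; the Markdown response can carry the same cite-as link so both surfaces point at one citable URL.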
3. IndexNow — because Bing feeds ChatGPT
ChatGPT Search citations track the Bing index closely. IndexNow is the fastest way to tell Bing about new content, it is free, and it is one script. Submit after every content deploy. This is the highest leverage-per-minute item on the list.
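The "one script" could look roughly like the sketch below, which uses only the Python standard library. It assumes the shared api.indexnow.org endpoint and a key file already hosted at the site root; the key value and URLs shown are placeholders, and how you collect the list of changed URLs after a deploy is left to your build.

```python
import json
import urllib.request
from urllib.parse import urlsplit

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_payload(urls, key, key_location=None):
    """Build the JSON body for a bulk IndexNow submission."""
    if not urls:
        raise ValueError("at least one URL is required")
    host = urlsplit(urls[0]).netloc
    # One submission covers one host, so reject mixed lists early.
    if any(urlsplit(u).netloc != host for u in urls):
        raise ValueError("all URLs must belong to the same host")
    payload = {"host": host, "key": key, "urlList": list(urls)}
    if key_location:
        # Only needed when the key file is not at https://<host>/<key>.txt
        payload["keyLocation"] = key_location
    return payload


def submit(payload, endpoint=INDEXNOW_ENDPOINT):
    """POST the payload and return the HTTP status code."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Usage after a deploy (placeholder key and URLs):
#   payload = build_payload(["https://example.com/new-post"], "your-indexnow-key")
#   print(submit(payload))
```

Keeping payload construction separate from the network call makes the script easy to dry-run in CI before it is wired into the deploy step.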
4. llms.txt — tiny traffic, still worth it
Honest numbers first: across measured sites, llms.txt accounts for only fractions of a percent of AI-bot traffic. It is not how mainstream assistants discover you today. Ship it anyway if your audience includes agent builders. It is the convention agents check when a developer points one at your product, it costs an afternoon, and it doubles as a forcing function: if you cannot write a clean llms.txt, your site structure is the problem. The useful trick is to derive it from the same data files that drive your sitemap, so it can never go stale.
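Here is one way the "derive it from the same data files" trick might look, as a sketch in Python. The PAGES list stands in for whatever data file drives your sitemap; its field names, the site name, and the example.com URLs are hypothetical. Both renderers read the same records, so a page cannot appear in one output and be missing from the other.

```python
from xml.sax.saxutils import escape

# Hypothetical page records; in practice, load these from your site's data files.
PAGES = [
    {"section": "Docs", "title": "Methodology",
     "url": "https://example.com/methodology",
     "summary": "How results are produced and verified."},
    {"section": "Docs", "title": "Glossary",
     "url": "https://example.com/glossary",
     "summary": "Definitions of terms used across the site."},
    {"section": "Reports", "title": "Annual report",
     "url": "https://example.com/reports/annual",
     "summary": "Headline findings for the year."},
]


def render_sitemap(pages) -> str:
    """Render a minimal XML sitemap from the page records."""
    entries = "\n".join(
        f"  <url><loc>{escape(p['url'])}</loc></url>" for p in pages
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )


def render_llms_txt(site_name: str, tagline: str, pages) -> str:
    """Render llms.txt: H1 title, blockquote summary, H2 sections of links."""
    lines = [f"# {site_name}", "", f"> {tagline}", ""]
    sections: dict[str, list[dict]] = {}
    for page in pages:
        sections.setdefault(page["section"], []).append(page)
    for name, items in sections.items():
        lines += [f"## {name}", ""]
        lines += [f"- [{p['title']}]({p['url']}): {p['summary']}" for p in items]
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_llms_txt("Example Site", "Hypothetical product used for illustration.", PAGES))
    print(render_sitemap(PAGES))
```

Run both renderers in the same build step and write the outputs next to each other; a page added to the data file then shows up in the sitemap and in llms.txt on the next deploy.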
The pattern underneath
Every item above is the same move: meet the machine reader halfway, and never let the machine surface drift from the human one. Generate both from one source of truth.
