
We are building a parallel web for AI agents

As we’ve mentioned before, agents love Markdown, but they’re still not great at crawling the web. All that JavaScript really messes them up, and even on sites without any JavaScript, it’s wasteful to fill up your agent’s context window with HTML formatting tags when all you really want is the information content.

So in early 2026, Cloudflare, one of the web’s biggest content distributors, started offering Markdown versions of websites. If a site owner opts into this program, Cloudflare will silently serve a Markdown version of a page whenever an agent requests it. This works because some coding agents (like Claude Code and OpenCode) include an Accept: text/markdown header in their HTTP requests; when Cloudflare sees that header, it returns Markdown instead of HTML. Cloudflare serves about 22% of websites, so even as an opt-in beta, the potential reach is significant.
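To make the header mechanism concrete, here is a minimal sketch of what such a request might look like from the agent's side, using only Python's standard library. The function names, the q=0.8 fallback to HTML, and the example URL are my own assumptions for illustration; this is not how Claude Code, OpenCode, or Cloudflare actually implement it.

```python
from typing import Optional, Tuple
from urllib.request import Request, urlopen

# Prefer Markdown, but accept HTML as a fallback (the q-value is an assumption).
MARKDOWN_ACCEPT = "text/markdown, text/html;q=0.8"


def build_markdown_request(url: str) -> Request:
    """Build a GET request that advertises a preference for Markdown."""
    return Request(url, headers={"Accept": MARKDOWN_ACCEPT})


def is_markdown(content_type: Optional[str]) -> bool:
    """Return True if a Content-Type header value names text/markdown."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/markdown"


def fetch_preferring_markdown(url: str, timeout: float = 10.0) -> Tuple[str, bool]:
    """Fetch a page and report whether the server answered with Markdown."""
    with urlopen(build_markdown_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        body = resp.read().decode(charset, errors="replace")
        return body, is_markdown(resp.headers.get("Content-Type"))
```

Calling fetch_preferring_markdown("https://example.com/some-post") returns the body plus a flag saying whether Markdown came back. A site that has not opted in will simply return HTML, so a real agent would still need an HTML fallback path.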

Think about this: it’s a striking shift. If someone comes to your site, and they identify themselves as an agent, they get a Cloudflare-generated Markdown version instead of the page you actually built! While this shifts power away from content creators, it makes things easier for agents (and the humans who pay their bills). In the example Cloudflare gives in their announcement, the Markdown version of their blog post drops from 16,180 tokens to 3,150, a reduction of roughly 80%.

If you want a bit more control, you can create Markdown summaries of your site yourself. There’s a nice proposed standard for this at llmstxt.org, put forward by Jeremy Howard in September 2024: you publish a Markdown file at /llms.txt that briefly describes your site and links to the pages an LLM should read. Unfortunately, very few people seem to be creating these, and very few agents seem to be requesting them.
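For a sense of what one of these files looks like, here is a small sketch that renders an llms.txt skeleton following the layout described at llmstxt.org: an H1 title, a blockquote summary, then H2 sections containing link lists. The render_llms_txt helper and the sample site data are hypothetical, made up for illustration.

```python
from typing import Dict, List, Tuple

# (link title, URL, optional note); an empty note is omitted from the output.
Link = Tuple[str, str, str]


def render_llms_txt(title: str, summary: str, sections: Dict[str, List[Link]]) -> str:
    """Render an llms.txt-style Markdown document as a string."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            entry = f"- [{name}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "Example Site",
        "Notes on AI agents.",
        {"Docs": [("Intro", "https://example.com/intro.md", "Start here")]},
    ))
```

You would serve the resulting text at /llms.txt on your own domain. Whether any given agent ever requests it is, as noted, another matter.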

But in any case, the message is clear. LLMs are going to be consuming a lot of content, and we need to make it easier for them to do that.

Now, HTML/JavaScript is not the only content that’s hard for LLMs to parse. PDFs are notoriously tricky, and if you’ve ever asked Claude to read some academic papers, you’ll know its ability to parse them is only so-so. Since Claude is so much better with Markdown, I have a Claude pdf-reader skill that tells Claude to download PDFs and then immediately use Marker to convert them to Markdown. So, since I use Claude to help with the research process (and interact with Claude through Markdown notes), my research pipeline is now:

markdown -> latex -> pdf -> markdown

This is obviously absurd, and it seems that scientists should start doing what web developers are doing: creating Markdown versions of their papers. Or, at least, LaTeX versions?

If every scientific paper had its LaTeX source available on arXiv, Claude would be substantially better as a research assistant. As someone who writes papers, I’m beginning to wonder whether it’s worthwhile creating PDFs of papers in the first place.

If Claude is helping to write the papers, and Claude is helping to review the papers, and Claude is doing the lit review to write the follow-up papers, maybe the papers should all be in Markdown?