Meet the AI Crawlers Reading Your Site
Not all bots are search engines anymore. A fast-growing share of crawler traffic now comes from answer engines — the systems behind ChatGPT, Claude, Perplexity, and Google's AI features — fetching your content to train models and answer questions in real time. Here's who they are and what they actually request.
The answer engines
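- GPTBot (OpenAI): crawls pages that may be used to train OpenAI's models.
- OAI-SearchBot / ChatGPT-User (OpenAI): OAI-SearchBot builds the index behind ChatGPT search; ChatGPT-User fetches a page live when a user's request calls for it.
- ClaudeBot / Claude-User (Anthropic): ClaudeBot is the training crawler; Claude-User fetches live when someone asks Claude about a page.
- PerplexityBot / Perplexity-User (Perplexity): PerplexityBot builds the index for Perplexity's answer engine; Perplexity-User fetches pages live in response to a user's question.
- Google-Extended (Google): a robots.txt control token, not a separate crawler. Disallowing it opts your content out of Gemini training; it has no user-agent string of its own and does not affect classic Googlebot.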
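Because each of these names doubles as a robots.txt token, you can set a different policy per bot. Here is a minimal sketch using Python's standard urllib.robotparser. The policy itself (block training crawlers, allow live fetchers), the example.com URL, and the allowed() helper are illustrative assumptions, not a recommendation, and whether a given bot honors robots.txt is up to its operator.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block training crawlers, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str = "https://example.com/blog/post") -> bool:
    """Return True if the robots.txt above lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for bot in ["GPTBot", "ChatGPT-User", "ClaudeBot", "Claude-User", "PerplexityBot"]:
        print(bot, "allowed" if allowed(bot) else "blocked")
```

Checking a policy this way before you deploy it helps catch a rule that blocks more (or less) than you meant.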
What they fetch
Some bots want your rendered HTML. Others also request machine-readable versions where a site offers them: an llms.txt index (a proposed convention, not a formal standard), clean Markdown renderings of your posts, or JSON-LD structured data. Serving those formats makes your content cheaper to parse and can make it more likely to be cited accurately.
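To make the llms.txt idea concrete, here is a rough sketch that builds one from a list of posts. It assumes the layout described in the llms.txt proposal (a title heading, a one-line blockquote summary, then a section of Markdown links). The Post class, the build_llms_txt() helper, and the example URLs are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    url: str      # ideally a link to a Markdown rendering of the post
    summary: str


def build_llms_txt(site_name: str, description: str, posts: list[Post]) -> str:
    """Render a small llms.txt index as a single string."""
    lines = [f"# {site_name}", "", f"> {description}", "", "## Posts", ""]
    lines += [f"- [{p.title}]({p.url}): {p.summary}" for p in posts]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    posts = [
        Post(
            title="Meet the AI Crawlers",
            url="https://example.com/posts/ai-crawlers.md",
            summary="Who the answer-engine bots are and what they request.",
        ),
    ]
    print(build_llms_txt("Example Blog", "Notes on AI crawler traffic.", posts))
```

You would serve the result at /llms.txt and regenerate it whenever you publish.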
How to see it
You can't manage what you can't measure. AI Bot Log aggregates anonymized bot-visit data from sites across the network into a live dashboard, so you can see which crawlers are active, what they request, and how the trends are moving — instead of guessing from raw server logs.
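For a sense of what the raw-log route involves, here is a minimal sketch that counts AI bot hits in an access log. It assumes the common "combined" log format, where the user agent is the last quoted field on each line. The token list and the count_ai_bot_hits() helper are illustrative, and user-agent strings can be spoofed, so treat the counts as a rough signal. Google-Extended is left out because it never appears as a user agent.

```python
import re
from collections import Counter

# User-agent tokens to look for (illustrative, not exhaustive).
AI_BOT_TOKENS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "Claude-User",
    "PerplexityBot", "Perplexity-User",
]

# In the combined log format, the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_ai_bot_hits(log_lines):
    """Count requests per AI bot token, matching case-insensitively."""
    counts = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # line does not end with a quoted user-agent field
        user_agent = match.group(1).lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break
    return counts
```

To try it on a real file, pass the open file object: count_ai_bot_hits(open("access.log")), where access.log is a placeholder path.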