Meet the AI Crawlers Reading Your Site

Not every bot is a search engine crawler anymore. A fast-growing share of crawler traffic now comes from answer engines (the systems behind ChatGPT, Claude, Perplexity, and Google's AI features), which fetch your content both to train models and to answer questions in real time. Here's who they are and what they actually request.

The answer engines

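  • GPTBot (OpenAI): crawls content that may be used to train OpenAI's models, including the ones behind ChatGPT.
  • OAI-SearchBot / ChatGPT-User (OpenAI): OAI-SearchBot builds the index behind ChatGPT's search results; ChatGPT-User fetches a page live when ChatGPT browses or grounds an answer for a user.
  • ClaudeBot / Claude-User (Anthropic): ClaudeBot is the training crawler; Claude-User fetches live when someone asks Claude about a page.
  • PerplexityBot / Perplexity-User: index and live-search fetches, respectively, for Perplexity's answer engine.
  • Google-Extended: not a separate crawler but a robots.txt token that lets you opt out of Gemini training. It has no user-agent string of its own and does not affect classic Googlebot or your Search indexing.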
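
Because each of these names doubles as a robots.txt token, you can set a policy per bot. The sketch below is one way to check such a policy with Python's standard-library robots.txt parser. The policy itself (block the training tokens, allow everything else) and the example.com URL are assumptions made for illustration, not a recommendation, and the result only reflects how Python's parser matches rules, not how any particular crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: opt out of training crawls, leave live fetches alone.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://example.com/blog/post"  # placeholder URL

for token in ("GPTBot", "ClaudeBot", "Google-Extended", "ChatGPT-User", "Claude-User"):
    verdict = "allowed" if parser.can_fetch(token, url) else "blocked"
    print(f"{token}: {verdict}")
```

Under this assumed policy the three training tokens are blocked while ChatGPT-User and Claude-User fall through to the wildcard rule, so pages can still be fetched live when a user asks about them.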

What they fetch

Some bots want your rendered HTML. Increasingly, the well-behaved ones look for machine-readable versions first: an llms.txt index (a proposed convention rather than a formal standard), clean Markdown renderings of your posts, or JSON-LD structured data. Serving those formats makes your content cheaper to parse and more likely to be cited accurately.
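
To make the llms.txt idea concrete, here is a minimal sketch that builds such an index from a list of posts. It assumes the layout described in the llms.txt proposal (a title line, a one-line blockquote summary, then sections of Markdown links). The Post type, the build_llms_txt function, and the sample URLs are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Post:
    """Hypothetical record for one page that has a Markdown rendering."""
    title: str
    markdown_url: str
    summary: str


def build_llms_txt(site_name: str, description: str, posts: Sequence[Post]) -> str:
    """Render a small llms.txt index: title, summary, then one link per post."""
    lines = [f"# {site_name}", "", f"> {description}", "", "## Posts", ""]
    for post in posts:
        lines.append(f"- [{post.title}]({post.markdown_url}): {post.summary}")
    return "\n".join(lines) + "\n"


# Placeholder data for illustration only.
example_posts = [
    Post(
        title="Meet the AI Crawlers Reading Your Site",
        markdown_url="https://example.com/blog/ai-crawlers.md",
        summary="Who the answer-engine bots are and what they request.",
    ),
]

print(build_llms_txt("Example Blog", "Notes on AI crawler traffic.", example_posts))
```

The output is plain text that would be served at /llms.txt, with each link pointing at the Markdown rendering of a post rather than its HTML page.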

How to see it

You can't manage what you can't measure. AI Bot Log aggregates anonymized bot-visit data from sites across the network into a live dashboard, so you can see which crawlers are active, what they request, and how the trends are moving — instead of guessing from raw server logs.
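
For a sense of what that kind of tally involves on a single site, the sketch below counts hits per AI bot in access-log lines. It is not how AI Bot Log itself works. It assumes logs in the common "combined" format, where the user agent is the last quoted field, and the sample lines and user-agent strings are simplified placeholders rather than real captured traffic. Google-Extended is left out because it never appears in a user-agent string.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Tokens from the list above, mapped to a rough purpose label.
AI_BOT_TOKENS = {
    "GPTBot": "training",
    "OAI-SearchBot": "search index",
    "ChatGPT-User": "live fetch",
    "ClaudeBot": "training",
    "Claude-User": "live fetch",
    "PerplexityBot": "search index",
    "Perplexity-User": "live fetch",
}


def classify_user_agent(user_agent: str) -> Optional[Tuple[str, str]]:
    """Return (token, purpose) if the user agent contains a known token."""
    lowered = user_agent.lower()
    for token, purpose in AI_BOT_TOKENS.items():
        if token.lower() in lowered:
            return token, purpose
    return None


def tally_ai_bots(log_lines: Iterable[str]) -> Counter:
    """Count hits per AI bot token across combined-format log lines."""
    counts: Counter = Counter()
    for line in log_lines:
        # The user agent is the text between the last two double quotes.
        parts = line.rsplit('"', 2)
        if len(parts) < 3:
            continue  # not a combined-format line; skip it
        match = classify_user_agent(parts[1])
        if match is not None:
            counts[match[0]] += 1
    return counts


# Placeholder log lines with simplified user-agent strings.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET /blog/post HTTP/1.1" 200 512 "-" "ExampleUA (compatible; GPTBot/1.0)"',
    '203.0.113.6 - - [01/Jan/2025:00:00:02 +0000] "GET /llms.txt HTTP/1.1" 200 128 "-" "ExampleUA (compatible; ClaudeBot/1.0)"',
    '203.0.113.7 - - [01/Jan/2025:00:00:03 +0000] "GET /blog/post HTTP/1.1" 200 512 "-" "ExampleUA (compatible; GPTBot/1.0)"',
    '203.0.113.8 - - [01/Jan/2025:00:00:04 +0000] "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0 (ordinary browser)"',
]

for token, hits in tally_ai_bots(SAMPLE_LOG).most_common():
    print(f"{token} ({AI_BOT_TOKENS[token]}): {hits}")
```

User-agent strings can be spoofed, so counts like these are a starting point rather than proof of which crawler actually visited.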