Free resource
AI bots directory
These are the 14 AI crawlers that visit ordinary business websites — training crawlers, AI search engines, and assistant fetchers. Each page explains what the bot does, shows its documented user-agent string, and gives you copy-paste robots.txt snippets to allow or block it.
Training crawlers
These bots collect web content to train future AI models. Blocking them keeps your content out of future training data, as long as the crawler honors robots.txt, and it costs you no traffic, because training crawlers never send visitors.
| Bot | Operator | robots.txt | Details |
|---|---|---|---|
| GPTBot | OpenAI | Documented | |
| ClaudeBot | Anthropic | Documented | |
| CCBot | Common Crawl | Implied | |
| Google-Extended | Google | Opt-out token | |
| Applebot-Extended | Apple | Opt-out token | |
| Bytespider | ByteDance | Disputed | |
| Meta-ExternalAgent | Meta | Implied | |
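
If you want to block every training crawler in the table at once, the rules can be generated and sanity-checked in a few lines. This is a minimal sketch, not an official snippet from any operator: `block_rules` is a hypothetical helper, `example.com` is a placeholder domain, and Python's standard-library `urllib.robotparser` only approximates how each crawler actually interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Training-crawler tokens taken from the table above.
TRAINING_BOTS = [
    "GPTBot",
    "ClaudeBot",
    "CCBot",
    "Google-Extended",
    "Applebot-Extended",
    "Bytespider",
    "Meta-ExternalAgent",
]


def block_rules(bots):
    """Return robots.txt text: one group that disallows every path for `bots`.

    Hypothetical helper for illustration only.
    """
    lines = [f"User-agent: {bot}" for bot in bots]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


robots_txt = block_rules(TRAINING_BOTS)
print(robots_txt)

# Parse the generated rules and confirm they behave as intended.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

for bot in TRAINING_BOTS:
    # Each listed training crawler should be refused.
    print(bot, "allowed" if parser.can_fetch(bot, "https://example.com/") else "blocked")
```

Paste the printed group into your existing robots.txt rather than replacing the file, so rules for other crawlers stay in place. Bots that are not named in the group, such as the AI search crawlers, are unaffected by it.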
AI search crawlers
These bots index your site for AI-powered search engines that cite and link back to you. Blocking them makes you invisible exactly where a growing number of people now ask their questions.
| Bot | Operator | robots.txt | Details |
|---|---|---|---|
| OAI-SearchBot | OpenAI | Documented | |
| Claude-SearchBot | Anthropic | Documented | |
| PerplexityBot | Perplexity | Documented | |
| Amazonbot | Amazon | Documented | |
Assistant fetchers
These bots fetch a page live when someone asks an AI assistant about it. Blocking them means assistants like ChatGPT and Claude cannot read or cite your pages on demand.
| Bot | Operator | robots.txt | Details |
|---|---|---|---|
| ChatGPT-User | OpenAI | May not apply | |
| Claude-User | Anthropic | Documented | |
| Perplexity-User | Perplexity | May not apply | |
Which of these can access your site?
Run your domain through the free checker and see what your robots.txt says about every bot on this page.
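
To get a rough idea of what such a check involves, you can test a robots.txt file against all 14 tokens yourself. The sketch below is an assumption-laden illustration, not how the checker itself works: `check_bots` and `SAMPLE` are made up for this example, `example.com` is a placeholder, and the standard-library parser may differ from how individual crawlers read the rules (the fetchers marked "May not apply" may not consult robots.txt at all).

```python
from urllib.robotparser import RobotFileParser

# The 14 tokens listed on this page, grouped by category.
BOTS = {
    "Training crawlers": [
        "GPTBot", "ClaudeBot", "CCBot", "Google-Extended",
        "Applebot-Extended", "Bytespider", "Meta-ExternalAgent",
    ],
    "AI search crawlers": [
        "OAI-SearchBot", "Claude-SearchBot", "PerplexityBot", "Amazonbot",
    ],
    "Assistant fetchers": [
        "ChatGPT-User", "Claude-User", "Perplexity-User",
    ],
}


def check_bots(robots_txt, url="https://example.com/"):
    """Map each bot token to True (allowed) or False (blocked) for `url`.

    Hypothetical helper; `robots_txt` is the file's contents as a string.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        bot: parser.can_fetch(bot, url)
        for bots in BOTS.values()
        for bot in bots
    }


# Illustrative robots.txt: block two training crawlers, allow everyone else.
SAMPLE = """\
User-agent: GPTBot
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

report = check_bots(SAMPLE)
for category, bots in BOTS.items():
    print(category)
    for bot in bots:
        print(f"  {bot:20} {'allowed' if report[bot] else 'blocked'}")
```

To run this against a live site, you would replace `SAMPLE` with the text of your own robots.txt file.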