Free resource

AI bots directory

These are the 14 AI crawlers that visit ordinary business websites: training crawlers, AI search crawlers, and assistant fetchers. Each bot's page explains what the bot does, shows its documented user-agent string, and gives you copy-paste robots.txt snippets to allow or block it.

Training crawlers

These bots collect web content to train future AI models. Blocking them keeps your content out of future training data, provided the bot honors robots.txt. It costs you no traffic, because training crawlers do not send visitors.

Bot	Operator	robots.txt status
GPTBot	OpenAI	Documented
ClaudeBot	Anthropic	Documented
CCBot	Common Crawl	Implied
Google-Extended	Google	Opt-out token
Applebot-Extended	Apple	Opt-out token
Bytespider	ByteDance	Disputed
Meta-ExternalAgent	Meta	Implied
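
As a rough illustration of blocking every bot in this table, the sketch below generates one robots.txt group per token and checks the result with Python's standard-library parser. The helper name build_block_rules is hypothetical, and the rules only affect bots that actually honor robots.txt, so entries marked Implied or Disputed may behave differently.

```python
from urllib.robotparser import RobotFileParser

# robots.txt tokens copied from the training-crawler table above.
TRAINING_BOTS = [
    "GPTBot",
    "ClaudeBot",
    "CCBot",
    "Google-Extended",
    "Applebot-Extended",
    "Bytespider",
    "Meta-ExternalAgent",
]


def build_block_rules(bots):
    """Return robots.txt text with one 'Disallow: /' group per bot.

    Hypothetical helper for illustration; not part of any library.
    """
    groups = [f"User-agent: {bot}\nDisallow: /\n" for bot in bots]
    # Joining with a newline leaves a blank line between groups.
    return "\n".join(groups)


robots_txt = build_block_rules(TRAINING_BOTS)

# Sanity-check the generated rules with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

if __name__ == "__main__":
    print(robots_txt)
    print("GPTBot allowed:", parser.can_fetch("GPTBot", "/"))
    print("OAI-SearchBot allowed:", parser.can_fetch("OAI-SearchBot", "/"))
```

Because no wildcard group is generated, bots that are not listed (search crawlers, assistant fetchers, ordinary browsers) stay allowed.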

AI search crawlers

These bots index your site for AI-powered search engines that cite and link back to you. Blocking them removes you from the AI search results where a growing number of people now ask their questions.

Bot	Operator	robots.txt status
OAI-SearchBot	OpenAI	Documented
Claude-SearchBot	Anthropic	Documented
PerplexityBot	Perplexity	Documented
Amazonbot	Amazon	Documented

Assistant fetchers

These bots fetch a page live when someone asks an AI assistant about it. Blocking them means assistants like ChatGPT and Claude cannot read or cite your pages on demand, although robots.txt rules may not apply to every user-initiated fetch (see the table below).

Bot	Operator	robots.txt status
ChatGPT-User	OpenAI	May not apply
Claude-User	Anthropic	Documented
Perplexity-User	Perplexity	May not apply
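
To see which of the three categories shows up in your own server logs, a simple token match on the User-Agent header is usually enough. The sketch below is a minimal example under stated assumptions: classify_user_agent is a hypothetical helper, the sample header values are made up rather than the bots' documented strings, and the two opt-out tokens are left out on the assumption that they are robots.txt controls rather than request user agents.

```python
# robots.txt tokens from the three tables, grouped by category.
# Google-Extended and Applebot-Extended are omitted on the assumption that,
# as opt-out tokens, they are not sent as request user agents.
BOT_CATEGORIES = {
    "training": ["GPTBot", "ClaudeBot", "CCBot", "Bytespider", "Meta-ExternalAgent"],
    "search": ["OAI-SearchBot", "Claude-SearchBot", "PerplexityBot", "Amazonbot"],
    "assistant": ["ChatGPT-User", "Claude-User", "Perplexity-User"],
}


def classify_user_agent(user_agent):
    """Return (category, token) for the first matching token, else None.

    Matching is a case-insensitive substring test on the header value.
    """
    ua = user_agent.lower()
    for category, tokens in BOT_CATEGORIES.items():
        for token in tokens:
            if token.lower() in ua:
                return category, token
    return None


if __name__ == "__main__":
    # Made-up header values for illustration only.
    samples = [
        "Mozilla/5.0 (compatible; GPTBot/0.0)",
        "example claude-user fetch",
        "Mozilla/5.0 (X11; Linux x86_64)",
    ]
    for sample in samples:
        print(sample, "->", classify_user_agent(sample))
```

A User-Agent header can be spoofed, so treat a match as a hint rather than proof of who sent the request.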

Which of these can access your site?

Run your domain through the free checker and see what your robots.txt says about every bot on this page.

Check your robots.txt
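
For readers who want to run a similar check locally, here is a minimal sketch using Python's standard urllib.robotparser. It is not the checker's implementation: check_bots is a hypothetical helper, SAMPLE is an invented robots.txt, and the standard-library parser may interpret edge cases differently from each bot's own parser.

```python
from urllib.robotparser import RobotFileParser

# All 14 robots.txt tokens listed on this page.
ALL_BOTS = [
    "GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "Applebot-Extended",
    "Bytespider", "Meta-ExternalAgent",
    "OAI-SearchBot", "Claude-SearchBot", "PerplexityBot", "Amazonbot",
    "ChatGPT-User", "Claude-User", "Perplexity-User",
]


def check_bots(robots_txt, bots=ALL_BOTS, path="/"):
    """Map each bot token to True if robots_txt lets it fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in bots}


# Invented example file: block GPTBot, allow everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for bot, allowed in check_bots(SAMPLE).items():
        print(f"{bot:20} {'allowed' if allowed else 'blocked'}")
```

To check a live site instead of a string, the same parser can load the file itself through its set_url() and read() methods.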