Question 1

What is Bytespider?

Accepted Answer

Bytespider is a web crawler operated by ByteDance, the parent company of TikTok. It gathers web content to train ByteDance's large language models. It is widely reported as one of the most aggressive AI crawlers by request volume.

Question 2

What is the Bytespider user-agent string?

Accepted Answer

The documented user-agent string is: Mozilla/5.0 (compatible; Bytespider; spider-feedback@bytedance.com)
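For log filtering or request handling, a check on the Bytespider token is usually more practical than comparing the whole string. The following is a minimal Python sketch, not an official detection method. The function name `is_bytespider` is hypothetical, and it assumes that a case-insensitive match on the token "Bytespider" is sufficient, since the rest of the header may differ between requests.

```python
import re
from typing import Optional

# Assumption: the "Bytespider" token is the stable part of the header;
# the surrounding text may vary, so only the token is matched.
BYTESPIDER_TOKEN = re.compile(r"bytespider", re.IGNORECASE)


def is_bytespider(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent header contains the Bytespider token."""
    if not user_agent:
        # Missing or empty header: nothing to match against.
        return False
    return BYTESPIDER_TOKEN.search(user_agent) is not None
```

A user-agent header is supplied by the client and can be spoofed, so a match shows only what the request claims to be.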

Question 3

Does Bytespider respect robots.txt?

Accepted Answer

Undocumented, and disputed in practice. ByteDance publishes no crawling policy, so there is no official robots.txt statement either way. Multiple independent site owners and security vendors report Bytespider continuing to crawl paths that are disallowed in robots.txt, so a robots.txt rule alone is widely considered insufficient.
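If a robots.txt rule alone is not enough, one option is to refuse the requests at the application or server layer. The sketch below shows this as a small Python WSGI middleware. It is an illustration under stated assumptions, not a recommended production setup: the name `block_bytespider` is hypothetical, it assumes your site runs as a WSGI application, and it assumes the crawler identifies itself with the Bytespider token in its user-agent header.

```python
def block_bytespider(app):
    """Wrap a WSGI app so requests whose User-Agent names Bytespider get a 403."""

    def middleware(environ, start_response):
        # WSGI exposes the User-Agent request header under this key.
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if "bytespider" in user_agent.lower():
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain; charset=utf-8"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Every other request passes through to the wrapped application.
        return app(environ, start_response)

    return middleware
```

The same rule can usually be written in a web server or CDN configuration instead. Either way, it only catches requests that announce themselves as Bytespider.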

Question 4

How do I block Bytespider in robots.txt?

Accepted Answer

Add these two lines to your robots.txt, each on its own line: "User-agent: Bytespider" followed by "Disallow: /". To explicitly allow it instead, keep the "User-agent: Bytespider" line and replace "Disallow: /" with "Allow: /".
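To check that the rules parse the way you intend, you can run them through a robots.txt parser before deploying. The sketch below uses Python's standard-library `urllib.robotparser`. The helper name `can_fetch` and the example.com URL are placeholders. It shows only how a compliant parser reads the rules, not how Bytespider itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The two-line block rule described above.
BLOCK_RULES = """\
User-agent: Bytespider
Disallow: /
"""

# The variant that explicitly allows the crawler.
ALLOW_RULES = """\
User-agent: Bytespider
Allow: /
"""


def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "https://example.com/any/page"  # placeholder URL
    print("block rules, Bytespider:", can_fetch(BLOCK_RULES, "Bytespider", url))
    print("block rules, Googlebot:", can_fetch(BLOCK_RULES, "Googlebot", url))
    print("allow rules, Bytespider:", can_fetch(ALLOW_RULES, "Bytespider", url))
```

Pass the short token "Bytespider" as the agent rather than the full user-agent header. The standard-library parser matches on the part of the agent string before the first slash, so a full "Mozilla/5.0 ..." header would not match the Bytespider group.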

Bytespider

What is Bytespider?

The Bytespider user-agent string

How do I block Bytespider in robots.txt?

Block Bytespider

Allow Bytespider

Does Bytespider respect robots.txt?

Should you block Bytespider?

Official documentation

What does your robots.txt say about Bytespider?