# Because it causes HTTP 406 errors everywhere.
User-agent: MJ12bot
Disallow: /

# Because it causes HTTP 406 errors everywhere.
User-agent: SemrushBot
Disallow: /

# I get ~one hit per second, and it's a constant false positive on my screen-scraping detection script.
# It claims to do only one hit per 3 seconds, so it seems to run multiple crawls in parallel.
# Slowing down the crawling; hopefully they will drop below my detection threshold.
User-agent: BLEXBot
Crawl-delay: 5

# Because it's being an idiot and ignores the BASE HREF tag.
User-agent: PetalBot
Disallow: /

# Has no use crawling our site, but causes screen-scraping warnings.
User-agent: turnitinbot
Disallow: /

# False positives on my screen-scraping detection script.
User-agent: SeekportBot
Crawl-delay: 5

# Another false positive. Claims to visit every 5 seconds, yet yields a rate of 1.2 hits/sec.
User-agent: DataForSeoBot
Crawl-delay: 5

# Has no docs; couldn't find out whether it honors Crawl-delay, so this may not work.
User-agent: search.marginalia.nu
Crawl-delay: 5

# Also too fast, at rates of 5 hits/sec.
User-agent: BaiduSpider
Crawl-delay: 5

# Nope. Just nope. Downloads everything and then lets others use it without restriction.
User-agent: CCBot
Disallow: /