So Baidu’s web crawler is broken. It makes the bad old days of the Bing bot look positively benign. It wasn’t pushing much load, but it produced lots of log spam, and the volume showed signs of increasing over time.
So, out comes the ban hammer.
Then I thought, why not report their broken bot to them? It should be as simple as an email or a web page. Sure enough, they have links to forms for reporting that their web crawler is going crazy.
Here is what’s funny about this. It’s a failed UX implementation. Not because I can’t read it natively. That is fixable with translate.
The issue is that in order to report a problem, you need to be registered. And you can’t register without a mobile number in China.
So if you are outside of China, or do not have a mobile number within China, you can’t tell Baidu that their search engine web crawler is broken.
So, for now, I’m blocking them at the firewall and with a deny clause. I’ll look back in a few weeks and see if the bots are still broken.
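For anyone wondering what a deny clause boils down to, here is a minimal sketch in Python. It is not the rule I actually deployed (that lives in the web server config and isn’t shown here). It assumes the crawler identifies itself with the "Baiduspider" token in its User-Agent header; the names `BLOCKED_UA_TOKENS` and `should_deny` are made up for illustration.

```python
from typing import Optional

# Hypothetical sketch, not the real rule from this post.
# Assumption: the crawler announces itself with "Baiduspider" in its User-Agent.
BLOCKED_UA_TOKENS = ("baiduspider",)


def should_deny(user_agent: Optional[str]) -> bool:
    """Return True when the User-Agent contains a blocked crawler token."""
    if not user_agent:
        # A missing header is not treated as a match in this sketch.
        return False
    ua = user_agent.lower()  # match case-insensitively
    return any(token in ua for token in BLOCKED_UA_TOKENS)
```

A User-Agent match only catches bots that identify themselves honestly, which is one reason to pair it with a firewall block.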
If someone from Baidu reads this, your web crawler is broken. Feel free to reach out to me at the day job and I’ll explain. I might even suggest that you make a form for people who are being hit by a runaway web crawler to report it, without the obvious UX failures that exist now.
Or, we can just ban the bots. Which doesn’t solve the problem, and I don’t like doing that.