Fixing Baidu's broken search bot
By joe
It seems that the bot was generating some effectively random broken URLs. Or maybe not so random. I saw endpoints in the logs that haven’t been in use for at least 7 years. I can’t imagine this was simply a harmless bug, as much as … maybe? … a search for moved/renamed endpoints. As the web server is now set up very differently than in the past, the missing endpoints merely generated log spam, and messed up our analysis. So I needed a way to fix their code, without … their code. So, using our front end server, I marked the specific IP range as being a bad_user.
# Flag requests coming from the misbehaving crawler's address range.
geo $bad_user {
    default         0;
    180.76.15.0/24  1;
}
Then I told our server to rewrite the request whenever the client was flagged as a bad_user:
if ($bad_user) {
    rewrite ^(.*)$ https://scalableinformatics.com;
}
This isn’t redirecting them into a bad place. It is redirecting them to our front page, whatever endpoint they ask for. I figure a day of this, and they’ll get the idea that mebbe something is borked in their code/db and clean it up. Annoying that I have to resort to this. Let’s see if it helps.
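One possible refinement, offered as a sketch rather than what is actually deployed: the pattern ^(.*)$ matches every URI, including / itself, so a flagged client asking for the front page is redirected too. Assuming the front end is nginx (the geo and rewrite syntax suggests it) and that the if block lives in the same server block that serves the front page, a narrower pattern would leave / alone. The trailing ? is my assumption about what is wanted here: it drops the bot's original query string instead of carrying it over to the front page.

```nginx
# Sketch only: assumes $bad_user is set by the geo block shown earlier and
# that this sits in the server block that also serves the front page.
if ($bad_user) {
    # ^/.+ matches any URI with at least one character after the leading
    # slash, so a request for "/" itself is not redirected.
    # The trailing "?" keeps nginx from appending the original query string.
    rewrite ^/.+ https://scalableinformatics.com/?;
}
```

Because the replacement starts with https://, nginx stops processing and answers with a temporary redirect, the same as the original rule does.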