Description
Edit (6/25): Vercel has since added a firewall rule specifically for this: https://vercel.com/docs/bot-management. Just turn it on when needed.
Over the past several months, we’ve received up to 300k requests on certain days. More traffic—more people enjoying OpenStreetMap, right? Unfortunately, not quite. 🙂
At first, I theorized that some bot (e.g. Googlebot) had started indexing various server-side-rendered (SSR) pages for OSM features—which would have been great. But I also noticed a lot of requests for old static assets (.png, .ico, etc.), which still triggered full SSR.
The application was originally built with a single catchall route. This meant that any request (except for existing static files) would be redirected to the Node.js process, which had to generate the full HTML page. This setup is great for delivering fast experiences to users visiting feature URLs like https://osmapp.org/node/1601837931 and for serving Googlebot meaningful content (rather than just a loading spinner and tons of JavaScript).
However, this catchall route created chaos in our server logs on the Vercel platform. All requests appeared as a single route, with no detail. I realized I had to refactor the code to separate unnecessary 404 pages and better distinguish how many requests were going to the homepage versus specific feature pages. (PR with details)
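To illustrate the kind of separation described above, here is a minimal sketch of classifying request paths before deciding whether to run full SSR. It is not the code from the PR: the category names, the regular expressions, and the `classifyPath` helper are all assumptions made for this example.

```typescript
// Hypothetical route categories; illustrative only, not OsmAPP's actual routes.
type RouteKind = "home" | "feature" | "static-asset" | "not-found";

// Assumed feature URL shape, based on URLs like /node/1601837931.
const FEATURE_PATH = /^\/(node|way|relation)\/\d+\/?$/;

// Assumed list of static file extensions that should never trigger SSR.
const STATIC_ASSET = /\.(png|ico|jpe?g|gif|svg|css|js|map|txt|xml)$/i;

function classifyPath(path: string): RouteKind {
  // Drop the query string and fragment before matching.
  const clean = path.split(/[?#]/)[0] ?? "";
  if (clean === "" || clean === "/") return "home";
  if (FEATURE_PATH.test(clean)) return "feature";
  if (STATIC_ASSET.test(clean)) return "static-asset";
  return "not-found";
}
```

With something like this in place, a request for a missing `.png` or `.ico` could get a cheap 404 instead of a rendered page, and each category would show up separately in the logs.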
Another Incident
This Sunday, another 542k requests hit our servers. This time, I stumbled upon the User-agent tab in the Observability section. Turns out the culprit was ClaudeBot from Anthropic. But another issue emerged: the requests weren’t targeting the main osmapp domain but rather a developer preview from an old pull request: osmapp-git-climbing-tiles-osm-app-team.vercel.app. Since that code predated the catchall route removal, we still don’t have much insight. But any bot crawling a developer preview is definitely wrong.
I suspect that in previous incidents, it might have crawled the main app as well.
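Without access to Vercel's User-agent tab, the same breakdown could be approximated from raw user-agent strings. The sketch below is only an illustration: the marker substrings and the `detectBot` and `tallyBots` helpers are assumptions, and real crawlers may identify themselves differently.

```typescript
// Assumed lowercase substrings used to recognise crawlers; not an exhaustive list.
const BOT_MARKERS = ["claudebot", "googlebot", "bingbot"];

// Returns the matching marker, or null when no known bot marker is present.
function detectBot(userAgent: string): string | null {
  const ua = userAgent.toLowerCase();
  return BOT_MARKERS.find((marker) => ua.includes(marker)) ?? null;
}

// Counts requests per detected bot; everything unrecognised is grouped as "other".
function tallyBots(userAgents: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const ua of userAgents) {
    const key = detectBot(ua) ?? "other";
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

Matching on substrings is crude, since any client can send any user-agent string, but it is usually enough to spot a single crawler dominating the traffic.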
Impact
Due to increased demand for OsmAPP, we had to switch to the paid Vercel plan at €20/month starting last November. Our Pro plan’s monthly limit is now exhausted, but fortunately, additional requests cost only €0.60 per million, which is acceptable.
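For a sense of scale, the overage price can be turned into a quick calculation. This sketch assumes every request in an incident is billed as overage at the €0.60 per million rate quoted above; `overageCostEur` is a hypothetical helper.

```typescript
// Overage price taken from the text above: EUR 0.60 per million requests.
const PRICE_PER_MILLION_EUR = 0.6;

// Cost of a number of requests, assuming all of them are billed as overage.
function overageCostEur(requests: number): number {
  return (requests / 1e6) * PRICE_PER_MILLION_EUR;
}

// The 542k-request incident would then cost roughly EUR 0.33.
const incidentCost = overageCostEur(542000);
```

So a single spike is cheap; the concern is repeated incidents and the missing visibility, not one day's bill.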
Solution
I found a way to block bots using this Vercel firewall rule: https://vercel.com/templates/other/block-bad-bots-firewall-rule. Let’s see if usage drops.
I’d also like to block all bots (including Googlebot) on all developer preview deployments. Oddly, no one seems to have addressed this issue. A statically generated robots.txt file for non-master branches might be a good start. (Found a working rule for that as well: screenshot.) I should also remove all outdated deployments that are no longer useful.
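The robots.txt idea could look roughly like the sketch below. It is an assumption-laden illustration, not what OsmAPP does: `robotsTxtFor` is a hypothetical helper, and it keys on the deployment environment (Vercel exposes this as the `VERCEL_ENV` system variable) rather than on the branch name mentioned above.

```typescript
// Values Vercel documents for its VERCEL_ENV system environment variable.
type VercelEnv = "production" | "preview" | "development";

// Builds the robots.txt body for a deployment.
// Only production allows crawling; previews, local development,
// and an unset environment all disallow everything.
function robotsTxtFor(env: VercelEnv | undefined): string {
  if (env === "production") {
    return "User-agent: *\nAllow: /\n";
  }
  return "User-agent: *\nDisallow: /\n";
}

// In a real deployment the argument would come from process.env.VERCEL_ENV
// (an assumption about the wiring; it is not shown here to keep the sketch self-contained).
```

Keep in mind that robots.txt is advisory: it only helps with crawlers that choose to respect it, so a firewall rule is still the stronger option.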
Another obstacle is Vercel’s limited log functionality. The extended version is a separately paid feature. It doesn’t feel justifiable to pay another €20 just to know what’s causing harm. I’m considering moving OsmAPP production to my own VPS.