Description
Summary
With #16795, a `robots.txt` was added which breaks the tracking of bots. Tracking bots to derive which sites search engine crawlers visit, and how often, is valuable information, so bots should by default be tracked like any other visitor. There is even a plugin available to track bots separately based on user agent, which has completely lost its purpose with the `robots.txt`. I did not fully understand the reason for #16795, but it seems to be based on a single user having issues with a Google Ads submission. For a single user, I'd say a custom solution makes more sense than breaking a generally useful feature for all users with a file that is installed and re-created on every Matomo update, and which therefore cannot be avoided without regular manual intervention.
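To illustrate how a `robots.txt` rule can keep a compliant crawler away from the tracker endpoint, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The rules and the host `analytics.example.org` are hypothetical and are not the actual contents of the file added in #16795.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the file shipped with Matomo.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical Matomo host; matomo.php is the tracker endpoint.
TRACKER_URL = "https://analytics.example.org/matomo.php"


def may_fetch(user_agent: str, url: str, robots_txt: str = HYPOTHETICAL_ROBOTS_TXT) -> bool:
    """Return True if a crawler honouring robots_txt may request url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "SomeOtherAgent"):
        print(agent, may_fetch(agent, TRACKER_URL))
```

A crawler that honours such a rule never sends the tracking request, so Matomo has nothing to record for it, regardless of any bot-tracking plugin.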
What does make sense is preventing the Matomo login page from being indexed, as it is likely never meant to be public. But that is already done via a meta tag, which is generally the better way to do it (crawlers do not have to read an additional file every time): https://github.com/matomo-org/matomo/blob/4.x-dev/plugins/Login/templates/loginLayout.twig#L3-L5
In the case of Google and Bing, crawlers are still not tracked after removing the `robots.txt`; the tracker PHP request fails. But I didn't investigate this further without first having confirmation that tracking bots is generally a wanted feature, which in turn means removing the `robots.txt` that breaks it.