Improve handling and detection of region codes and names #20420

Merged
11 commits merged on Jun 6, 2023

Member

@sgiehl sgiehl commented Mar 5, 2023

Summary:

This PR provides the following improvements around regions:

  • Updated list of all ISO regions known to Matomo
  • Increased accuracy of region detection with the DB-IP Lite database
  • Updating the list of ISO regions no longer causes revised regions to be displayed incorrectly in older datasets

Description:

The DB-IP Lite database, which Matomo currently uses by default, only provides region names in its detection results (no ISO codes).
As Matomo only stores region codes in the database, we try to match the returned region name against the region names that Matomo knows about. This fails if the region name returned by the GeoIP database doesn't match a known name, which seems to be the case quite often.

To improve region detection, this PR adjusts the list of known regions. Instead of only holding the latest set of ISO region codes mapped to their current names, the array now looks like this:

 <CountryCode> => [
     <RegionCode> => [
         'name' => <CurrentISOName>,
         'altNames' => [
             // list of previous names or names used by GeoIP providers like DB-IP
         ],
         'current' => <bool>, // whether the ISO code is currently in use
     ],
 ]
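
For illustration, a filled-in entry might look like the sketch below. The country code `XX`, the region codes, and all names are made-up placeholders rather than values from the actual region list, and the helper `currentRegionNames()` is hypothetical; it only shows how the `current` flag could be used.

```php
<?php
// Hypothetical data: 'XX' and its regions are placeholders, not real ISO entries.
$regions = [
    'XX' => [
        '01' => [
            'name'     => 'New Example Region',
            'altNames' => ['Old Example Region', 'Example Rgn'],
            'current'  => true,
        ],
        '02' => [
            'name'     => 'Retired Region',
            'altNames' => [],
            'current'  => false, // code no longer in use, kept for older datasets
        ],
    ],
];

/**
 * Returns code => name for the regions of a country that are still current.
 * Hypothetical helper, not part of Matomo.
 */
function currentRegionNames(array $regions, string $countryCode): array
{
    $result = [];
    foreach ($regions[$countryCode] ?? [] as $code => $info) {
        if ($info['current']) {
            $result[$code] = $info['name'];
        }
    }
    return $result;
}

print_r(currentRegionNames($regions, 'XX'));
```

Because retired codes stay in the array with `'current' => false`, a code stored in an older dataset can still be resolved to a name.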

Providing a list of alternate names will increase the accuracy of matching regions by name in the future.
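
A minimal sketch of such a name lookup, assuming the array structure shown above. The function `findRegionCodeByName()` and the sample data are hypothetical and are not taken from the PR's code; the actual matching in Matomo may normalize names differently.

```php
<?php
// Hypothetical sample data following the structure described above.
$regions = [
    'XX' => [
        '01' => [
            'name'     => 'New Example Region',
            'altNames' => ['Old Example Region', 'Example Rgn'],
            'current'  => true,
        ],
    ],
];

/**
 * Finds the region code for a name reported by a GeoIP provider.
 * Compares against the current name and all alternate names.
 * Note: strtolower() only folds ASCII letters, so non-ASCII names
 * must match exactly in this sketch.
 * Returns null when nothing matches.
 */
function findRegionCodeByName(array $regions, string $countryCode, string $regionName): ?string
{
    $needle = strtolower(trim($regionName));
    foreach ($regions[$countryCode] ?? [] as $code => $info) {
        $candidates = array_merge([$info['name']], $info['altNames']);
        foreach ($candidates as $candidate) {
            if (strtolower($candidate) === $needle) {
                // PHP turns integer-like array keys into ints, so cast back.
                return (string) $code;
            }
        }
    }
    return null;
}

var_dump(findRegionCodeByName($regions, 'XX', 'old example region'));
var_dump(findRegionCodeByName($regions, 'XX', 'Unknown Place'));
```

With only the current name available, the first lookup would fail; the `altNames` list is what lets an outdated or provider-specific name resolve to the stored code.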

To properly fill this new structure, the command that updates the region list has been adjusted. It now also includes an additional option to add names used in the DB-IP database that don't match known region names.

The region list included in this PR was built with these steps:

  • Initial creation of the region list using an older version of the iso-codes project (so older names and regions are included)
  • Update of the list with the latest revision of iso-codes (so the latest set is marked as current, while older names are kept as alternate names)
  • Enrichment of regions and alternate names with the latest version of the DB-IP databases
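
The update step in the list above could be sketched roughly as follows. This is an assumption about the general idea, not the actual update command: the function `updateRegionList()` and the input format for the latest names (`country => code => name`) are hypothetical.

```php
<?php
/**
 * Merges the latest ISO names into an existing region list.
 * - Codes missing from $latest are kept but flagged as not current.
 * - A changed name moves the previous name into 'altNames'.
 * Hypothetical sketch; $latest is assumed to be country => code => name.
 */
function updateRegionList(array $existing, array $latest): array
{
    // Flag existing codes according to whether the latest list still has them.
    foreach ($existing as $country => $regionList) {
        foreach ($regionList as $code => $info) {
            $existing[$country][$code]['current'] = isset($latest[$country][$code]);
        }
    }

    foreach ($latest as $country => $regionList) {
        foreach ($regionList as $code => $name) {
            $old = $existing[$country][$code]
                ?? ['name' => $name, 'altNames' => [], 'current' => true];

            $altNames = $old['altNames'];
            if ($old['name'] !== $name && !in_array($old['name'], $altNames, true)) {
                $altNames[] = $old['name']; // keep the previous name for matching
            }
            // The current name itself should not be listed as an alternate.
            $altNames = array_values(array_diff($altNames, [$name]));

            $existing[$country][$code] = [
                'name'     => $name,
                'altNames' => $altNames,
                'current'  => true,
            ];
        }
    }

    return $existing;
}

// Placeholder data: 'XX' and all names are made up.
$older = [
    'XX' => [
        '01' => ['name' => 'Old Name', 'altNames' => [], 'current' => true],
        '02' => ['name' => 'Gone Region', 'altNames' => [], 'current' => true],
    ],
];
$latest = ['XX' => ['01' => 'New Name', '03' => 'Added Region']];

print_r(updateRegionList($older, $latest));
```

Names coming from a GeoIP database that match no known name could then be appended to the `altNames` of the corresponding region in a separate pass.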

fixes #20527
refs #20368

Review

@github-actions
Contributor

If you don't want this PR to be closed automatically in 28 days then you need to assign the label 'Do not close'.

@github-actions github-actions bot added the Stale label Apr 12, 2023
@sgiehl sgiehl force-pushed the regionnames branch 3 times, most recently from 8a400c5 to a1e14ee Compare May 15, 2023 08:33
@sgiehl sgiehl requested a review from michalkleiner May 30, 2023 10:19
@sgiehl sgiehl marked this pull request as ready for review May 30, 2023 10:19
Contributor

@michalkleiner michalkleiner left a comment

I wasn't able to fully revert the change to isoRegionNames.php and let it regenerate, since its content and the code relating to it changed gradually.
Since this version of the file is part of the PR, I don't see that as an issue.

Back to you @sgiehl if you want to ask Ben to have a look or merge as is.

@sgiehl sgiehl merged commit 742cbe1 into 5.x-dev Jun 6, 2023
@sgiehl sgiehl deleted the regionnames branch June 6, 2023 11:06
@sgiehl sgiehl added the not-in-changelog label Jun 7, 2023
Labels
not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org. Stale The label used by the Close Stale Issues action
Development

Successfully merging this pull request may close these issues.

Regions might not be detected correctly when using free DBIP geolocation database
3 participants