Import fails when Referrer field is not present using Regex format #335

@kuzi-moto

Hello, I am running into an issue where the import fails when processing a log entry that has no referrer field. The server writes its logs as JSON and, when a request has no referrer, omits the referrer field from the output entirely. I have worked around this by writing my regex so that the field is optional.

However, it appears that the script still tries to process the field and then exits because the value is missing. Here are the command I ran and its output. Note that I'm running an image built from the python image, with the Matomo data and the server logs mounted inside the container.

sudo docker exec matomo-import-logs python /tmp/matomo/misc/log-analytics/import_logs.py --url=https://matomo.example.com --token-auth=<token> --add-sites-new-hosts --enable-bots --enable-http-errors --log-format-regex="{\"ClientHost\":\"(?P<ip>\d+\.\d+\.\d+.\d+)\",\"\w+\":\"(?P<userid>.+?)\",\"\w+\":(?P<length>\d+),\"\w+\":(?P<status>\d+),\"\w+\":\d+,\"\w+\":\"(?P<host>.+?)\",\"\w+\":\"(?P<method>\w+)\",\"\w+\":\"(?P<path>.+?)\",\"StartLocal\":\"(?P<date>\d+-\d+-\d+T\d+:\d+:\d+)\.\d+(?P<timezone>-\d+:\d+)\",\"\w+\":\"\w+\",\"\w+\":\"[^\"]*\",(?:\"request_Referer\":\"(?P<referrer>[^\"]+)?\",)?\"\w+-\w+\":\"(?P<user_agent>[^\"]+)\",\"\w+\":\"[^\"]+\"}"
 --log-date-format="%Y-%m-%dT%H:%M:%S" -dd /var/log/traefik/access.log
[sudo] password for user:
2022-05-15 04:01:43,814: [DEBUG] Accepted hostnames: all
2022-05-15 04:01:43,816: [DEBUG] Matomo Tracker API URL is: https://matomo.example.com
2022-05-15 04:01:43,816: [DEBUG] Matomo Analytics API URL is: https://matomo.example.com
2022-05-15 04:01:43,816: [DEBUG] Authentication token token_auth is: <token>
2022-05-15 04:01:43,816: [DEBUG] Resolver: dynamic
2022-05-15 04:01:43,817: [DEBUG] Launched recorder
Traceback (most recent call last):
  File "/tmp/matomo/misc/log-analytics/import_logs.py", line 2688, in <module>
    main()
  File "/tmp/matomo/misc/log-analytics/import_logs.py", line 2654, in main
    parser.parse(filename)
  File "/tmp/matomo/misc/log-analytics/import_logs.py", line 2487, in parse
    if hit.referrer.startswith('"'):
AttributeError: 'NoneType' object has no attribute 'startswith'
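
To illustrate where the `None` most likely comes from: in Python's `re` module, a named group that does not take part in a match is reported as `None`, not as an empty string. The snippet below is a minimal sketch using a shortened, hypothetical stand-in for the regex above (the `path` and `agent` keys and the sample lines are made up for illustration and are not the real Traefik output).

```python
import re

# Shortened, hypothetical stand-in for the optional referrer part of the
# regex in the command above; the key names here are illustrative only.
pattern = re.compile(
    r'"path":"(?P<path>[^"]+)",'
    r'(?:"request_Referer":"(?P<referrer>[^"]+)?",)?'
    r'"agent":"(?P<user_agent>[^"]+)"'
)

# Made-up sample lines: one with the referrer field, one without it.
line_with = '"path":"/a","request_Referer":"https://example.org/","agent":"curl"'
line_without = '"path":"/a","agent":"curl"'

referrer_present = pattern.search(line_with).group('referrer')
referrer_missing = pattern.search(line_without).group('referrer')

print(repr(referrer_present))
print(repr(referrer_missing))  # an unmatched optional group yields None
```

Both lines match the pattern, but the second one leaves the `referrer` group unset, so anything that later calls a string method on it fails in the same way as the traceback.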

It seems like this could be resolved easily enough by adding a check that skips the referrer when it's not present. I might be able to submit a pull request in the next few days if I figure out enough Python to write it myself, but maybe someone smarter will be able to do it before then. Thanks for any assistance!
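
The check suggested above might look something like the sketch below. This is not the actual code from `import_logs.py`: `normalize_referrer` is a hypothetical helper, and I am assuming that the `startswith('"')` test exists to strip surrounding quotes and that an empty string is an acceptable stand-in for a missing referrer.

```python
def normalize_referrer(referrer):
    """Return a referrer value that is always a string.

    Hypothetical helper, not part of import_logs.py. It assumes a missing
    referrer can be represented as an empty string.
    """
    if referrer is None:
        # The optional regex group did not match, so there is no referrer.
        return ''
    if referrer.startswith('"') and referrer.endswith('"') and len(referrer) >= 2:
        # Assumed purpose of the original check: drop surrounding quotes.
        return referrer[1:-1]
    return referrer


print(repr(normalize_referrer(None)))
print(repr(normalize_referrer('"https://example.org/"')))
print(repr(normalize_referrer('https://example.org/')))
```

Guarding on `None` before calling `startswith` is the smallest change that avoids the `AttributeError`; whether an empty string is the right default would depend on how the rest of the script uses the referrer.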
