Revise segment encoding strategy #17050

@diosmosis

Currently, segments are effectively triple URL-encoded:

  • each value in a condition is encoded once so that it can contain segment operator characters
  • each value is encoded a second time, because the server-side code urldecodes the segment condition (so the value is double encoded at this point)
  • the entire segment is then encoded in the URL, and Matomo uses this raw value rather than the urldecoded one in $_GET/$_POST (so the value is triple encoded at this point)
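To make the three layers concrete, here is a minimal sketch in Python rather than PHP, using `urllib.parse.quote` as a stand-in for PHP's `urlencode`. The helper names and the `pageTitle` example are hypothetical and only illustrate the layering described above; this is not Matomo's actual code.

```python
from urllib.parse import quote, unquote


def encode_condition_value(value: str) -> str:
    """Apply the first two layers to a single condition value."""
    once = quote(value, safe="")   # layer 1: protect operator characters such as ; and ,
    twice = quote(once, safe="")   # layer 2: survives the server-side urldecode of the condition
    return twice


def build_segment_param(name: str, operator: str, value: str) -> str:
    """Build the segment as it would appear in the URL (layer 3 applied)."""
    condition = f"{name}{operator}{encode_condition_value(value)}"
    return quote(condition, safe="")  # layer 3: the whole segment is encoded for the URL


def decode_three_times(param: str) -> str:
    """Undo all three layers, as the server has to do today."""
    return unquote(unquote(unquote(param)))


param = build_segment_param("pageTitle", "==", "a;b,c")
print(param)
print(decode_three_times(param))
```

Dropping or adding one `quote` call anywhere in this chain yields a different string for what is logically the same segment, which is the source of the fragility.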

All three urldecodes happen in PHP when processing a Segment. The system works, but it is overly complicated and, in some cases, can introduce subtle bugs. For example, the segment hash changes depending on how many levels of encoding were applied to the segment string, so two logically equivalent segments can produce different hashes, and an API request using one of them may return no data.
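The hash problem can be sketched as follows. This assumes, purely for illustration, that the archive lookup key is an MD5 digest of the segment string as received; `segment_hash` is a hypothetical helper, not Matomo's API.

```python
import hashlib
from urllib.parse import quote


def segment_hash(segment: str) -> str:
    # Assumption: the lookup key is a digest of the segment string as received.
    return hashlib.md5(segment.encode("utf-8")).hexdigest()


value = "a;b"

# The same logical condition, written with one and with two levels of encoding.
single_encoded = "pageTitle==" + quote(value, safe="")
double_encoded = "pageTitle==" + quote(quote(value, safe=""), safe="")

print(single_encoded, segment_hash(single_encoded))
print(double_encoded, segment_hash(double_encoded))

# The two strings describe the same condition but hash differently,
# so a lookup keyed on one hash will not find data stored under the other.
print(segment_hash(single_encoded) == segment_hash(double_encoded))
```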

We could fix this by using a different encoding scheme for segment condition values, and then urlencoding the segment only once, in the URL. For example, we could urlencode each value and then replace the % characters with $, as is done for datatable row action parameters. PHP could then use the segment value found in $_GET/$_POST without any further decoding issues.
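A minimal sketch of the proposed value encoding, again in Python with `urllib.parse.quote` standing in for PHP's `urlencode`. The function names are hypothetical. Because `quote(..., safe="")` also encodes a literal `$` as `%24`, every `$` in the encoded output is an escape marker, so the round trip is unambiguous.

```python
from urllib.parse import quote, unquote


def encode_value(value: str) -> str:
    """urlencode the value, then swap % for $ so URL decoding leaves it alone."""
    return quote(value, safe="").replace("%", "$")


def decode_value(encoded: str) -> str:
    """Reverse encode_value: restore the % markers, then urldecode once."""
    return unquote(encoded.replace("$", "%"))


condition = "pageTitle==" + encode_value("50% off $5; today")

# The URL layer is now the only real URL encoding, and the server's normal
# query-string parsing (what populates $_GET in PHP) undoes it.
in_url = quote(condition, safe="")
as_seen_by_server = unquote(in_url)

print(condition)
print(as_seen_by_server == condition)
print(decode_value(as_seen_by_server.split("==", 1)[1]))
```

With this scheme the string the server sees after standard query-string parsing is already the canonical segment, so there is only one form to hash.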

Backwards Compatibility

To do this without breaking BC, we'd have to:

  • still support old triple-urlencoded segment values in the URL (preferably by converting these segment parameters to the new encoding)
  • still support looking up archives for triple-urlencoded segments (where the hash will differ from the hash under the new encoding), so we don't have to re-archive all historical data
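Both requirements could be sketched roughly as below. Everything here is hypothetical: the helper names, the use of MD5 as the hash, and the assumption that a legacy value can simply be urldecoded twice. A real implementation would need a reliable way to tell legacy values from new ones, since a plain value that itself looks like a percent escape makes blind decoding ambiguous.

```python
import hashlib
from urllib.parse import quote, unquote


def encode_value(value: str) -> str:
    # Proposed scheme: urlencode, then replace % with $.
    return quote(value, safe="").replace("%", "$")


def legacy_value_to_new(legacy_value: str) -> str:
    """Convert a double-encoded legacy condition value to the new encoding."""
    plain = unquote(unquote(legacy_value))  # assumption: exactly two layers remain
    return encode_value(plain)


def candidate_hashes(name: str, operator: str, plain_value: str) -> list[str]:
    """Hashes to try when looking up archives: new encoding first, then legacy."""
    new_condition = f"{name}{operator}{encode_value(plain_value)}"
    legacy_value = quote(quote(plain_value, safe=""), safe="")
    legacy_condition = f"{name}{operator}{legacy_value}"
    return [
        hashlib.md5(new_condition.encode("utf-8")).hexdigest(),
        hashlib.md5(legacy_condition.encode("utf-8")).hexdigest(),
    ]


print(legacy_value_to_new("a%253Bb"))
print(candidate_hashes("pageTitle", "==", "a;b"))
```

Trying the new hash first and falling back to the legacy one would let existing archives stay readable without re-archiving, at the cost of a second lookup for older data.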
