Improve page conversion attribution performance with pre-calculated field #20375

@bx80

Description

Summary

Currently, the archiving query that calculates conversion attribution for pages (implemented in #2030, revised in #19974) includes an expensive sub-query that counts the pages viewed before each conversion.

To improve performance and remove the need for this sub-query, we can instead calculate the 'number of pages viewed before' value for each conversion at the time the conversion record is created and store it in the log_conversion table in a new unsigned smallint column (max value 65,535).
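A minimal sketch of the value being stored, written in Python for illustration only (Matomo itself is PHP). The function name `pages_viewed_before` is hypothetical, and two details are assumptions not stated in this issue: pageviews at or before the conversion time are counted, and the count is clamped so it always fits an unsigned smallint.

```python
SMALLINT_UNSIGNED_MAX = 65_535  # largest value an unsigned smallint can hold


def pages_viewed_before(pageview_times, conversion_time):
    """Count pageviews at or before the conversion, clamped to the column range.

    Assumption: "before" includes a pageview with the same timestamp as the
    conversion; the issue does not specify the boundary.
    """
    count = sum(1 for t in pageview_times if t <= conversion_time)
    return min(count, SMALLINT_UNSIGNED_MAX)
```

For example, `pages_viewed_before([1, 2, 3, 10], 5)` counts the three pageviews at times 1, 2 and 3.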

The query can then be adjusted to simply read the value from the log_conversion table instead of using a sub-query, which should reduce both query time and temporary table usage.
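The difference between the two query shapes can be sketched with an in-memory SQLite database. This is not the real archiving query: the schemas are heavily simplified, the column name `pageviews_before` is a hypothetical name for the new field, and the "at or before" boundary is an assumption.

```python
import sqlite3


def build_demo_db():
    """Create simplified stand-ins for the two log tables (not Matomo's real schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE log_link_visit_action (idvisit INTEGER, server_time INTEGER);
        CREATE TABLE log_conversion (
            idvisit INTEGER,
            server_time INTEGER,
            pageviews_before INTEGER  -- hypothetical name for the new column
        );
        INSERT INTO log_link_visit_action VALUES (1, 10), (1, 20), (1, 30), (2, 5);
        INSERT INTO log_conversion VALUES (1, 25, 2), (2, 50, 1);
    """)
    return conn


# Shape of the current approach: a correlated sub-query per conversion row.
SUBQUERY_SQL = """
    SELECT c.idvisit,
           (SELECT COUNT(*)
              FROM log_link_visit_action a
             WHERE a.idvisit = c.idvisit
               AND a.server_time <= c.server_time) AS pages_before
      FROM log_conversion c
     ORDER BY c.idvisit
"""

# Shape of the proposed approach: read the pre-calculated value directly.
COLUMN_SQL = "SELECT idvisit, pageviews_before FROM log_conversion ORDER BY idvisit"

conn = build_demo_db()
via_subquery = conn.execute(SUBQUERY_SQL).fetchall()
via_column = conn.execute(COLUMN_SQL).fetchall()
```

Both queries return the same rows here because the demo data stores the value the sub-query would have computed; the second simply avoids the per-row count.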

Retrospectively populating this new field for large existing datasets could be time-consuming and unnecessary, as archived data will already exist for historic time periods.

The migration that adds this field should calculate historic values only for the 'today' and 'yesterday' periods at the time of deployment, and only if there are fewer than 10,000 conversions in the 24hr period and the Matomo installation is not hosted on *.matomo.cloud.
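The two conditions above could be expressed as a single guard. This is a Python sketch for illustration; `should_backfill` and its parameters are hypothetical names, and how the migration would actually obtain the conversion count and hostname is not covered by this issue.

```python
from fnmatch import fnmatch

MAX_CONVERSIONS_PER_DAY = 10_000  # threshold taken from the issue text


def should_backfill(conversions_last_24h, hostname):
    """Return True only when the migration should calculate historic values."""
    # Skip installations hosted on *.matomo.cloud entirely.
    if fnmatch(hostname.lower(), "*.matomo.cloud"):
        return False
    # "Fewer than 10,000" is a strict comparison, so exactly 10,000 is skipped.
    return conversions_last_24h < MAX_CONVERSIONS_PER_DAY
```

Under these assumptions, `should_backfill(9_999, "analytics.example.org")` allows the backfill, while the same count on `demo.matomo.cloud` does not.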

To cover cases where historic archives are invalidated and goal page attribution prior to deployment needs to be recalculated, a new console command should be added to retrospectively calculate values, e.g.
./console core:calc-conversion-pages --dates=2023-03-01,2023-04-01
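As an illustration of how the `--dates` value in the example might be interpreted, here is a small Python sketch. It assumes the option carries exactly two ISO dates, a start and an end, separated by a comma; the issue does not say whether the end date is inclusive, and `parse_dates_option` is a hypothetical helper, not part of Matomo.

```python
from datetime import date


def parse_dates_option(value):
    """Split a 'YYYY-MM-DD,YYYY-MM-DD' string into (start, end) dates."""
    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError("expected two comma-separated dates")
    start = date.fromisoformat(parts[0].strip())
    end = date.fromisoformat(parts[1].strip())
    if start > end:
        raise ValueError("start date must not be after end date")
    return start, end
```

Rejecting a reversed range up front would keep the command from silently processing nothing.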

These changes need to be released as part of Matomo 5.0.0.

Refs: L3-313 and L3-402


Labels: Enhancement
