-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Open
Labels
EnhancementFor new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.
Milestone
Description
TiDB currently cannot push COUNT(DISTINCT x)
queries to TiFlash so on large datasets these queries can perform slowly and use a lot of memory. Reworking COUNT(DISTINCT x)
queries to use sub-queries instead solves this issue for TiDB, but sub-queries perform ~2.5x slower on MySQL, so a hybrid solution will be required.
In order to support TiDB as an alternative database we need to provide the option for certain queries to be generated using sub-queries instead of COUNT(DISTINCT x)
. This could be a beneficial optimization for other databases too so it would be good to implement in a generic manner.
Suggested approach
- Identify performance sensitive archiving queries that use
COUNT(DISTINCT x)
(33 occurrences in non-test code, mostly incore/DataAccess/LogAggregator
). - Add a new
core/Db/AdaptorInterface
propertypreferSubqueries
and implement it on the standard Matomo MySQL adapter to return false. A future TiDB PDO adapter can return true. - For each performance sensitive query using
COUNT(DISTINCT x)
, rework the query generation code to optionally replaceCOUNT(DISTINCT x)
with an appropriate sub-query if the PDO adapterpreferSubqueries
option is set.
peterbo
Metadata
Metadata
Assignees
Labels
EnhancementFor new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.