Skip to content

Performance regression in tracker for high traffic website #2944

@anonymous-matomo-user

Description

@anonymous-matomo-user

i have also a problem with very high load.

on my mysql server i see lots of querys like the following one, and each one is running over 1 minute ....
SELECT idvisitor,
visit_last_action_time,
visit_first_action_time,
idvisit,
visit_exit_idaction_url,
visit_exit_idaction_name,
visitor_returning,
visitor_days_since_first,
visitor_days_since_order,
location_country,
location_continent,
referer_name,
referer_keyword,
referer_type,
case when idvisitor = 'W?O????' then 1 else 0 end AS priority,
visitor_count_visits,
visit_goal_buyer

            , custom_var_k1, custom_var_v1,
            custom_var_k2, custom_var_v2,
            custom_var_k3, custom_var_v3,
            custom_var_k4, custom_var_v4,
            custom_var_k5, custom_var_v5
            FROM piwik_log_visit WHERE visit_last_action_time >= '2012-02-16 13:39:17'
                AND idsite = '3' AND (idvisitor = 'W?O????' OR config_id = '^?j?_6?') 
            ORDER BY priority DESC, visit_last_action_time DESC
            LIMIT 1

i have broken it down to the fact that according to an explain there are thousands of estimated rows:

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: piwik_log_visit
type: ref
possible_keys: index_idsite_config_datetime,index_idsite_datetime,index_idsite_idvisitor
key: index_idsite_datetime
key_len: 4
ref: const
rows: 145600
Extra: Using where; Using filesort

the complete piwik_log_visit table have 781971 entries in total

i played a little bit with the select statement and ended up with an interresting observation, after i told the select to ignore all indexes, the statement runs in 0,8 seconds "IGNORE INDEX (index_idsite_idvisitor,index_idsite_datetime,index_idsite_config_datetime)" which is a more acceptable time and got my server again up and running, but still is not optimal.

i don't understand why this query much better without indexes as with the given ones ...

I have found a changeset that might have changed the behavior to this: r5531

i think one of the problems is that with the OR in the WHERE part the indexes didn't match as good and the second is that for the ordering on priority the server will have examine each row for the case on idvisitors, which will take its time with that much estimatet rows ... :(

Metadata

Metadata

Assignees

No one assigned

    Labels

    BugFor errors / faults / flaws / inconsistencies etc.MajorIndicates the severity or impact or benefit of an issue is much higher than normal but not critical.

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions