Today archive is invalidated too often #15646

@tsteur

refs #15616

While #15616 fixes some invalidation issues when a report is viewed from the UI, this issue is about the CronArchiver itself, which invalidates reports here: https://github.com/matomo-org/matomo/blob/3.13.3/core/CronArchive.php#L986

This means we're invalidating all reports every time a site or segment is archived.

  • Meaning we potentially invalidate the same reports over and over again every few seconds or minutes (depending on how long it takes to finish one archive)
  • causing the tracker cache to be invalidated each time as well
  • causing the tracker cache to be recreated by multiple tracking requests that come in at the same time
  • on high-traffic sites, causing multiple tracking requests to invalidate the report: https://github.com/matomo-org/matomo/blob/3.x-dev/core/Archive/ArchiveInvalidator.php#L117-L119
  • As soon as one archive finishes in the CronArchiver, the same thing happens again when the CronArchiver starts archiving the next period, segment, or site. It's even worse if multiple archivers run at the same time and do these things in parallel.
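
To get a feel for how quickly this adds up, here is a toy model in Python (not Matomo code). It only counts invalidation passes in a single archiver run, assuming one pass after every finished archive job. The function name, the site IDs, the periods and the segment are all made up for illustration.

```python
from itertools import product


def count_invalidation_passes(sites, periods, segments, per_job=True):
    """Count invalidation passes in one archiver run (toy model, hypothetical)."""
    # One archive job per (site, period, segment) combination.
    jobs = list(product(sites, periods, segments))
    # per_job=True models the behaviour described above: a full
    # invalidation pass after every finished job.
    # per_job=False models a single pass for the whole run.
    return len(jobs) if per_job else 1


sites = [1, 2, 3]                      # hypothetical site IDs
periods = ["day", "week", "month", "year"]
segments = ["", "browserCode==FF"]     # "" stands for "no segment"

per_job_passes = count_invalidation_passes(sites, periods, segments)
single_pass = count_invalidation_passes(sites, periods, segments, per_job=False)
print(per_job_passes, single_pass)
```

Even with three sites and one segment, the model ends up with 24 passes instead of 1, and each of those passes would also clear the tracker cache. With parallel archivers the number multiplies again.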

We've probably always had this issue for Matomo installations that were tracking requests for previous days. But it became more visible with recent optimisations in CronArchiver, such as not launching the archive request for sites that had no tracking request since the last archive. This resulted in the need to mark today's archives as done immediately, and to invalidate them as soon as a tracking request comes in. Before this, tracking requests for today did not cause any invalidation, but now they do. A few other archiving improvements also appear to have contributed to these side effects.

Not sure how we can improve things. We could call $this->invalidateArchivedReportsForSitesThatNeedToBeArchivedAgain() at most once every time_before_PERIOD_archive_considered_outdated seconds, but this might not be a proper fix.
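
The throttling idea could look roughly like the sketch below. It is written in Python rather than PHP and is not the actual CronArchive code: the ThrottledInvalidator class, the injected clock and the 150-second interval are assumptions for illustration only. The callback stands in for invalidateArchivedReportsForSitesThatNeedToBeArchivedAgain().

```python
import time


class ThrottledInvalidator:
    """Runs an invalidation callback at most once per interval (hypothetical helper)."""

    def __init__(self, invalidate, min_interval_seconds, clock=time.monotonic):
        self._invalidate = invalidate
        self._min_interval = min_interval_seconds
        self._clock = clock
        self._last_run = None  # time of the last invalidation that actually ran

    def maybe_invalidate(self):
        """Invalidate if the interval has elapsed; return True if it ran."""
        now = self._clock()
        if self._last_run is not None and now - self._last_run < self._min_interval:
            return False  # too soon, skip this invalidation
        self._invalidate()
        self._last_run = now
        return True


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
calls = []  # records the times at which the invalidation really ran
throttle = ThrottledInvalidator(
    invalidate=lambda: calls.append(fake_now[0]),
    min_interval_seconds=150,  # illustrative stand-in for the ini setting
    clock=lambda: fake_now[0],
)

results = []
for t in (0, 10, 149, 150, 200, 300):
    fake_now[0] = float(t)
    results.append(throttle.maybe_invalidate())
print(results, calls)
```

Note that this keeps its state in memory, so it only throttles within a single archiver process. With multiple archivers running in parallel, the last-run timestamp would have to live somewhere shared, which is part of why this might not be a proper fix.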

Ideally, we'd simply no longer invalidate reports for today. This would make archiving slower when someone has thousands of sites, many of which have no traffic. That's quite an edge case though, and may not justify these problems.

Labels: c: Performance (for when we could improve the performance / speed of Matomo)
