Skip to content

Prevent duplicate dates for the purge old archive data command #20699

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 12, 2023

Conversation

bx80
Copy link
Contributor

@bx80 bx80 commented May 11, 2023

Description:

When using the ./console core:purge-old-archive-data all command the list of archive dates to purge is generated by iterating through the existing archive tables, however since there can be both blob and numeric archive tables this results in each date being added twice to the list. Then every archive purge and table optimization is unnecessarily performed twice.

This PR simply checks the dates array as it is being generated and prevents duplicates date being added.

Although the getTablesArchivesInstalled() method used to retrieve the list of table could be called with a single table type (e.g. just return blob tables) this wouldn't cover scenarios where archive tables only existed of one type for a date.

Fixes #20692

Review

@bx80 bx80 added the 5.0.0 label May 11, 2023
@bx80 bx80 added this to the 5.0.0 milestone May 11, 2023
@bx80 bx80 self-assigned this May 11, 2023
@bx80 bx80 added Bug For errors / faults / flaws / inconsistencies etc. not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org. Needs Review PRs that need a code review labels May 11, 2023
@michalkleiner
Copy link
Contributor

michalkleiner commented May 11, 2023

Thanks @bx80, that's a nice little tweak that can potentially make a big difference!

What do you think about an alternative approach of making the list unique just before it's returned? That way it would also cover the other branch of the conditional logic there.

- return $dates;
+ return array_unique($dates);

@sgiehl
Copy link
Member

sgiehl commented May 11, 2023

What @michalkleiner suggested makes sense. And to avoid iterating over numeric and blob tables you could also change it to use
ArchiveTableCreator::getTablesArchivesInstalled(ArchiveTableCreator::NUMERIC_TABLE), that way it would only iterate over numeric tables, which should be enough.

@sgiehl sgiehl removed the Needs Review PRs that need a code review label May 11, 2023
@bx80 bx80 added the Needs Review PRs that need a code review label May 11, 2023
@bx80 bx80 requested a review from michalkleiner May 11, 2023 22:20
@sgiehl sgiehl merged commit bb97c10 into 5.x-dev May 12, 2023
@sgiehl sgiehl deleted the m20692-purge-archive-duplicate-dates branch May 12, 2023 08:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug For errors / faults / flaws / inconsistencies etc. Needs Review PRs that need a code review not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org.
Development

Successfully merging this pull request may close these issues.

core:purge-old-archive-data optimizes all tables twice
3 participants