Member

@uenoku uenoku commented Jun 19, 2025

Updates the cache cleanup system so that we can remove small caches (likely from failed builds) and keep only the most recent successful builds. Previously, the cache was saved even when the step failed (hendrikmuhs/ccache-action#149), so the cache hit rate dropped dramatically after a compilation failure at an early stage (caused by a trivial error or manual cancellation).

This commit adds a new cleanupCache step (renamed from the garbageCollection step, since I thought cleanup was a better name).
The implementation uses a dynamic cleanup threshold of 1/2 of the maximum cache size. It finds caches by regex, matching keys against .*PATTERN.* for flexibility. It keeps the top 1 cache to preserve the most recent successful build, and it is safe by default: dry-run mode is enabled to prevent accidental deletions.

The changes add a reusable cleanup workflow at .github/workflows/cleanupCache.yml and integrate cache cleanup into the unified build workflows. Cache cleanup has also been added to the short integration tests, with cache key pattern generation and reuse implemented.

The cache cleanup logic calculates the threshold as 50% of the largest cache size, then deletes all caches smaller than the threshold (failed builds) and deletes caches beyond the top 1 that exceed the threshold (old builds). It uses regex matching for flexible cache key patterns.
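The selection rule described above can be sketched in plain POSIX shell. This is only an illustration under stated assumptions, not the workflow's actual script: the input format (`<id> <sizeInBytes> <createdAt>` per line), the sample data, and the `select_deletions` helper are hypothetical, "top 1" is assumed to mean the most recently created cache, and a cache exactly at the threshold is assumed to count as a large one.

```shell
#!/bin/sh
# Hypothetical cache listing: "<id> <sizeInBytes> <createdAt>", one per line.
CACHES='101 900 2025-06-18T10:00:00Z
102 120 2025-06-18T12:00:00Z
103 1000 2025-06-19T09:00:00Z
104 40 2025-06-19T11:00:00Z
105 950 2025-06-17T08:00:00Z'

# select_deletions (hypothetical helper): print the IDs of caches to delete.
select_deletions() {
  caches=$1

  # Threshold is half of the largest cache size (0 if there are no caches).
  max_size=$(printf '%s\n' "$caches" |
    awk 'BEGIN { m = 0 } NF >= 3 && $2 + 0 > m { m = $2 + 0 } END { print m }')
  threshold=$((max_size / 2))

  # 1. Caches below the threshold are assumed to come from failed builds.
  printf '%s\n' "$caches" |
    awk -v t="$threshold" 'NF >= 3 && $2 + 0 < t + 0 { print $1 }'

  # 2. Among the remaining caches, keep the newest one and list the rest.
  #    ISO 8601 timestamps sort correctly as plain strings.
  printf '%s\n' "$caches" |
    awk -v t="$threshold" 'NF >= 3 && $2 + 0 >= t + 0' |
    sort -k3,3r |
    awk 'NR > 1 { print $1 }'
}

TO_DELETE=$(select_deletions "$CACHES")
printf '%s\n' "$TO_DELETE"
```

With the sample data the threshold is 500, so caches 102 and 104 are listed as likely failed builds, and 101 and 105 are listed as old builds; only 103, the newest large cache, is kept.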

The updated workflows are unifiedBuildTestAndInstall.yml, unifiedBuildTestAndInstallStatic.yml, shortIntegrationTests.yml, and windowsCI.

@uenoku uenoku requested a review from teqdruid as a code owner June 19, 2025 04:53
@uenoku uenoku requested a review from seldridge June 19, 2025 04:57
Member

@seldridge seldridge left a comment

I'm good with trying this out.

A possibly simpler alternative would be to gate the cleanup on if: failure() and then delete exactly the cache that was created for this build. This requires passing information between jobs which may be tricky.


# Calculate the size threshold as 1/2 of the maximum cache size
MAX_SIZE=$(echo "$MATCHING_CACHES" | jq 'map(.sizeInBytes) | max // 0')
SIZE_THRESHOLD=$((MAX_SIZE / 2))
Member

I'm surprised this just works? I thought you would need to use dc or bc to do math in a shell script?

Member Author

I think POSIX shell supports $(( )) arithmetic expansion for evaluating a single arithmetic expression (section 2.6.4, https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html).
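As a small illustration of that point, the sketch below uses arithmetic expansion the same way the threshold calculation does. The `MAX_SIZE` value is made up for the example; the thing to note is that the expansion is integer-only, so the division truncates, and a tool such as bc would only be needed for fractional results.

```shell
#!/bin/sh
# Made-up value for illustration; an odd number shows the truncation.
MAX_SIZE=1000001

# Arithmetic expansion needs no external tool such as dc or bc.
SIZE_THRESHOLD=$((MAX_SIZE / 2))

echo "$SIZE_THRESHOLD"   # prints 500000 (integer division truncates)
```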

else
echo "Failed to delete cache ID: $id" >&2
fi
sleep 2
Member

What's the sleep for?

Member Author

There is a rate limit on the GitHub API, so this just makes sure the script doesn't issue an overwhelming number of API calls. In practice it probably wouldn't be a problem, though.
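For context, a throttled deletion loop along these lines might look like the sketch below. It is an assumption-laden illustration rather than the script in this PR: `delete_cache` is a hypothetical stub standing in for the real API call, and the `DRY_RUN` and `DELAY_SECONDS` variable names and the `delete_all` helper are invented here. Only the failure message and the 2-second pause come from the snippet under review.

```shell
#!/bin/sh
# Dry-run is the default, so nothing is deleted unless explicitly requested.
DRY_RUN=${DRY_RUN:-true}
# Pause between deletions, in seconds, to stay well under API rate limits.
DELAY_SECONDS=${DELAY_SECONDS:-2}

# Hypothetical stub standing in for the real deletion request.
delete_cache() {
  :
}

# delete_all (hypothetical helper): delete each given cache ID in turn.
delete_all() {
  for id in "$@"; do
    if [ "$DRY_RUN" = "true" ]; then
      echo "[dry-run] would delete cache ID: $id"
      continue
    fi
    if delete_cache "$id"; then
      echo "Deleted cache ID: $id"
    else
      echo "Failed to delete cache ID: $id" >&2
    fi
    # Throttle only after a real call; dry-run mode makes no requests.
    sleep "$DELAY_SECONDS"
  done
}

delete_all 102 104
```

Skipping the pause in dry-run mode keeps a trial run fast, since no requests are made in that case.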

uenoku added 3 commits June 29, 2025 15:13
Implement automated cache cleanup system that removes small caches (likely
from failed builds) and maintains only the most recent successful builds.

Don't delete cache in main
@uenoku uenoku force-pushed the dev/hidetou/cache-imporve branch from cf5ed69 to 5cfdbb9 Compare June 29, 2025 22:23
@uenoku uenoku force-pushed the dev/hidetou/cache-imporve branch from 8f81584 to 0d9aa45 Compare June 29, 2025 23:27
@uenoku uenoku force-pushed the dev/hidetou/cache-imporve branch from ab9ff3d to 6866aa0 Compare June 30, 2025 08:57
Drop nightly integration test change because docker
image doesn't have jq
@uenoku uenoku merged commit 3d8a532 into main Jun 30, 2025
7 checks passed
@uenoku uenoku deleted the dev/hidetou/cache-imporve branch June 30, 2025 09:46
TaoBi22 pushed a commit to TaoBi22/circt that referenced this pull request Jul 17, 2025