Closed
Description
During fast_match, Drain always iterates over all candidate clusters and updates their access time in the cache. This leads to two problems:
- The access-time updates slow down matching
- Clusters that will never match again are never evicted from the cache, because every lookup refreshes their access time
Expected behavior:
A cluster should only be updated (touched) in the cache after it has actually been chosen as the match. There is already a comment to this effect in the source code:
Try to retrieve cluster from cache with bypassing eviction algorithm as we are only testing candidates for a match.
https://github.com/IBM/Drain3/blob/15470e391caed9a9ef5038cdd1dbd373bd2386a8/drain3/drain.py#L217
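To illustrate the expected behavior, here is a minimal, self-contained sketch of an LRU cache that separates a non-touching lookup from a touching one. It is not Drain3's actual implementation: the names PeekableLRUCache, peek, touch, put and fast_match_sketch are hypothetical, and the score callback stands in for whatever similarity function is used to rank candidates.

```python
from collections import OrderedDict


class PeekableLRUCache:
    """Hypothetical LRU cache: least recently used entries sit at the front."""

    def __init__(self, maxsize):
        if maxsize < 1:
            raise ValueError("maxsize must be at least 1")
        self.maxsize = maxsize
        self._data = OrderedDict()

    def peek(self, key):
        # Read without refreshing recency: used while testing candidates.
        return self._data.get(key)

    def touch(self, key):
        # Read and mark as most recently used: used for the chosen cluster.
        if key in self._data:
            self._data.move_to_end(key)
        return self._data.get(key)

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        # Evict least recently used entries until the cache fits again.
        while len(self._data) > self.maxsize:
            self._data.popitem(last=False)


def fast_match_sketch(cache, candidate_ids, score):
    """Pick the best-scoring candidate and touch only that one (assumed logic)."""
    best_id = None
    best_score = float("-inf")
    for cluster_id in candidate_ids:
        cluster = cache.peek(cluster_id)  # no access-time update here
        if cluster is None:
            continue  # already evicted
        current = score(cluster)
        if current > best_score:
            best_id, best_score = cluster_id, current
    if best_id is None:
        return None
    return cache.touch(best_id)  # only the winner is refreshed


cache = PeekableLRUCache(maxsize=2)
cache.put(1, "a")
cache.put(2, "b")
# Both clusters are inspected, but only cluster 1 is chosen and touched.
chosen = fast_match_sketch(cache, [1, 2], lambda c: 1.0 if c == "a" else 0.0)
cache.put(3, "c")  # evicts cluster 2, which was inspected but never chosen
```

With this split, a cluster that is repeatedly tested but never selected ages out of the cache normally, and the candidate loop does no recency bookkeeping.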