genpolicy: prevent corruption of the layer cache file #11426
Conversation
cc @danmihai1
@charludo, thanks for pointing out these issues!

> To be honest, I am not entirely certain how exactly the corruption of the file occurs, since only one concurrent write should be possible in the current implementation. Maybe someone here has deeper insight into this.
My understanding is that our `OpenOptions::new().write(true).open` + `writer.flush` were not truncating the output file when the JSON output was smaller than a file previously grown by a concurrently running genpolicy instance.
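For illustration only (a sketch, not genpolicy's actual code): opening with `write(true)` but without `truncate(true)` keeps the file at its previous length, so a shorter payload leaves stale bytes after the new JSON:

```rust
// Sketch of the failure mode described above (hypothetical write_cache helper).
use std::fs::OpenOptions;
use std::io::{BufWriter, Write};

fn write_cache(json: &str) -> std::io::Result<()> {
    let file = OpenOptions::new()
        .write(true)
        // .truncate(true) // missing: without this (or an explicit file.set_len(0)),
        //                 // the tail of a previously larger file survives the write
        .open("layers-cache.json")?;
    let mut writer = BufWriter::new(file);
    writer.write_all(json.as_bytes())?;
    writer.flush() // flush pushes buffered bytes out, but never shrinks the file
}
```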
Given that:
- Synchronizing multiple readers and writers is tricky, as we discussed above, and depends on the behavior of buffering at various levels, etc.
- Reading and writing this file over and over again is relatively expensive.
- Executing multiple copies of genpolicy concurrently is mostly a test use case.
I think we should simplify the design (see the sketch below the list):
- `try_lock_shared()` at the beginning of genpolicy execution. If locking fails: disable layer caching for the current genpolicy instance and skip all the steps below.
- Read the file a single time into memory. If reading fails, start with an empty set of cached layers.
- `unlock()`.
- Add any new layers to the memory of the current genpolicy instance - but don't write these changes into the file yet.
- Before successful exit, if there were any changes to the in-memory cached layers:
  a. `try_lock_exclusive()`. If locking fails: discard any layer changes from the current instance's memory and skip the steps below.
  b. Truncate the file to size 0.
  c. Flush the cached contents from the current instance's memory into the file.
  d. `unlock()`.
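A rough sketch of that flow with `fs2` (illustrative only; the cache type, the use of `serde_json`, and the error handling are simplifying assumptions, not genpolicy's actual code):

```rust
use fs2::FileExt;
use std::collections::BTreeMap;
use std::fs::{File, OpenOptions};
use std::io::{Read, Seek, SeekFrom, Write};

// Hypothetical in-memory representation; genpolicy's real structures differ.
type LayerCache = BTreeMap<String, String>;

fn load_cache(path: &str) -> Option<(File, LayerCache)> {
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(path)
        .ok()?;
    // Shared lock at startup; if it fails, disable caching for this run.
    if file.try_lock_shared().is_err() {
        return None;
    }
    // Read the file a single time; fall back to an empty cache on error.
    let mut contents = String::new();
    let _ = file.read_to_string(&mut contents);
    let cache: LayerCache = serde_json::from_str(&contents).unwrap_or_default();
    let _ = file.unlock();
    Some((file, cache))
}

fn persist_cache(file: &mut File, cache: &LayerCache) {
    // Exclusive lock before exit; if it fails, discard the in-memory changes.
    if file.try_lock_exclusive().is_err() {
        return;
    }
    // Truncate to size 0, write the in-memory cache, unlock.
    let _ = file.set_len(0);
    let _ = file.seek(SeekFrom::Start(0));
    if let Ok(json) = serde_json::to_string(cache) {
        let _ = file.write_all(json.as_bytes());
    }
    let _ = file.unlock();
}
```

Using the `try_lock_*` variants rather than the blocking ones means a concurrent genpolicy simply runs without the cache instead of waiting on (or corrupting) the file.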
Thoughts?
I would rather avoid the complexity of `NamedTempFile` - hopefully the simplified algorithm I mentioned above is good enough.
You are more than welcome to make these changes if you want to. Otherwise, either I or one of my colleagues will make the changes.
That does make a lot of sense! Edit: took a shot at implementing the suggestions.
Force-pushed from c2a3fa0 to 432ff1d
This looks great to me - thanks @charludo! I will start the CI tests.
@charludo when you get a chance, please try to address the test failures marked "Required" in the list below. Let me know in case you need help with these failures.
Force-pushed from 432ff1d to fb41824
The locking mechanism around the layers cache file was insufficient to prevent corruption of the file. This commit moves the layer cache's management in-memory, only reading the cache file once at the beginning of `genpolicy`, and only writing to it once, at the end of `genpolicy`. In case obtaining a lock on the cache file fails, reading/writing is skipped and the cache is not used/persisted.

Signed-off-by: charludo <git@charlotteharludo.com>
Force-pushed from 6af26b1 to 4e57cc0
lgtm, thanks!
We have been experiencing sporadic corruption of the layer cache file. Applying `genpolicy --use-cached-files` to this `deployment.yaml`, with this initial state of the layer cache `layers-cache.json`, reproduces the error by generating this corrupted `layers-cache.json` in version `3.16.0` of `kata-containers`.
Following 349ce8c, this example no longer exhibits the behavior. In the linked commit, the user and group parsing was refactored and improved.
However, for very long strings (in otherwise valid cache files), the same behavior can still be reproduced on the tip of `main`.
An (ugly) script to generate such a file:
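The gist is to emit an otherwise valid cache file whose entries contain very long strings; a hypothetical Rust stand-in (the entry shape and field names below are invented, not genpolicy's real schema) could look like this:

```rust
// Hypothetical generator, not the original script: writes a syntactically valid
// layers-cache.json whose entries carry very long strings, so that parsing it
// takes long enough to widen the race window.
use std::fs::File;
use std::io::{BufWriter, Write};

fn main() -> std::io::Result<()> {
    let long_string = "A".repeat(1_000_000);
    let mut w = BufWriter::new(File::create("layers-cache.json")?);
    write!(w, "[")?;
    for i in 0..50 {
        if i > 0 {
            write!(w, ",")?;
        }
        // invented field names, for illustration only
        write!(w, r#"{{"diff_id":"sha256:{i:064}","passwd":"{long_string}"}}"#)?;
    }
    write!(w, "]")?;
    w.flush()
}
```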
The changes in this PR fix the issue. The root cause is that `fs2` only obtains a file lock during write. Since it is an exclusive lock, it should prevent concurrent writes; however, at other points in the code, no lock is acquired to read (or delete) the file. Since `fs2` implements advisory locks (through `flock` on UNIX systems), different threads/processes are still able to read/delete the file while the lock is supposedly held elsewhere. Parsing a large number of users and groups seems to take enough time to reproduce this error reliably.
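As a minimal illustration of that failure mode (not genpolicy's actual code): the writer below takes an exclusive advisory lock, but the reader never locks, so `flock` does not protect it and it can observe a half-written or stale file:

```rust
// Advisory locks only constrain processes that also call flock()/lock_*().
use fs2::FileExt;
use std::fs::{File, OpenOptions};
use std::io::{Read, Write};

fn write_cache(json: &str) -> std::io::Result<()> {
    let mut file = OpenOptions::new().write(true).open("layers-cache.json")?;
    file.lock_exclusive()?; // blocks other lockers, and nobody else
    file.write_all(json.as_bytes())?;
    file.unlock()
}

fn read_cache() -> std::io::Result<String> {
    // No lock_shared() here, so the writer's lock does not stop this read.
    let mut contents = String::new();
    File::open("layers-cache.json")?.read_to_string(&mut contents)?;
    Ok(contents)
}
```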
To be honest, I am not entirely certain how exactly the corruption of the file occurs, since only one concurrent write should be possible in the current implementation. Maybe someone here has deeper insight into this.