Inode cache for file hashes #577
Interesting! I unfortunately don't have much time to think about this right now, though.
I think I missed something here: why would the inode change when the file is updated? We have investigated reusing hashed contents before, that time using memcached. Here is some old code, with "stat": 3.7-maint...afbjorklund:memcached-hashed It was a great win, but had the same downfall as the other sloppy options for a silver bullet.
I will try to explain it this way. When calling hash_source_code_file() to hash "includes/foo.h", the result depends on the content of the specified file. It also depends on the provided seed and on configuration options. Caching the result based on (filename, mtime) would be a terrible idea, since "includes/foo.h" is relative and could resolve to different files depending on the current directory. It could also resolve to different files if files have been moved around or the directory structure has changed. Since (device, inode) makes up the identity of a file, and mtime is always updated when the content changes, (device, inode, mtime) must resolve to the same content from one point in time to another; otherwise the system violates assumptions made by other build tools such as "make". The saved hash value is made independent of the provided seed by always hashing files using a fresh seed, saving the hash digest from that, and then hashing the saved digest with the provided seed instead, using the hash-combine trick to make saved hash values context independent. Configuration settings that change the hash result (currently sloppy_time_macros) must be added to the key, though, which means (device, inode, mtime, sloppy_time_macros) will be used as the combined key that identifies a saved entry in the cache.
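The scheme above can be sketched roughly like this. This is a hypothetical, simplified C++ sketch, not ccache's actual code: an in-process std::map stands in for the shared cache file, std::hash stands in for the real content hash, the sloppy_time_macros key component is omitted for brevity, and the combine step uses the well-known Boost-style hash_combine constant.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <tuple>

// (device, inode, mtime) identifies one version of a file's content.
using Key = std::tuple<uint64_t, uint64_t, int64_t>;

// Stand-in for the shared, memory-mapped cache file.
static std::map<Key, uint64_t> inode_cache;

// Placeholder for ccache's real content hash.
uint64_t hash_content(const std::string& content) {
  return std::hash<std::string>{}(content);
}

// Hash a file for a given caller seed. The expensive content digest is
// computed with a fresh seed exactly once per (device, inode, mtime) and
// saved; the caller's seed is then mixed with the saved digest in a
// hash-combine step, which keeps cached entries context independent.
uint64_t hash_source_file(const Key& id, const std::string& content,
                          uint64_t seed) {
  uint64_t digest;
  auto it = inode_cache.find(id);
  if (it != inode_cache.end()) {
    digest = it->second;             // cache hit: no re-read, no re-hash
  } else {
    digest = hash_content(content);  // cache miss: hash once, then save
    inode_cache.emplace(id, digest);
  }
  // Boost-style hash combine of the saved digest into the caller's seed.
  return seed ^ (digest + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
}
```

The point of the combine step is that the saved digest never depends on any caller's seed, so one cached entry can serve every compilation unit that includes the file.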
It is always a tradeoff between generic solutions that solve generic problems and homebrewed solutions that do exactly what you need, and no more than that. I guess memcached is not zero-overhead because it runs in a separate process and we have to communicate over a socket, causing context switches? I also guess that integrating with memcached requires about the same amount of code as doing it all ourselves, and that memcached requires additional administrative work from end users? Please correct me if I'm wrong.
In this case a single file is mapped into shared memory by concurrent processes. This is basically how libc.so and other shared libraries are loaded every time a ccache process is spawned; nothing would be fast if the kernel couldn't handle this very efficiently. The easiest way to find out is to build and test the patch yourself. Either it is fast or it isn't.
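A minimal sketch of that mapping mechanism, assuming Linux/POSIX (this is not the actual ccache code; the function name and path are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Map `path` into memory with MAP_SHARED, creating/growing the file to
// `size` bytes if needed. Every process that maps the same file this way
// sees the same bytes, coherently, through the kernel's page cache.
void* map_cache_file(const char* path, size_t size) {
  int fd = open(path, O_RDWR | O_CREAT, 0600);
  if (fd == -1) {
    return nullptr;
  }
  if (ftruncate(fd, static_cast<off_t>(size)) != 0) {  // size the backing file
    close(fd);
    return nullptr;
  }
  void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  close(fd);  // the mapping stays valid after the descriptor is closed
  return mem == MAP_FAILED ? nullptr : mem;
}
```

A second process (or a second mapping in the same process) opening the same file immediately observes writes made through the first mapping, which is what makes the cache shared between concurrent ccache invocations.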
Exactly what downfalls do you see? And exactly how is it sloppy?
Ah, OK. I used the absolute path as the key for the same thing (instead of (device, inode)) back then. Using memcached was natural since we already used it for the distributed cache.
It has the usual mtime caveats: if you change the content and reset the timestamp, you lose... The "checksum always" strategy was the safe-and-slow option. But as opt-in, sure!
In the actual implementation I added ctime and size to the key, tightening such loopholes, since you can't modify mtime without also updating ctime. I didn't mention it because I didn't want to complicate the explanation of the principle with all the details.
You can always violate contracts by editing the actual disk blocks as a last resort. The point is that other parts of a build system also make the same assumptions and if you violate them ccache will not be the only build tool that breaks the chain. The feature is not sloppy in the way that it will break by accident when used on a healthy system. If you have to be creative to break it, then it is not a flaw that will hit innocent users.
It was not meant as sloppy as in dirty, and there is an inherent risk in all caching... Like you say, if the setup is healthy then it should be reliable. Hope that someone gets a chance to look at it!
Thanks for the numbers. Not that it matters much, but it would be interesting to know how much of the 7-8 minutes comes from avoiding hashing and how much comes from avoiding I/O. By the way, do you use the
This indeed sounds like a much better approach than the path-based one envisioned in #377. Thanks! I have nothing against this approach in principle. Well, maybe the only concern I have is if your use case is "non-edge" enough to motivate increasing the ccache implementation complexity.
Seems to be from reduced hashing. Before implementing the cache I used valgrind/callgrind for profiling and found that about 60-70% of CPU time was spent in hashing; most of that time is eliminated by the cache. The mentioned build was performed on a machine with 128 GB RAM, which effectively eliminates all reads from physical disk since all source files and all output fit in the buffer cache. That doesn't mean I/O is free, but there is no latency from waiting for disk.
None of them, mainly because I/O pressure looks very low in general when I build, provided sufficient RAM and fast NVMe drives. At least I have not seen any measurable gain when trying these options in the past.
I think the important question is whether ccache performance matters when you get a hit, or whether hits are already fast enough. If it matters, it might be hard to find other changes that achieve similar improvements without affecting complexity even more. Note that even trivial compilation units such as the "Hello World!" program gain from the cache, because a single standard header often includes a huge number of other files.
Which is 87% faster even for a trivial program. :-) Thanks for the review comments btw. Will look at them when I get some spare time the coming week.
You can also use the built-in ccache tracing to get something you can load into Chrome: #280
The inode cache is a process-shared cache that maps from (device, inode, mtime) to saved hash results. The cache is stored persistently in a single file that is mapped into shared memory by running processes, allowing computed hash values to be reused both within and between builds. The chosen technical solution works for Linux and might work on other POSIX platforms, but is not meant to be supported on non-POSIX platforms such as Windows. Use "ccache -o inode_cache=true/false" to activate/deactivate the cache. Use "ccache -o inode_cache_file=/path/to/cache/file" to set a custom cache-file location. Defaults to "{cache-dir}/inode-cache".
Co-authored-by: Joel Rosdahl <joel@rosdahl.net>
Apparently posix_fallocate() needs read permission on the open file; otherwise it will not work on NFS, since the emulation is client-side.
…issues" This reverts commit 5a55739.
Thanks!
Introduced in ccache#577 (213d988).
Unintended or not, ccache#577 (213d988) changed the behavior of "ccache --hash-file" to use hash_binary_file, which essentially performs hash(hash(path)) if the inode cache is enabled and hash(path) otherwise. This means that "ccache --hash-file" behaves differently depending on whether the inode cache is enabled, and also that it's no longer usable for benchmarking purposes. Fix this by simply using "hash_file" again.
Bumping the discussion in #377 with a proof of concept.
IMHO the file_stat_matches sloppiness option is not useful for several reasons.
Time stamps are updated when switching between branches, they don't match when files are checked out into different worktrees, and they cannot be trusted alone since manifests contain relative file paths. And all included files still need to be hashed for each compilation unit that uses them.
Some hard facts from building a large program, taking the Chromium browser as an example: with the current approach over 18 million files will be hashed to build about 34000 compilation units. With the suggested patch this goes down to hashing about 74000 files in the initial build and 26500 files in repeated builds from a hot ccache. This saves 7-8 minutes of serial build time for me, which is significant, although the real-time gain for parallel builds is smaller and depends on the number of cores, for natural reasons.
Note that this is a draft pull request with a patch I wrote just for the sake of discussion. I'm not sure this is the best way to do it, but it demonstrates that such a cache can be efficient, with close to zero overhead, as an embedded feature, without introducing new dependencies on external infrastructure such as caching daemons.