Comparing changes
base repository: huggingface/xet-core
base: v1.1.0
head repository: huggingface/xet-core
compare: v1.1.1
- 13 commits
- 70 files changed
- 6 contributors
Commits on Apr 29, 2025
Commit 6b5e280
Make dedup critical crates compilation-compat with wasm (#271)
1. Upgrade `rand` so we can use the new "wasm_js" feature of the underlying `getrandom` dependency.
2. Replace the deprecated `rand` functions `gen` and `gen_range`.
3. Restrict tokio features.
4. Change `tempdir` to `tempfile`, because `tempdir` depends on a very old, non-wasm-compatible version of `rand` and has been merged into `tempfile` and archived.
5. Clean out some unnecessary dependencies.
6. Move a file from parutils to utils.
Commit 0474cd7
Commits on May 1, 2025
Commit 8c9c34d
Commits on May 2, 2025
Adding session_id to requests and spans (#291)
* Creates a session_id whenever the `data_client` is used or a FileUploadSession/FileDownloader is created, so that it can be propagated to the new remote clients.
* Adds a middleware to the HTTP clients that puts the session_id into the `X-Xet-Session-Id` header of outgoing requests (CAS is already configured to accept this header).
* Adds info-level spans to the key parts of xet-core to aid debugging and tracing when xet-core is used as a library by long-running systems (e.g. the migration service or internal systems). This essentially brings #82 up to date, minus the hf_xet logging changes.
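The header propagation described above can be illustrated with a minimal Python sketch. It is not the xet-core middleware (which is Rust); `Request`, `with_session_id`, and `fake_send` are hypothetical names, and generating the id with `uuid` is an assumption. Only the header name comes from the commit message.

```python
import uuid
from dataclasses import dataclass, field

SESSION_HEADER = "X-Xet-Session-Id"  # header name from the commit message


@dataclass
class Request:
    """Hypothetical minimal request type, used only for this sketch."""
    url: str
    headers: dict = field(default_factory=dict)


def with_session_id(send, session_id):
    """Wrap a send function so every outgoing request carries the session id."""
    def wrapped(request):
        # Keep a header that the caller set explicitly; otherwise add ours.
        request.headers.setdefault(SESSION_HEADER, session_id)
        return send(request)
    return wrapped


def fake_send(request):
    # Stand-in for a real transport: echoes the headers it would send.
    return dict(request.headers)


# Assumption: a random UUID stands in for however xet-core builds its id.
session_id = uuid.uuid4().hex
send = with_session_id(fake_send, session_id)

seen_upload = send(Request("https://example.invalid/upload"))
seen_download = send(Request("https://example.invalid/download"))
```

Both calls share one id, so a server or tracing backend can group the requests that belong to the same session.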
Commit f3edaa3
Commits on May 5, 2025
Simplify chunking backgrounding code. (#292)
PR to simplify the code that backgrounds the chunking process. Should have no functionality change.
Commit 26331d3
Commits on May 6, 2025
Fix clippy issues in next rust version. (#298)
This PR simply fixes clippy issues that are present in the next rust version. No functionality change.
Commit 8d4958d
Commits on May 7, 2025
Replace passed-around threadpool refs with thread local variable (#297)
Currently, we pass references to the threadpool around the code in order to use it. However, all of this code already runs on a worker thread of the tokio runtime used to create the threadpool. This PR simplifies this by using thread-local storage: each worker thread sets a reference to the runtime on start, which can then be accessed at any time using ThreadPool::current(). The fallback for running within an existing tokio runtime (e.g. with tokio::test) is also handled, using the from_external() mechanism. There should be no functionality change, just code simplification.
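The thread-local pattern can be sketched in Python as an analogy. This is not the Rust implementation: `WorkerPool` is a hypothetical class built on `concurrent.futures`, and the `from_external()` fallback is left out. It only shows the idea that each worker records its owning pool at start, so tasks can look it up instead of receiving it as an argument.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One slot per thread; only pool worker threads ever fill it.
_thread_state = threading.local()


class WorkerPool:
    """Hypothetical pool whose workers remember which pool owns them."""

    def __init__(self, workers=2):
        # The initializer runs once on every worker thread when it starts.
        self._executor = ThreadPoolExecutor(
            max_workers=workers, initializer=self._register_worker
        )

    def _register_worker(self):
        _thread_state.pool = self

    @staticmethod
    def current():
        """Return the pool owning the calling thread, without passing references."""
        pool = getattr(_thread_state, "pool", None)
        if pool is None:
            raise RuntimeError("not running on a WorkerPool worker thread")
        return pool

    def submit(self, fn, *args):
        return self._executor.submit(fn, *args)

    def shutdown(self):
        self._executor.shutdown(wait=True)


def task_needing_pool():
    # No pool parameter: the task asks its own thread for the pool.
    return WorkerPool.current()


pool = WorkerPool()
found = pool.submit(task_needing_pool).result()
pool.shutdown()
```

Calling `WorkerPool.current()` from a thread that the pool did not start raises an error in this sketch; the commit message says the real code covers that case with its fallback mechanism.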
Commit 31beb80
Commits on May 8, 2025
Connect detailed upload progress to hub (#301)
This PR connects detailed upload progress to the hub in a backwards compatible way. It works by testing the number of arguments and argument names on the progress updating function. If the progress reporting function takes a single argument, this function calls it using the old method; if it has the appropriate arguments for detailed reporting -- `item_id, completed_bytes, total_bytes, update_increment` -- then it calls it using the new method. Additionally, if None is passed in, the progress reporting is disabled.
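A minimal sketch of this kind of signature-based dispatch, assuming Python's `inspect.signature` is used to examine the callback. `adapt_progress_callback` is a hypothetical helper, not the hf_xet API, and treating the single legacy argument as the byte increment is an assumption; the four argument names are the ones listed above.

```python
import inspect

# Argument names quoted in the commit message for detailed reporting.
DETAILED_ARGS = ("item_id", "completed_bytes", "total_bytes", "update_increment")


def adapt_progress_callback(callback):
    """Wrap a user callback so it can always be called with the four detailed values."""
    if callback is None:
        # None disables progress reporting entirely.
        return lambda item_id, completed_bytes, total_bytes, update_increment: None

    names = tuple(inspect.signature(callback).parameters)

    if len(names) == 1:
        # Legacy style. Assumption: the single argument is the byte increment.
        return lambda item_id, completed_bytes, total_bytes, update_increment: (
            callback(update_increment)
        )

    if names == DETAILED_ARGS:
        # Detailed style: pass every value by name.
        def report(item_id, completed_bytes, total_bytes, update_increment):
            callback(
                item_id=item_id,
                completed_bytes=completed_bytes,
                total_bytes=total_bytes,
                update_increment=update_increment,
            )
        return report

    raise TypeError(f"unsupported progress callback signature: {names}")


legacy_updates = []
detailed_updates = []


def legacy(increment):
    legacy_updates.append(increment)


def detailed(item_id, completed_bytes, total_bytes, update_increment):
    detailed_updates.append((item_id, completed_bytes, total_bytes, update_increment))


# The same report call works for old callbacks, new callbacks, and None.
for user_callback in (legacy, detailed, None):
    report = adapt_progress_callback(user_callback)
    report("model.bin", 512, 2048, 512)
```

Checking the signature once, when the callback is registered, keeps the per-update path cheap and lets existing single-argument callbacks continue to work unchanged.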
Commit 0ba75fe
Commits on May 9, 2025
Revert "Revert "Reduce Usage of Compression Format Detection"" (#279)
Reverts #275
Co-authored-by: Joseph Godlewski <jgodlewski@huggingface.co>
Commit 719f367
Commits on May 10, 2025
A debugging utility to get file reconstruction info.
Commit dfc7f0e
Commits on May 12, 2025
Fix compilation issue due to api change (#309)
Fix a compilation issue caused by merging #305, which did not account for an API change.
Commit 6b6dd70
Fixed race condition in dependency tracking. (#302)
Testing discovered two dependency tracking issues.
The first occurs when a file references a new xorb multiple times in non-contiguous locations. In this case, the logic causes an assertion failure when debug_assertions are enabled, though the underlying logic is likely correct.
The second is that tracking which xorbs are part of a given session and which xorbs have been uploaded previously was done by recording the xorb hashes when they are registered for upload. However, this needs to happen before registering the xorb as available for dedup; otherwise a race condition could cause a xorb to be incorrectly registered as already uploaded when in reality the xorb hash was in the queue and not yet added to the session registry. This causes the progress to be incorrectly counted as already completed.
A clean fix is to get that information directly from the shard_manager instead of tracking it separately, by having it return whether the data is deduped against the session data or the cache. This fix allows newly cut xorbs to be used immediately for dedup by other threads and correctly tracks all the information across threads.
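The idea behind the fix, a single lookup that also reports where the match came from, can be sketched as follows. This is an illustration rather than the xet-core code: `ShardIndex`, `DedupOrigin`, and `bytes_already_uploaded` are hypothetical names, and counting only cache hits as already uploaded is an interpretation of the description above.

```python
import threading
from enum import Enum


class DedupOrigin(Enum):
    SESSION = "session"  # xorb was cut during the current upload session
    CACHE = "cache"      # xorb was already known from an earlier upload


class ShardIndex:
    """Hypothetical stand-in for shard_manager: one locked lookup reports the origin."""

    def __init__(self, cached_hashes):
        self._lock = threading.Lock()
        self._cache = set(cached_hashes)
        self._session = set()

    def add_session_xorb(self, xorb_hash):
        # Registering under the lock makes the xorb visible for dedup and
        # marks it as session data in the same step.
        with self._lock:
            self._session.add(xorb_hash)

    def query(self, xorb_hash):
        # The answer and its origin come from the same locked read, so no
        # separate registry can lag behind the dedup index.
        with self._lock:
            if xorb_hash in self._session:
                return DedupOrigin.SESSION
            if xorb_hash in self._cache:
                return DedupOrigin.CACHE
            return None


def bytes_already_uploaded(index, references):
    """Sum the sizes of references that dedup against previously uploaded data."""
    return sum(
        size
        for xorb_hash, size in references
        if index.query(xorb_hash) is DedupOrigin.CACHE
    )


index = ShardIndex(cached_hashes={"xorb-old"})
index.add_session_xorb("xorb-new")

# A file referencing an old xorb once and a newly cut xorb twice.
references = [("xorb-old", 100), ("xorb-new", 40), ("xorb-new", 40)]
already_uploaded = bytes_already_uploaded(index, references)
```

Because the origin is returned by the lookup itself, the newly cut xorb is available for dedup right away without its bytes being counted as completed progress.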
Commit 50fde9b
Changed debug to minimal for python wheel. (#312)
Currently at O3 + minimal debug info; changing to Os + LTO drops the size to 35 MB.
Commit 6f24934
You can try running this command locally to see the comparison on your machine:
git diff v1.1.0...v1.1.1