Comparing changes
base repository: huggingface/xet-core
base: v1.1.2
head repository: huggingface/xet-core
compare: v1.1.3-dev0
- 5 commits
- 49 files changed
- 3 contributors
Commits on May 19, 2025
-
Updates out-of-sync Cargo.lock in hf_xet/ (#341)
The version in hf_xet/Cargo.lock was not updated in the current main/
Commit b364582
-
Incremental progress on upload_xorb with retry_wrapper (#333)
This PR implements incremental progress reporting in the upload_xorb function, reporting progress after every 512 KB of data uploaded. In addition, errors are retried using the same retry policy as the other clients. A request body built with Body::wrap_stream cannot be cloned, so the middleware layer cannot retry it; to get around this, the PR adds a simple retry wrapper utility that retries the entire request instead.
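The two ideas in this commit message can be sketched with the standard library alone. This is a minimal illustration, not the code from the PR: `retry_whole_request`, `upload_with_progress`, and `PROGRESS_CHUNK` are hypothetical names, the exponential backoff is an assumed policy, and the sketch is synchronous whereas the real client is presumably async. The point is that the closure rebuilds the request on every attempt, so a non-cloneable streaming body is never reused.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Granularity of progress reports: 512 KB, as in the commit message.
pub const PROGRESS_CHUNK: usize = 512 * 1024;

/// Hypothetical retry wrapper. `send_request` must build and send a fresh
/// request on every call, so nothing from a failed attempt is cloned.
/// Always makes at least one attempt.
pub fn retry_whole_request<T, E, F>(
    max_attempts: u32,
    base_delay: Duration,
    mut send_request: F,
) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    let mut attempt = 0u32;
    loop {
        match send_request(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                // Assumed policy: exponential backoff (base, 2x, 4x, ...).
                sleep(base_delay * 2u32.saturating_pow(attempt));
                attempt += 1;
            }
        }
    }
}

/// Walks `data` in 512 KB chunks and reports each chunk's size.
/// A real client would write the chunk to the network before reporting.
pub fn upload_with_progress(data: &[u8], mut on_progress: impl FnMut(usize)) -> usize {
    let mut sent = 0;
    for chunk in data.chunks(PROGRESS_CHUNK) {
        sent += chunk.len();
        on_progress(chunk.len());
    }
    sent
}
```

In a combined use, the closure passed to `retry_whole_request` would call `upload_with_progress` itself, so each retry re-streams the data from the start; a caller would then need to decide how to reset or reconcile progress already reported by a failed attempt.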
Commit 0d17409
-
Track total processed bytes and total transferred bytes (#328)
With deduplication, just tracking the total processed bytes can give a false impression of actual progress when uploading files; the user would often see a huge jump when hitting a deduplicated part. This PR allows us to report bytes uploaded or downloaded separately from the bytes processed, which would allow us to correctly surface network utilization. For example, one possible way to show this would be:
```
Data Processed     #####################╺━━━━━━━━━━━━━━━━ 58% • 1.8/3.1 GB • 731.6 MB/s • 0:00:10
New Data Uploaded  ######╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9% • 65.9/731.7 MB • 131.0 MB/s • -:--:--
data_up/d2.dat     #############################╸━━━━━━━━ 67% • 67.1/100.0 MB • 568.1 MB/s • 0:00:01
```
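One way to keep the two totals apart is a pair of counters that diverge whenever a chunk is deduplicated. The sketch below is an assumption about the shape of such a tracker, not the PR's actual types: `TransferProgress`, `record_chunk`, and `snapshot` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical tracker holding the two totals described above.
#[derive(Default)]
pub struct TransferProgress {
    /// Bytes examined so far, whether or not they were sent.
    processed: AtomicU64,
    /// Bytes that actually crossed the network.
    transferred: AtomicU64,
}

impl TransferProgress {
    /// Every chunk counts as processed; only chunks that were not
    /// deduplicated count as transferred.
    pub fn record_chunk(&self, len: u64, deduplicated: bool) {
        self.processed.fetch_add(len, Ordering::Relaxed);
        if !deduplicated {
            self.transferred.fetch_add(len, Ordering::Relaxed);
        }
    }

    /// Returns (processed, transferred) for display.
    pub fn snapshot(&self) -> (u64, u64) {
        (
            self.processed.load(Ordering::Relaxed),
            self.transferred.load(Ordering::Relaxed),
        )
    }
}
```

A deduplicated chunk moves only the first counter, which matches the "Data Processed" bar jumping ahead while "New Data Uploaded" stays put.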
Commit bef6a9a
Commits on May 20, 2025
-
Streamline and aggregate file updates for reporting to python (#340)
With the current incremental progress updates, the number of updates going to Python is substantial, and each one has to acquire the GIL. This negatively affects upload speed on fast connections. This PR introduces an intermediate aggregation class that quickly aggregates all incoming progress updates, then sends the aggregated update list to hf_xet once every 200 ms. This eliminates the thread contention caused by the frequent incremental updates while still reporting accurate progress to the user.
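The aggregation step can be sketched as a small buffer that merges per-file byte counts and releases them at most once per interval. This is an illustrative assumption, not the class from the PR: `ProgressAggregator`, `add`, and `flush_if_due` are hypothetical names, and the current time is passed in explicitly so the behaviour is easy to test. In the real system a background task would presumably drive the flush and acquire the GIL once per batch.

```rust
use std::collections::BTreeMap;
use std::time::{Duration, Instant};

/// Hypothetical aggregator: merges many small updates into one batch.
pub struct ProgressAggregator {
    /// Bytes accumulated per file since the last flush.
    pending: BTreeMap<String, u64>,
    last_flush: Instant,
    /// Minimum time between flushes (200 ms in the commit message).
    interval: Duration,
}

impl ProgressAggregator {
    pub fn new(interval: Duration, now: Instant) -> Self {
        Self { pending: BTreeMap::new(), last_flush: now, interval }
    }

    /// Cheap hot-path call: only touches the local map.
    pub fn add(&mut self, file: &str, bytes: u64) {
        *self.pending.entry(file.to_string()).or_insert(0) += bytes;
    }

    /// Returns the merged updates if the interval has elapsed and there
    /// is something to report; otherwise returns None.
    pub fn flush_if_due(&mut self, now: Instant) -> Option<Vec<(String, u64)>> {
        if self.pending.is_empty() || now.duration_since(self.last_flush) < self.interval {
            return None;
        }
        self.last_flush = now;
        Some(std::mem::take(&mut self.pending).into_iter().collect())
    }
}
```

With this shape, the expensive cross-language call happens once per flush rather than once per 512 KB chunk, and no bytes are lost because the per-file counts are summed rather than sampled.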
Commit 4faec0b
-
Merging Cargo.toml dependencies into workspace Cargo.toml (#339)
Co-authored-by: Brian Ronan <brian.ronan@huggingface.co>
Commit c465076
You can try running this command locally to see the comparison on your machine:
git diff v1.1.2...v1.1.3-dev0