Contributor

@stv0g stv0g commented Feb 22, 2025

git-annex encodes the apparent size of files in their symlinks.

gdu can now extract those sizes from the broken symlinks to calculate the total size of git-annex repositories.

Reference: https://git-annex.branchable.com/internals/key_format/

Note: real usage remains zero. gdu needs to be invoked via:

gdu --follow-symlinks --show-apparent-size --show-annexed-size

Contributor Author

stv0g commented Feb 22, 2025

I fully understand if this PR is a bit out of scope for gdu. Maybe other git-annex users will find it useful :)

@dundee dundee self-requested a review February 28, 2025 16:34

codecov bot commented Feb 28, 2025

Codecov Report

Attention: Patch coverage is 79.16667% with 15 lines in your changes missing coverage. Please review.

Project coverage is 87.55%. Comparing base (a0fd828) to head (917a6ec).

Files with missing lines           Patch %   Lines
cmd/gdu/main.go                    0.00%     4 Missing ⚠️
cmd/gdu/app/app.go                 0.00%     2 Missing and 1 partial ⚠️
pkg/analyze/symlink.go             86.95%    2 Missing and 1 partial ⚠️
pkg/analyze/sequential.go          33.33%    2 Missing ⚠️
pkg/analyze/stored.go              0.00%     2 Missing ⚠️
internal/testanalyze/analyze.go    0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #404      +/-   ##
==========================================
- Coverage   87.64%   87.55%   -0.10%     
==========================================
  Files          44       46       +2     
  Lines        4329     4386      +57     
==========================================
+ Hits         3794     3840      +46     
- Misses        460      470      +10     
- Partials       75       76       +1     

Owner

dundee commented Feb 28, 2025

I have fixed a few small issues. Could you cover your logic with tests?

Contributor Author

stv0g commented Mar 1, 2025

Hi Daniel,

Yes, I can implement some test cases. Unfortunately, EvalSymlinks does not work for me, as it iteratively calls os.Readlink, which breaks if any component of the link does not exist.

git-annex stores files in this hierarchy:

      ├─► .git
      │    └─► annex
      │         └─► objects
      │              └─► 7d
      │                   └─► 86
      │                        └─► SHA256E-s43632--7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730.jpg
      │                             └─► SHA256E-s43632--7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730.jpg
      ├──► existing_file.jpg -> .git/annex/objects/7d/86/SHA256E-s43632--7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730.jpg
      └──► missing_file.zip -> .git/annex/objects/08/d2/SHA256E-s744506832--08d2daced60b5eb6509044d5eefca82e7a6899350f49adc0083014229739515e.zip    (broken symlink)

The two levels of hash directories, plus a dedicated directory per link, cause EvalSymlinks to fail.

For the purpose of this PR, I am only interested in the -s<SIZE>-- fragment of the symlink.
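Roughly, extracting that fragment could look like the sketch below. This is not the implementation in pkg/analyze/symlink.go; annexedSize is a hypothetical name, and the parsing assumes the key layout from the git-annex key format page, where the fields before the "--" separator are dash-separated and the size field starts with a lowercase "s".

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
	"strings"
)

// annexedSize is a hypothetical helper for illustration only.
// It takes a symlink target (as returned by os.Readlink) and tries to
// read the size from the -s<SIZE>-- fragment of the git-annex key.
func annexedSize(target string) (int64, bool) {
	key := filepath.Base(target)
	// Everything before the first "--" holds the backend and its fields.
	head, _, found := strings.Cut(key, "--")
	if !found {
		return 0, false
	}
	fields := strings.Split(head, "-")
	// fields[0] is the backend name (e.g. SHA256E); skip it.
	for _, f := range fields[1:] {
		if strings.HasPrefix(f, "s") {
			size, err := strconv.ParseInt(f[1:], 10, 64)
			if err != nil {
				return 0, false
			}
			return size, true
		}
	}
	return 0, false
}

func main() {
	size, ok := annexedSize(".git/annex/objects/08/d2/SHA256E-s744506832--08d2daced60b5eb6509044d5eefca82e7a6899350f49adc0083014229739515e.zip")
	fmt.Println(size, ok) // prints: 744506832 true
}
```

Since the function only looks at the link text, it works the same for existing and broken symlinks.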

Owner

dundee commented Mar 21, 2025

I have updated the code to contain both EvalSymlinks (which is needed for proper handling of symlinks in subdirectories) and Readlink for git-annex. Wdyt?

It seems to work fine for me.

$ go run github.com/dundee/gdu/v5/cmd/gdu -LAan ~/annex
  1.1   MiB /.git
@ 923.0 MiB file.mp4

 $ go run github.com/dundee/gdu/v5/cmd/gdu -an ~/annex
   1.1 MiB /.git
@  202 B file.mp4

Sorry for taking so long; I wanted to get hands-on with git-annex.

Contributor Author

stv0g commented Mar 22, 2025

Hey @dundee,

Oh, that's great :) Yes, I think this is the best solution.

Thanks for looking into it :) I did not expect that you would look into git-annex :)

Contributor Author

stv0g commented Mar 22, 2025

I've also pushed a fix to make golangci-lint happy.

@dundee dundee merged commit 8e4a346 into dundee:master Mar 25, 2025
8 checks passed
Owner

dundee commented Mar 26, 2025

Great, thanks for your contribution!
