Skip to content

Conversation

DomT4
Copy link
Contributor

@DomT4 DomT4 commented Feb 10, 2018

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same change?
  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your changes? Here's an example.
  • Have you successfully run brew style with your changes locally?
  • Have you successfully run brew tests with your changes locally?

Haven't worked up a test yet because I suspect there's going to be some discussion on this & how to implement it, if at all.

It has been a repetitive theme that people don't know about brew cleanup and have no idea how much space HOMEBREW_CACHE is occupying, which can be a significant chunk, especially on Macs with smaller storage capacities. Seriously, do any search on Twitter for brew cleanup as a starting data point.

This attempts to shine some light on the situation for users, whilst carving out an exemption for CI where massive caches are expected & in some way desirable. I stuck it in diagnostic but it could have a place in brew config or even brew --cache somewhere. Opinions on everything welcome.

@MikeMcQuaid
Copy link
Member

@DomT4 How big is your cache currently and how long does this take you to run with/without this patch (run it twice to test the cold/warm IO cache)? Bonus points for @ilovezfs if you want to try that too.

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

With an effectively empty cache:

Before this PR:

brew doctor  10.41s user 4.71s system 93% cpu 16.107 total
brew doctor  9.42s user 3.37s system 98% cpu 12.963 total

After this PR:

brew doctor  9.13s user 2.98s system 99% cpu 12.171 total
brew doctor  9.09s user 2.97s system 99% cpu 12.120 total

With a cache containing 10 copies of qt-everywhere-src-5.10.0.tar.xz, weighing in at 3906MB:

Before this PR:

brew doctor  9.77s user 3.23s system 99% cpu 13.120 total
brew doctor  9.46s user 3.13s system 99% cpu 12.692 total

After this PR:

brew doctor  9.31s user 3.17s system 99% cpu 12.600 total
brew doctor  9.24s user 3.01s system 99% cpu 12.317 total

The everything-after-the-first-run results are presumably twisted a bit by "(run it twice to test the cold/warm IO cache)?" but doesn't show a huge slow down.

@ilovezfs
Copy link
Contributor

I think you need something like

        return unless cache.to_i > 2000

Otherwise it doesn't work for me.

@ilovezfs
Copy link
Contributor

irb(main):001:0> "17619" > "2000"
=> false

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

Oh, it's working here:

Please note that these warnings are just used to help the Homebrew maintainers
with debugging if you file an issue. If everything you use Homebrew for is
working fine: please don't worry or file an issue; just ignore this. Thanks!

Warning: Your HOMEBREW_CACHE is using more than 2GB of disk space.
You may wish to consider running `brew cleanup`.

Perhaps I didn't expand the cache hard enough during testing.

@ilovezfs
Copy link
Contributor

Yeah sounds like your cache never went over the 2GB mark.

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

3.8G /usr/local/var/homebrewcache atm.

cache = Utils.popen_read("du -sm #{HOMEBREW_CACHE}").split.first
return unless cache > "2000"
<<~EOS
Your HOMEBREW_CACHE is using more than 2GB of disk space.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be worth printing the actual value, too, given you already computed it.

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

7.6G /usr/local/var/homebrewcache works fine.
12G /usr/local/var/homebrewcache doesn't work as expected.

So it has issues once we pass the 10000 mark apparently. Will look into that.

@ilovezfs
Copy link
Contributor

Can we use HOMEBREW_CACHE.abv?

@ilovezfs
Copy link
Contributor

and/or HOMEBREW_CACHE.disk_usage

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

Yeah, we can. In initial testing it was slightly slower than simply throwing out to du, and I can repro that locally still:

brew doctor  9.44s user 3.04s system 99% cpu 12.554 total
brew doctor  9.49s user 3.11s system 99% cpu 12.694 total

But it does work:

diff --git a/Library/Homebrew/diagnostic.rb b/Library/Homebrew/diagnostic.rb
index 70f3e3881..c9b37d6d5 100644
--- a/Library/Homebrew/diagnostic.rb
+++ b/Library/Homebrew/diagnostic.rb
@@ -838,10 +838,10 @@ module Homebrew
       def check_for_large_cache
         return unless HOMEBREW_CACHE.exist?
         return if ENV["CI"] # CI can be expected to have a large cache.
-        cache = Utils.popen_read("du -sm #{HOMEBREW_CACHE}").split.first
-        return unless cache > "2000"
+        cache_size = HOMEBREW_CACHE.abv.split.last
+        return unless cache_size.to_i > 2
         <<~EOS
-          Your HOMEBREW_CACHE is using more than 2GB of disk space.
+          Your HOMEBREW_CACHE is using #{cache_size} of disk space.
           You may wish to consider running `brew cleanup`.
         EOS
       end

@ilovezfs
Copy link
Contributor

HOMEBREW_CACHE.disk_usage > 2147483648

since 2 is ambiguous depending on the units in the human readable format

@ilovezfs
Copy link
Contributor

Then use disk_usage_readable(cache_size) to format it :)

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

Apparently has to be cache_size > 214_748_364_8 or I get shouted at by brew style.

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

Not sure if that's still valid/correct. Don't see a problem with long numbers personally but there we go 🙈.

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

Specifically:

Offenses:

Library/Homebrew/diagnostic.rb:842:36: C: Use underscores(_) as decimal mark and separate every 3 digits with them.

@ilovezfs
Copy link
Contributor

2_147_483_648 doesn't work?

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

Yeah, that's fine. Didn't know how pedantic we were getting about the whole every three digits thing. The error message isn't wildly clear.

@ilovezfs
Copy link
Contributor

or just 2**31

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

Wasn't sure if shortening it made it less readable at a glance/less friendly.

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

Although I guess we could just do return unless cache_size > 2**31 # ~2GB if we wanted to retain the instantly-explanatory nature of the full number.

@ilovezfs
Copy link
Contributor

2**31 is the most readable IMO

@ilovezfs
Copy link
Contributor

def disk_usage_readable(size_in_bytes)
  if size_in_bytes >= 1_073_741_824
    size = size_in_bytes.to_f / 1_073_741_824
    unit = "GB"
  elsif size_in_bytes >= 1_048_576
    size = size_in_bytes.to_f / 1_048_576
    unit = "MB"
  elsif size_in_bytes >= 1_024
    size = size_in_bytes.to_f / 1_024
    unit = "KB"
  else
    size = size_in_bytes
    unit = "B"
  end

  # avoid trailing zero after decimal point
  if ((size * 10).to_i % 10).zero?
    "#{size.to_i}#{unit}"
  else
    "#{format("%.1f", size)}#{unit}"
  end
end

So I guess we might as well be consistent and use 2_147_483_648

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

Alright, tweaked.

@ilovezfs
Copy link
Contributor

So in terms of time, on APFS with 4.1GB cache

first doctor run without this PR
real 0m6.313s
user 0m2.081s
sys 0m2.657s

second
real 0m2.850s
user 0m1.897s
sys 0m0.871s

third
real 0m2.924s
user 0m1.964s
sys 0m0.892s

After this PR,

fourth
real 0m5.702s
user 0m2.569s
sys 0m2.185s

fifth
real 0m3.988s
user 0m2.478s
sys 0m1.412s

sixth
real 0m3.838s
user 0m2.424s
sys 0m1.346s

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 11, 2018

Before:

brew doctor  9.83s user 3.25s system 99% cpu 13.197 total
brew doctor  9.31s user 3.04s system 99% cpu 12.411 total
brew doctor  9.30s user 3.02s system 99% cpu 12.393 total
brew doctor  9.30s user 3.05s system 99% cpu 12.409 total
brew doctor  9.72s user 3.18s system 99% cpu 12.999 total

After:

brew doctor  10.03s user 3.81s system 90% cpu 15.229 total
brew doctor  9.57s user 3.11s system 99% cpu 12.782 total
brew doctor  9.35s user 3.05s system 99% cpu 12.464 total
brew doctor  9.56s user 3.15s system 99% cpu 12.823 total
brew doctor  9.33s user 3.03s system 99% cpu 12.437 total

So a bit of a performance hit from using Ruby rather than a pure du -sm comparison, potentially. Not a huge one, and doctor is already a slow command, so perhaps not too hideous. Tests run with a 4.2GB cache on APFS, FWIW.

@MikeMcQuaid MikeMcQuaid merged commit 8c98317 into Homebrew:master Feb 12, 2018
@MikeMcQuaid
Copy link
Member

Great work as usual @DomT4; thanks!

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 12, 2018

Managed a mistake so follow up in #3791. Thanks for the merge, and apologies for my "oops" 😅.

@DomT4 DomT4 deleted the cache_size branch February 12, 2018 10:15
@ilovezfs
Copy link
Contributor

In light of #3798 I think we may want to revert this. It will be an endless source of confusion because brew cleanup doesn't guarantee getting under the 2G threshold.

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 13, 2018

I considered that before this PR. My initial thought was to advise the ‘-s’ flag or a shorter default prune time. I think we have to do something around cache size notification given the confusion it already causes, but as I wrote in my initial post I’m open to ideas on quite what. I think this is a decent solution that’ll have teething issues around working out a good size & working out what advice to give, FWIW.

@ilovezfs
Copy link
Contributor

#3799

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 13, 2018

There’s a lot we can consider in terms of tweaking the messaging or placement here or tweaking the default prune time for non-CI that reduces the confusion. Mike has said before in an ideal world cleanup could be run automatically for most users IIRC, and I’d certainly agree with that, but it remains a very slow operation which limits what we can do. I think this is something that needed better UX, and will need better UX even if this isn’t quite a perfect solution.

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 13, 2018

I think removing it completely is an overly blunt reaction, especially considering the prior problem will remain & will be something users keep being confused about, but ultimately you two are the maintainers here.

@ilovezfs
Copy link
Contributor

I disagree with the idea that there's anything wrong with a larger cache.

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 13, 2018

Sure, but you’re not giving users a choice in that because a significant chunk of them don’t know about the cache size & don’t know the size of it. If you’re running a 250GB or 500GB MacBook & your cache is taking up 30GB that’s a huge chunk.

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 13, 2018

https://twitter.com/pidster/status/963005893272010753

https://twitter.com/inoueh/status/961863241054892032

https://twitter.com/alexcarolc/status/960824943465418752

https://twitter.com/hannahwarmbier/status/959814433840222209

These are not users actively loving having a huge cache. These are people not knowing about how to shift it or not remembering the command exists. It is an issue. It was an issue regularly when I used to primarily operate the Twitter account. It is something that has been discussed as an issue by maintainers repeatedly.

@ilovezfs
Copy link
Contributor

The man page is there. If people don't read it, OK.

Doctor should warn about actual problems not imagined problems.

@MikeMcQuaid
Copy link
Member

I think we should alert users to brew cleanup and run it automatically for them if/when we can but I don't think a brew doctor warning is the right place for this any more, sorry @DomT4.

@DomT4
Copy link
Contributor Author

DomT4 commented Feb 13, 2018

Do you have any broad implementation ideas on that you’d be happier with so I’m not working up potential avenues of resolution that run counter to something you already have in mind?

ILZ: I don’t think “RTFM” is actually helpful advice to users. We can barely get them to read caveats.

@MikeMcQuaid
Copy link
Member

Do you have any broad implementation ideas on that you’d be happier with so I’m not working up potential avenues of resolution that run counter to something you already have in mind?

Yep: #568.

@Homebrew Homebrew locked and limited conversation to collaborators May 4, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants