Skip to content

Conversation

martinus
Copy link
Contributor

We recently (in #28906) had OOM problems due to incorrect memory usage estimation. When PoolResource is used the estimation can be brittle because when implementing memusage estimation for a container it is not obvious when the pool can be used.

As suggested here #28906 (comment), to prevent these problems in future we can simply let the PoolResource do all the accounting. Then the memory usage estimation is always accurate, even when the pool's memory chunks cannot be used.

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 25, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK hebasto, jonatack

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #31102 (Don't wipe coins cache when full and instead evict LRU clean entries by andrewtoth)
  • #28531 (improve MallocUsage() accuracy by LarryRuane)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@hebasto
Copy link
Member

hebasto commented Nov 25, 2023

Concept ACK.

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

A couple of minor comments/questions on first look, feel free to defer until you need to update for more substantive feedback.

@@ -0,0 +1,30 @@
// Copyright (c) 2015-2023 The Bitcoin Core developers
Copy link
Member

@jonatack jonatack Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per iwyu in https://cirrus-ci.com/task/6429107576111104 ("run ci"):

  • add #include <malloc_usage.h> to src/test/disconnected_transactions.cpp, src/test/pool_tests.cpp, and src/txmempool.cpp

  • grep the output also for memusage.h

nit, start year, and I think we're doing this now to avoid future need to update

Suggested change
// Copyright (c) 2015-2023 The Bitcoin Core developers
// Copyright (c) 2023-present The Bitcoin Core developers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I didn't know that iwyu is run in the CI, I'll change these 👍

@@ -266,6 +278,11 @@ class PoolResource final
{
return m_chunk_size_bytes;
}

[[nodiscard]] size_t DynamicMemoryUsage() const
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming question, would it be a good idea to use a different name (say, PoolMemoryUsage) to distinguish it while grepping from the DynamicMemoryUsage elsewhere in the codebase?

/** Compute the total memory used by allocating alloc bytes. */
static inline size_t MallocUsage(size_t alloc)
{
// Measured on libc6 2.19 on Linux.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think that comment provides value, it shows where the data came from. E.g. it makes sense to update it to newer glibc versions of they differ (they do a bit) or to at least check if they are doing something dramatically different than the version it is based on.

src/Makefile.am Outdated
@@ -197,6 +197,7 @@ BITCOIN_CORE_H = \
key_io.h \
logging.h \
logging/timer.h \
malloc_usage.h \
Copy link
Member

@jonatack jonatack Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martinus PR needs rebase/update to cmake. This makefile (src/Makefile.am) has been removed.

That way it becomes usable in other code like allocators/pool
That way all containers that use the pool have accurate memory tracking.

Add test to show memory is accurately tracked, even when nodes can't be allocated by the pool.
The more accurate memory estimation interferes shows that our memory estimation for Windows
is off, as the real allocated memory is much higher. This adapts the memusage_test test so it still
works with the more correct estimation.
@martinus martinus force-pushed the 2023-11-more-accurate-pool-memory branch from 46fa057 to 818fbf0 Compare September 19, 2024 13:39
@andrewtoth
Copy link
Contributor

Could we also track available memory in the freelist and expose it publicly? Right now when the dbcache fills up, our only option is to clear it and reallocate. We don't have any insight into how much memory is available in the freelist. We can't just evict a bunch of least recently added clean entries during a periodic flush, because the memory usage reported by the map will be the same. Having the amount of freed memory can let us make more granular decisions on when to actually reallocate the cache map.

@l0rinc
Copy link
Contributor

l0rinc commented Oct 17, 2024

@martinus are you still working on this or should we review the cherry-picked changes in #31102 instead?

@martinus
Copy link
Contributor Author

Hi @l0rinc, sorry I most likely can't work on this or anything else in the forseeable future

@l0rinc
Copy link
Contributor

l0rinc commented Oct 17, 2024

Thanks for the quick response, @andrewtoth cherry-picked your changes and we'll continue there

@martinus martinus deleted the 2023-11-more-accurate-pool-memory branch October 17, 2024 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants