Skip to content

Conversation

jonatack
Copy link
Member

@jonatack jonatack commented Sep 19, 2021

Based on #22831, this pull implements the suggestion in #22831 (comment) to extend RPC getnodeaddresses to return whether the address is in the tried or new table, and if it's in the new table, its refcount. Doing so allows us to better test the contents of the addrman using a public interface instead of relying on assert_debug_log in the functional tests. The pull then updates the getnodeaddresses and addpeeraddress test coverage, followed by improving the relevant asmap test coverage.

Credit to John Newbery for the idea.

@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 19, 2021

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #23053 ([fuzz] Use public methods in addrman fuzz tests by jnewbery)
  • #22950 ([p2p] Pimpl AddrMan to abstract implementation details by amitiuttarwar)
  • #22872 (log: improve checkaddrman logging with duration in milliseconds by jonatack)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@jnewbery
Copy link
Contributor

Concept ACK, approach NACK. I don't think it makes sense to add this addrman-specific data into the CAddress class, which is used by many different components.

Two ways to expose them are (1) move them from CAddrInfo to CAddress or (2) use CAddrInfo instead of CAddress for the GetAddr_(), GetAddr() and GetAddresses() functions and callers.

I don't think we want to do (2) either. In general, we're trying to encapsulate addrman more. CAddrInfo is an addrman implementation-specific class, so ideally it wouldn't be exposed at all in the header. That's done in #22950, which introduces a very clear compilation firewall between implementation and public interface.

I think the best way to do this is update CAddrMan::GetAddr() to return a vector of tuples, std::vector<std::tuple<CAddress, /* tried */ bool, /* refcount*/ int>>, similar to how Select() is updated to return a std::pair<> in https://github.com/bitcoin/bitcoin/pull/22950/files#diff-164bd9e2e30f54d0a79eb7cc372309e2f2155edc6c3f051290ab078f03f6a771R89. That means that the public method is returning only what the caller needs, minimizing the coupling between implementation and client code.

@jonatack
Copy link
Member Author

Thanks @jnewbery. I agree the two approaches aren't ideal. Will have a look at your suggested approach.

@mzumsande
Copy link
Contributor

I think the best way to do this is update CAddrMan::GetAddr() to return a vector of tuples, std::vector<std::tuple<CAddress, /* tried */ bool, /* refcount*/ int>>

A vector of std::pair might be sufficient, because nRefCount==0 means it's in Tried.

@jonatack
Copy link
Member Author

jonatack commented Sep 20, 2021

A vector of std::pair might be sufficient, because nRefCount==0 means it's in Tried.

Good point, though I'm not sure we want to depend on that coupling.

(Ideally I'll try to use a struct as it's nicer to work with.)

@jonatack jonatack force-pushed the getnodeaddresses-tried-and-reference_count branch from 0cb526f to f6b5878 Compare September 20, 2021 16:53
@jnewbery
Copy link
Contributor

Ideally I'll try to use a struct as it's nicer to work with.

I'm curious why that is. I would have thought that using structured bindings to unpack a returned tuple would be just as easy: https://en.cppreference.com/w/cpp/language/structured_binding#Case_2:_binding_a_tuple-like_type.

Another possible approach would be to add a new public method GetAddressesWithRefcounts() that could be used by the RPC code. That would avoid the need to update the CConnman and PeerManager code.

@jonatack
Copy link
Member Author

Pushed an update that doesn't touch CAddress or CAddrInfo while attempting to be clear and simple.

@amitiuttarwar
Copy link
Contributor

A vector of std::pair might be sufficient, because nRefCount==0 means it's in Tried.

Good point, though I'm not sure we want to depend on that coupling.

I don't think there would be much logic coupling with a clearly documented function that returns pairs of < CAddress, multiplicity_on_new > with 0 for the second field indicating that the address is on the tried table.

Another possible approach would be to add a new public method GetAddressesWithRefcounts() that could be used by the RPC code. That would avoid the need to update the CConnman and PeerManager code.

I like the idea of minimizing impact to the p2p code paths for this test-only functionality.

@jonatack
Copy link
Member Author

Hm, this isn't urgent, so it might be an idea to try the new public method variant based on/built on #22950 to be sure it's compatible.

@jnewbery
Copy link
Contributor

Hm, this isn't urgent, so it might be an idea to try the new public method variant based on/built on #22950 to be sure it's compatible.

As long as you're just adding a new public method, it's compatible. @amitiuttarwar will need to update the header file to include that public method in the interface class declaration if this gets merged first, but I don't think that's any more/less work than merging the pimpl refactor first and then adding the public method.

I'm happy to have a first attempt at implementing my suggestion if that'd be helpful to you.

@jonatack
Copy link
Member Author

jonatack commented Sep 21, 2021

Rebased after merge of #22831.

Structs/classes: I guess I like the friendly, readable built-in access to members via named methods, and their flexibility and adaptability to changing future requirements, like adding/removing members.

I'm happy to have a first attempt at implementing my suggestion if that'd be helpful to you.

SGTM! can pull it in. With that alternative, maybe we could also remove the std::optional network arg from the existing GetAddr/GetAddresses functions.

@jonatack jonatack force-pushed the getnodeaddresses-tried-and-reference_count branch from f6b5878 to 3378ec8 Compare September 21, 2021 12:32
@Bossday Bossday mentioned this pull request Sep 21, 2021
Closed
@jnewbery
Copy link
Contributor

SGTM! can pull it in.

Here you go: https://github.com/jnewbery/bitcoin/tree/2021-09-getaddrwithrefcount. The first two commits add the new GetAddrWithRefcount() method. Let me know what you think!

With that alternative, maybe we could also remove the std::optional network arg from the existing GetAddr/GetAddresses functions.

Great idea! I added that as the final commit in the branch above.

@jonatack
Copy link
Member Author

jonatack commented Sep 21, 2021

Thanks! That is helpful. I see that I was misreading the previous suggestion.

@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

@amitiuttarwar
Copy link
Contributor

I'm uncertain about this concept (adding table & refcount in getnodeaddresses), I don't currently have a strong opinion, so am just sharing some thoughts & tradeoffs:

  • From the POV of a user, I can see how this additional information could be interesting. Since the getnodeaddresses endpoint is already essentially for super users or clients utilizing addrman, it seems reasonable to add this implementation-specific information.
  • For functional tests, I don't think it makes sense to regularly test the internal storage mechanism of addrman. That seems better suited for unit tests. So, I'm not sure this is adding much value. If the root issue is expectations around what is on new vs tried because of collisions, adding a way to construct a deterministic addrman (eg. as suggested in test: add addpeeraddress "tried", test addrman checks on restart with asmap #22831 (comment)) seems like a more robust solution.
  • For the unit tests, simply returning the address table & multiplicity if on new seems insufficient to support the current tests with a smaller interface, because several of the tests require the specific bucket & position of the address.
  • For the fuzz tests, specifically addrman_serdeser, this more informational endpoint could potentially be used to replace the operator==() functionality. That would be nice to simplify the interface, but reduce the thoroughness of the check, and require multiplicity + collisions to detect any inconsistencies in bucket hashing logic. Since the majority of the time in the fuzz test is spent filling an addrman, I think more thorough checks would be preferable.

So overall, I think the biggest benefit to this added functionality would be for superusers or clients using the RPC interface to observe or build on addrman. Additionally, there seems to be some marginal benefit for the functional tests. I am uncertain if these benefits outweigh the maintenance costs of the code.

Many thanks to @mzumsande for collaborating to form these thoughts :)

@jonatack jonatack force-pushed the getnodeaddresses-tried-and-reference_count branch from 3378ec8 to ad3c725 Compare October 2, 2021 21:36
@jonatack jonatack marked this pull request as draft October 2, 2021 21:37
@jonatack
Copy link
Member Author

jonatack commented Oct 5, 2021

It may also be handy for getnodeaddresses to return all addresses, including IsTerrible() ones, along with a boolean field indicating whether the address is terrible or not. Along with the other feedback here, it may be a good idea for getnodeaddresses to have its own GetAddr-like function that is separate from the one used by internal components.

@jnewbery
Copy link
Contributor

This overlaps with #23826. It's not completely redundant, since that PR doesn't expose the tried/new and multiplicity through a public API, but it does add a method to fetch that data that can be used by the unit tests. I think that's probably enough, since addrman is now well-contained enough that it can be fully tested through the unit tests. WDYT @jonatack?

@fanquake
Copy link
Member

What is the state of this change? It seems like it could have been (at least somewhat) obsoleted by other now-merged changes.

@jonatack
Copy link
Member Author

Will update or close after reviewing the changes.

@fanquake
Copy link
Member

fanquake commented May 4, 2022

Feel free to reopen once you've gotten a chance to look at this again.

@fanquake fanquake closed this May 4, 2022
@bitcoin bitcoin locked and limited conversation to collaborators May 4, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants