Skip to content

Conversation

fjahr
Copy link
Contributor

@fjahr fjahr commented Nov 4, 2023

This is a step towards making asmap usage more accessible. Currently an asmap file needs to be acquired by there user from some location or the user needs to generate one themselves. Then they need to move the file to the right place in datadir or pass the path to the file as -asmap=PATH in order to use the asmap feature.

The change here allows for builds to embed asmap data into the bitcoind binary which makes it possible to use the feature without handling of the asmap file by the user. If the user starts bitcoind with -asmap the embedded data will be used for bucketing of nodes.

The data lives in the repository at src/node/data/ip_asn.dat and can be replaced with a new version at any time. The idea is that the data should be updated with every release. By default the data at that location is embedded into the binary but there is also a build option to prevent this (-DWITH_EMBEDDED_ASMAP=NO). In this case the original behavior of the -asmap option is maintained.

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 4, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/28792.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK Sjors, jonatack
Stale ACK sipa

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #33026 (test, refactor: Embedded ASmap selected preparatory work by fjahr)
  • #31868 ([IBD] specialize block serialization by l0rinc)
  • #30595 (kernel: Introduce initial C header API by TheCharlatan)
  • #29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)
  • #17783 (common: Disallow calling IsArgSet() on ALLOW_LIST options by ryanofsky)
  • #17581 (refactor: Remove settings merge reverse precedence code by ryanofsky)
  • #17580 (refactor: Add ALLOW_LIST flags and enforce usage in CheckArgFlags by ryanofsky)
  • #17493 (util: Forbid ambiguous multiple assignments in config file by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@fjahr fjahr mentioned this pull request Nov 4, 2023
4 tasks
@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch 3 times, most recently from b8b09a7 to f9288bc Compare November 4, 2023 23:48
@sipa
Copy link
Member

sipa commented Nov 5, 2023

Would it make sense to only check in the asmap.dat file into the repository, and then have the build system convert it to a .h at compile time (similar to the approach used in the src/test/data directory)? That would mean there is also a directly-manipulable file in the repository (that can be used with asmap-tool etc).

@maflcko
Copy link
Member

maflcko commented Nov 5, 2023

How much data would this add per year to the repo, assuming updates happen?

@sipa
Copy link
Member

sipa commented Nov 5, 2023

@maflcko The asmap file added here is around 1.6 MiB, so assuming we update for every 6 months, probably a bit over 3 MiB per year (gzip compresses asmap files only ~5%).

@luke-jr
Copy link
Member

luke-jr commented Nov 8, 2023

Why is the option of loading it from a file even in dev builds, not considered?

@fjahr
Copy link
Contributor Author

fjahr commented Nov 9, 2023

Would it make sense to only check in the asmap.dat file into the repository, and then have the build system convert it to a .h at compile time (similar to the approach used in the src/test/data directory)? That would mean there is also a directly-manipulable file in the repository (that can be used with asmap-tool etc).

Yeah, that would work as well. I have drafted that approach here: fjahr@0f9109f Details can change of course (like keeping the raw file in init) but if people prefer that approach, I will pull this into here.

@fjahr
Copy link
Contributor Author

fjahr commented Nov 9, 2023

Why is the option of loading it from a file even in dev builds, not considered?

It is considered. Loading from a file is the only option for dev builds if we go the other route mentioned in the description: "The alternative approach is not having the data in the source code but only adding it during the build phase of a release." But this also comes with the aforementioned downsides.

@willcl-ark
Copy link
Member

A general question about this approach: do most contemporary OSes handle loading large amounts of (potentially) unused data into physical memory well-enough, that they would (likely) not load this extra 1.5-3MB of data in the case that the -asmap is not being used? Perhaps this will be handled by the demand paging system? If not, it seems to be a bit of a shame to increase binary size by 20% in all cases.

It seems to make most sense to me to take the path suggested by @sipa, including the .dat file in the repo, and having it converted to a .h at compile time. It might also be nice to have a configuration flag to disable the embedded header creation for faster builds.

@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from f9288bc to 3bcc1ca Compare December 14, 2023 23:24
@DrahtBot DrahtBot mentioned this pull request Jun 25, 2024
@fjahr fjahr changed the title Embedding ASMap files as binary dump header file Embed default ASMap as binary dump header file Aug 27, 2024
@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from 3bcc1ca to be1e04a Compare August 28, 2024 22:25
@fjahr fjahr marked this pull request as ready for review August 28, 2024 22:26
@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from be1e04a to 5167707 Compare August 28, 2024 23:31
@fjahr
Copy link
Contributor Author

fjahr commented Aug 28, 2024

I am picking this back up since we have come a long way in terms of tooling outside of Bitcoin Core and lately I was just waiting for cmake and v28 feature freeze. I will suggest an early merge in the cycle for v29 so more people are encouraged to use one steady file over a longer period of time.

I am now using the suggested approach of building the header file at compile time and I added the functionality for both autotools and cmake (the autotools one has existed for a while, I can also remove it if people think it's unnecessary). Of course this is rebased to make cmake available. First time I am editing the cmake build system so there are probably things to improve... And there is also a command for asmap-tool.py to generate the header file.

I am adding the latest (1724248800) asmap file from asmap-data, the corresponding creation process can be reviewed in the issue and PR.

I have not added a config option to skip the asmap header file but I will look into that next.

@Sjors
Copy link
Member

Sjors commented Aug 29, 2024

I am now using the suggested approach of building the header file at compile time

I think the suggestion was to not include the header itself in that case.

the autotools one has existed for a while, I can also remove it if people think it's unnecessary

Maybe keep it in until #30664 is merged.

@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from 5167707 to e5a25fd Compare August 29, 2024 18:33
@fjahr
Copy link
Contributor Author

fjahr commented Jul 20, 2025

Rebased and I shaved off 4 easy to review commits from here and put them into #33026. Maybe splitting this off helps to move things along a bit faster.

sipa and others added 4 commits July 25, 2025 19:14
Calculate the asmap version only in one place: A dedicated function in util/asmap.

The version was also referred to as asmap checksum in several places. To avoid confusion call it asmap version everywhere.
@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from c7636dd to afffda8 Compare July 25, 2025 17:28
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Task no wallet, libbitcoinkernel: https://github.com/bitcoin/bitcoin/runs/46745894874
LLM reason (✨ experimental): Build failure caused by incorrect constructor invocation of 'NetGroupManager' in test code.

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from afffda8 to d3378eb Compare July 31, 2025 22:02
This prevents holding the asmap data in memory twice.
@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from d3378eb to 30ce98b Compare July 31, 2025 22:25
@DrahtBot DrahtBot removed the CI failed label Aug 1, 2025
@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from 30ce98b to 4e37dc9 Compare August 18, 2025 16:32
@fjahr
Copy link
Contributor Author

fjahr commented Aug 18, 2025

Updated the map to that of the latest run: asmap/asmap-data#30

@achow101 achow101 modified the milestones: 30.0, 31.0 Aug 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.