Skip to content

Conversation

fanquake
Copy link
Member

@fanquake fanquake commented Oct 18, 2023

Using std::endian from C++20 allows us to drop some amount of own code, including infra in the build system (which means we don't have to port and review it for CMake, only to delete it shortly after switching to C++20).

Note that C++23 would take this even further, with the introduction of std::byteswap.

@DrahtBot
Copy link
Contributor

DrahtBot commented Oct 18, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK dergoegge, kristapsk, theStack

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29057 (Replace non-standard CLZ builtins with c++20's bit_width by theuni)
  • #29036 (Add missing byteswap functions for MSVC by theuni)
  • #21590 (Safegcd-based modular inverses in MuHash3072 by sipa)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@dergoegge
Copy link
Member

Concept ACK

1 similar comment
@kristapsk
Copy link
Contributor

Concept ACK

@theStack
Copy link
Contributor

Concept ACK

@aureleoules
Copy link
Contributor

I've been experimenting with benchmark comparisons between pulls and master on my test coverage tool corecheck (https://corecheck.dev/bitcoin/bitcoin/pulls/28674).
It seems that this pull request is negatively impacting a bunch of benchmarks:
image

Note that the benchmarks are being run on ARM64 CPUs, but I tested locally on an x86 and the performance loss is roughly the same.

@fanquake
Copy link
Member Author

fanquake commented Dec 5, 2023

cc @theuni we might not always be getting bswaps

@theuni
Copy link
Member

theuni commented Dec 5, 2023

@aureleoules Thanks for testing!

I tested this by comparing the x86_64 asm output of a simple test file and confirmed that it compiled down to bswaps as necessary. I didn't do the same for other arches. Perhaps others are missing the critical optimizations? :(

Could you say more about your compiler/flags? I would expect to see the nasty behavior you're seeing:

  • On old compilers
  • On non-bleeding-edge MSVC
  • Without optimizations.

Would you mind pasting the flags use for compile? Without -O2 or better, for example, I would expect these results.

@maflcko
Copy link
Member

maflcko commented Dec 5, 2023

@aureleoules
Copy link
Contributor

I'd presume it is corecheck/coverage-worker@7d47674/entrypoint.sh#L58 (default flags -O2 on Ubuntu Jammy)

Yes that is correct. Also the compiler installed is g++-11.

@aureleoules
Copy link
Contributor

Here is the output of the configure script of the corecheck job if this helps:

1701686294337,checking for pkg-config... /usr/bin/pkg-config
1701686294338,checking pkg-config is at least version 0.9.0... yes
1701686294378,checking build system type... aarch64-unknown-linux-gnu
1701686294379,checking host system type... aarch64-unknown-linux-gnu
1701686294384,checking for a BSD-compatible install... /usr/bin/install -c
1701686294387,checking whether build environment is sane... yes
1701686294393,checking for a race-free mkdir -p... /usr/bin/mkdir -p
1701686294393,checking for gawk... no
1701686294393,checking for mawk... mawk
1701686294398,checking whether make sets $(MAKE)... yes
1701686294402,checking whether make supports nested variables... yes
1701686294406,checking whether to enable maintainer-specific portions of Makefiles... yes
1701686294406,checking whether make supports nested variables... (cached) yes
1701686294406,checking for g++... g++
1701686294456,checking whether the C++ compiler works... yes
1701686294456,checking for C++ compiler default output file name... a.out
1701686294490,checking for suffix of executables... 
1701686294529,checking whether we are cross compiling... no
1701686294545,checking for suffix of object files... o
1701686294558,checking whether the compiler supports GNU C++... yes
1701686294573,checking whether g++ accepts -g... yes
1701686294631,checking for g++ option to enable C++11 features... none needed
1701686294636,checking whether make supports the include directive... yes (GNU style)
1701686294660,checking dependency style of g++... gcc3
1701686294749,checking whether g++ supports C++20 features with -std=c++20... yes
1701686294771,checking whether the compiler supports GNU Objective C++... no
1701686294795,checking whether g++ -std=c++20 accepts -g... no
1701686294818,checking dependency style of g++ -std=c++20... gcc3
1701686294820,checking how to print strings... printf
1701686294820,checking for gcc... gcc
1701686294854,checking whether the compiler supports GNU C... yes
1701686294867,checking whether gcc accepts -g... yes
1701686294912,checking for gcc option to enable C11 features... none needed
1701686294938,checking whether gcc understands -c and -o together... yes
1701686294961,checking dependency style of gcc... gcc3
1701686294964,checking for a sed that does not truncate output... /usr/bin/sed
1701686294966,checking for grep that handles long lines and -e... /usr/bin/grep
1701686294967,checking for egrep... /usr/bin/grep -E
1701686294968,checking for fgrep... /usr/bin/grep -F
1701686294972,checking for ld used by gcc... /usr/bin/ld
1701686294973,checking if the linker (/usr/bin/ld) is GNU ld... yes
1701686294975,checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
1701686294991,checking the name lister (/usr/bin/nm -B) interface... BSD nm
1701686294991,checking whether ln -s works... yes
1701686294994,checking the maximum length of command line arguments... 1966080
1701686294996,checking how to convert aarch64-unknown-linux-gnu file names to aarch64-unknown-linux-gnu format... func_convert_file_noop
1701686294996,checking how to convert aarch64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop
1701686294996,checking for /usr/bin/ld option to reload object files... -r
1701686294996,checking for objdump... objdump
1701686294997,checking how to recognize dependent libraries... pass_all
1701686294997,checking for dlltool... no
1701686294997,"checking how to associate runtime and link libraries... printf %s
"
1701686294997,checking for ar... ar
1701686295017,checking for archiver @FILE support... @
1701686295017,checking for strip... strip
1701686295017,checking for ranlib... ranlib
1701686295067,checking command to parse /usr/bin/nm -B output from gcc object... ok
1701686295068,checking for sysroot... no
1701686295071,checking for a working dd... /usr/bin/dd
1701686295073,checking how to truncate binary pipes... /usr/bin/dd bs=4096 count=1
1701686295074,checking for mt... no
1701686295076,checking if : is a manifest tool... no
1701686295093,checking for stdio.h... yes
1701686295112,checking for stdlib.h... yes
1701686295133,checking for string.h... yes
1701686295155,checking for inttypes.h... yes
1701686295177,checking for stdint.h... yes
1701686295198,checking for strings.h... yes
1701686295220,checking for sys/stat.h... yes
1701686295242,checking for sys/types.h... yes
1701686295269,checking for unistd.h... yes
1701686295297,checking for dlfcn.h... yes
1701686295300,checking for objdir... .libs
1701686295360,checking if gcc supports -fno-rtti -fno-exceptions... no
1701686295360,checking for gcc option to produce PIC... -fPIC -DPIC
1701686295375,checking if gcc PIC flag -fPIC -DPIC works... yes
1701686295435,checking if gcc static flag -static works... yes
1701686295455,checking if gcc supports -c -o file.o... yes
1701686295455,checking if gcc supports -c -o file.o... (cached) yes
1701686295462,checking whether the gcc linker (/usr/bin/ld) supports shared libraries... yes
1701686295481,checking whether -lc should be explicitly linked in... no
1701686295520,checking dynamic linker characteristics... GNU/Linux ld.so
1701686295520,checking how to hardcode library paths into programs... immediate
1701686295521,checking whether stripping libraries is possible... yes
1701686295521,checking if libtool supports shared libraries... yes
1701686295521,checking whether to build shared libraries... yes
1701686295521,checking whether to build static libraries... yes
1701686295546,checking how to run the C++ preprocessor... g++ -std=c++20 -E
1701686295620,checking for ld used by g++ -std=c++20... /usr/bin/ld
1701686295622,checking if the linker (/usr/bin/ld) is GNU ld... yes
1701686295627,checking whether the g++ -std=c++20 linker (/usr/bin/ld) supports shared libraries... yes
1701686295680,checking for g++ -std=c++20 option to produce PIC... -fPIC -DPIC
1701686295699,checking if g++ -std=c++20 PIC flag -fPIC -DPIC works... yes
1701686295759,checking if g++ -std=c++20 static flag -static works... yes
1701686295783,checking if g++ -std=c++20 supports -c -o file.o... yes
1701686295783,checking if g++ -std=c++20 supports -c -o file.o... (cached) yes
1701686295783,checking whether the g++ -std=c++20 linker (/usr/bin/ld) supports shared libraries... yes
1701686295786,checking dynamic linker characteristics... (cached) GNU/Linux ld.so
1701686295786,checking how to hardcode library paths into programs... immediate
1701686295787,checking for ar... /usr/bin/ar
1701686295787,checking for gcov... /usr/bin/gcov
1701686295787,checking for llvm-cov... no
1701686295787,checking for lcov... /usr/bin/lcov
1701686295787,checking for python3.9... no
1701686295787,checking for python3.10... /usr/bin/python3.10
1701686295788,checking for genhtml... /usr/bin/genhtml
1701686295788,checking for git... /usr/bin/git
1701686295788,checking for ccache... /usr/bin/ccache
1701686295788,checking for xgettext... no
1701686295788,checking for hexdump... /usr/bin/hexdump
1701686295788,checking for objcopy... /usr/bin/objcopy
1701686295789,checking for doxygen... no
1701686295805,checking whether C++ compiler accepts -Werror... yes
1701686295844,"checking whether the linker accepts -Wl,--fatal-warnings... yes"
1701686295864,checking whether C++ compiler accepts -Wall... yes
1701686295881,checking whether C++ compiler accepts -Wextra... yes
1701686295892,checking whether C++ compiler accepts -Wgnu... no
1701686295910,checking whether C++ compiler accepts -Wformat -Wformat-security... yes
1701686295929,checking whether C++ compiler accepts -Wvla... yes
1701686295942,checking whether C++ compiler accepts -Wshadow-field... no
1701686295958,checking whether C++ compiler accepts -Wthread-safety... no
1701686295973,checking whether C++ compiler accepts -Wloop-analysis... no
1701686295991,checking whether C++ compiler accepts -Wredundant-decls... yes
1701686296011,checking whether C++ compiler accepts -Wunused-member-function... no
1701686296029,checking whether C++ compiler accepts -Wdate-time... yes
1701686296050,checking whether C++ compiler accepts -Wconditional-uninitialized... no
1701686296068,checking whether C++ compiler accepts -Wduplicated-branches... yes
1701686296085,checking whether C++ compiler accepts -Wduplicated-cond... yes
1701686296103,checking whether C++ compiler accepts -Wlogical-op... yes
1701686296121,checking whether C++ compiler accepts -Woverloaded-virtual... yes
1701686296138,checking whether C++ compiler accepts -Wsuggest-override... yes
1701686296161,checking whether C++ compiler accepts -Wunreachable-code-loop-increment... no
1701686296179,checking whether C++ compiler accepts -Wimplicit-fallthrough... yes
1701686296195,checking whether C++ compiler accepts -Wdocumentation... no
1701686296213,checking whether C++ compiler accepts -Wunused-parameter... yes
1701686296226,checking whether C++ compiler accepts -Wself-assign... no
1701686296245,checking whether C++ compiler accepts -fno-extended-identifiers... yes
1701686296260,checking whether C++ compiler accepts -fstack-reuse=none... yes
1701686296272,checking whether C++ compiler accepts -msse4.2... no
1701686296283,checking whether C++ compiler accepts -msse4.1... no
1701686296294,checking whether C++ compiler accepts -mavx -mavx2... no
1701686296305,checking whether C++ compiler accepts -msse4 -msha... no
1701686296317,checking whether C++ compiler accepts -mpclmul... no
1701686296340,checking for SSE4.2 intrinsics... no
1701686296356,checking for SSE4.1 intrinsics... no
1701686296370,checking for AVX2 intrinsics... no
1701686296385,checking for x86 SHA-NI intrinsics... no
1701686296405,checking whether C++ compiler accepts -march=armv8-a+crc+crypto... yes
1701686296423,checking whether C++ compiler accepts -march=armv8-a+crypto... yes
1701686296645,checking for ARMv8 CRC32 intrinsics... yes
1701686296869,checking for ARMv8 SHA-NI intrinsics... yes
1701686296951,checking whether byte ordering is bigendian... no
1701686296971,checking how to run the C preprocessor... gcc -E
1701686297002,checking whether gcc is Clang... no
1701686297044,"checking whether pthreads work with ""-pthread"" and ""-lpthread""... yes"
1701686297076,checking for joinable pthread attribute... PTHREAD_CREATE_JOINABLE
1701686297076,checking whether more special flags are required for pthreads... no
1701686297108,checking for PTHREAD_PRIO_INHERIT... yes
1701686297581,checking whether std::atomic can be used without link library... yes
1701686297582,checking for special C compiler options needed for large files... no
1701686297600,checking for _FILE_OFFSET_BITS value needed for large files... no
1701686297633,checking for g++ -std=c++20 options needed to detect all undeclared functions... none needed
1701686297679,checking whether strerror_r is declared... yes
1701686297701,checking whether strerror_r returns char *... yes
1701686297717,checking whether C++ compiler accepts -fPIC... yes
1701686297734,checking whether C++ compiler accepts -Wstack-protector... yes
1701686297750,checking whether C++ compiler accepts -fstack-protector-all... yes
1701686297758,checking whether C++ compiler accepts -fcf-protection=full... no
1701686297776,checking whether C++ compiler accepts -fstack-clash-protection... yes
1701686297792,checking whether C++ compiler accepts -mbranch-protection=bti... yes
1701686297801,checking whether C++ preprocessor accepts -D_FORTIFY_SOURCE=3... yes
1701686297811,checking whether C++ preprocessor accepts -U_FORTIFY_SOURCE... yes
1701686297835,"checking whether the linker accepts -Wl,--enable-reloc-section... no"
1701686297859,"checking whether the linker accepts -Wl,--dynamicbase... no"
1701686297882,"checking whether the linker accepts -Wl,--nxcompat... no"
1701686297907,"checking whether the linker accepts -Wl,--high-entropy-va... no"
1701686297947,"checking whether the linker accepts -Wl,-z,relro... yes"
1701686297986,"checking whether the linker accepts -Wl,-z,now... yes"
1701686298025,"checking whether the linker accepts -Wl,-z,separate-code... yes"
1701686298064,checking whether the linker accepts -fPIE -pie... yes
1701686298103,checking for sys/select.h... yes
1701686298143,checking for sys/prctl.h... yes
1701686298169,checking for sys/sysctl.h... no
1701686298194,checking for vm/vm_param.h... no
1701686298219,checking for sys/vmmeter.h... no
1701686298244,checking for sys/resources.h... no
1701686298271,checking whether getifaddrs is declared... yes
1701686298317,checking whether ifaddrs funcs can be used without link library... yes
1701686298345,checking whether freeifaddrs is declared... yes
1701686298389,checking whether ifaddrs funcs can be used without link library... yes
1701686298438,checking whether fork is declared... yes
1701686298484,checking whether setsid is declared... yes
1701686298530,checking whether pipe2 is declared... yes
1701686298572,checking for timingsafe_bcmp... no
1701686298587,checking for __builtin_clzl... yes
1701686298601,checking for __builtin_clzll... yes
1701686298625,checking for getmemoryinfo... yes
1701686298648,checking for mallopt M_ARENA_MAX... yes
1701686298668,checking for posix_fallocate... yes
1701686298684,checking for default visibility attribute... yes
1701686298699,checking for dllexport attribute... no
1701686299371,checking for thread_local support... yes
1701686299391,checking for gmtime_r... yes
1701686299414,checking for Linux getrandom function... yes
1701686299437,checking for getentropy via sys/random.h... yes
1701686299454,checking for sysctl... no
1701686299469,checking for sysctl KERN_ARND... no
1701686299498,checking for if type char equals int8_t... no
1701686299523,checking for fdatasync... yes
1701686299541,checking for F_FULLFSYNC... no
1701686299562,checking for O_CLOEXEC... yes
1701686299579,checking for __builtin_prefetch... yes
1701686299592,checking for _mm_prefetch... no
1701686299612,checking for strong getauxval support in the system headers... yes
1701686299659,checking for std::system... yes
1701686299684,checking for ::_wsystem... no
1701686299691,checking for Qt5Core >= 5.11.3... no
1701686299693,configure: WARNING: Qt5Core >= 5.11.3 not found; bitcoin-qt frontend will not be built
1701686299694,checking whether to build Bitcoin Core GUI... no
1701686299710,checking for sqlite3 >= 3.7.17... yes
1701686299710,checking whether to build wallet with support for sqlite... yes
1701686299723,"checking whether Userspace, Statically Defined Tracing tracepoints are supported... no"
1701686299768,checking for miniupnpc/miniupnpc.h... yes
1701686299814,checking for miniupnpc/upnpcommands.h... yes
1701686299861,checking for miniupnpc/upnperrors.h... yes
1701686299905,checking for upnpDiscover in -lminiupnpc... yes
1701686299914,checking whether miniUPnPc API version is supported... yes
1701686299941,checking for natpmp.h... no
1701686299966,checking for Boost headers >= 1.64.0 (106400)... yes
1701686299986,checking whether C++ preprocessor accepts -DBOOST_NO_CXX98_FUNCTION_BASE... yes
1701686306369,checking whether Boost.Process can be used... yes
1701686306374,checking for libevent >= 2.1.8... yes
1701686306379,checking for libevent_pthreads >= 2.1.8... yes
1701686306408,checking if evhttp_connection_get_peer expects const char**... no
1701686306413,checking for libmultiprocess... no
1701686306415,checking whether to build bitcoind... yes
1701686306415,checking whether to build bitcoin-cli... yes
1701686306415,checking whether to build bitcoin-tx... yes
1701686306415,checking whether to build bitcoin-wallet... yes
1701686306415,checking whether to build bitcoin-util... yes
1701686306415,checking whether to build experimental bitcoin-chainstate... no
1701686306416,checking whether to build libraries... yes
1701686306416,checking if ccache should be used... yes
1701686306549,checking whether C compiler accepts -fdebug-prefix-map=A=B... yes
1701686306560,checking whether C preprocessor accepts -fmacro-prefix-map=A=B... yes
1701686306560,checking if wallet should be enabled... yes
1701686306560,checking whether to build with support for UPnP... yes
1701686306560,checking whether to build with support for NAT-PMP... no
1701686306560,checking whether to build test_bitcoin... no
1701686306560,checking whether to reduce exports... no
1701686306587,checking that generated files are newer than configure... done
1701686306588,configure: creating ./config.status
1701686307056,config.status: creating libbitcoinconsensus.pc
1701686307064,config.status: creating Makefile
1701686307072,config.status: creating src/Makefile
1701686307108,config.status: creating doc/man/Makefile
1701686307119,config.status: creating share/setup.nsi
1701686307130,config.status: creating share/qt/Info.plist
1701686307141,config.status: creating test/config.ini
1701686307151,config.status: creating contrib/devtools/split-debug.sh
1701686307163,config.status: creating src/config/bitcoin-config.h
1701686307168,config.status: src/config/bitcoin-config.h is unchanged
1701686307209,config.status: executing depfiles commands
1701686307261,config.status: executing libtool commands
1701686307270,'"=== configuring in src/secp256k1 (/tmp/bitcoin/src/secp256k1)"
1701686307274,configure: running /bin/bash ./configure --disable-option-checking '--prefix=/usr/local'  '--enable-bench' '--disable-tests' '--disable-gui' '--disable-zmq' '--disable-fuzz' '--enable-fuzz-binary=no' 'BDB_LIBS=-L/tmp/bitcoin/depends/aarch64-unknown-linux-gnu/lib -ldb_cxx-4.8' 'BDB_CFLAGS=-I/tmp/bitcoin/depends/aarch64-unknown-linux-gnu/include' '--disable-shared' '--with-pic' '--enable-benchmark=no' '--enable-module-recovery' '--disable-module-ecdh' --cache-file=/dev/null --srcdir=.
1701686307402,checking build system type... aarch64-unknown-linux-gnu
1701686307402,checking host system type... aarch64-unknown-linux-gnu
1701686307408,checking for a BSD-compatible install... /usr/bin/install -c
1701686307410,checking whether build environment is sane... yes
1701686307416,checking for a race-free mkdir -p... /usr/bin/mkdir -p
1701686307416,checking for gawk... no
1701686307416,checking for mawk... mawk
1701686307421,checking whether make sets $(MAKE)... yes
1701686307425,checking whether make supports nested variables... yes
1701686307428,checking whether make supports nested variables... (cached) yes
1701686307428,checking for gcc... gcc
1701686307472,checking whether the C compiler works... yes
1701686307472,checking for C compiler default output file name... a.out
1701686307497,checking for suffix of executables... 
1701686307525,checking whether we are cross compiling... no
1701686307540,checking for suffix of object files... o
1701686307553,checking whether the compiler supports GNU C... yes
1701686307566,checking whether gcc accepts -g... yes
1701686307609,checking for gcc option to enable C11 features... none needed
1701686307633,checking whether gcc understands -c and -o together... yes
1701686307638,checking whether make supports the include directive... yes (GNU style)
1701686307661,checking dependency style of gcc... gcc3
1701686307683,checking dependency style of gcc... gcc3
1701686307683,checking for ar... ar
1701686307697,checking the archiver (ar) interface... ar
1701686307699,checking how to print strings... printf
1701686307702,checking for a sed that does not truncate output... /usr/bin/sed
1701686307704,checking for grep that handles long lines and -e... /usr/bin/grep
1701686307705,checking for egrep... /usr/bin/grep -E
1701686307706,checking for fgrep... /usr/bin/grep -F
1701686307709,checking for ld used by gcc... /usr/bin/ld
1701686307711,checking if the linker (/usr/bin/ld) is GNU ld... yes
1701686307712,checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
1701686307728,checking the name lister (/usr/bin/nm -B) interface... BSD nm
1701686307728,checking whether ln -s works... yes
1701686307731,checking the maximum length of command line arguments... 1966080
1701686307733,checking how to convert aarch64-unknown-linux-gnu file names to aarch64-unknown-linux-gnu format... func_convert_file_noop
1701686307733,checking how to convert aarch64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop
1701686307733,checking for /usr/bin/ld option to reload object files... -r
1701686307734,checking for objdump... objdump
1701686307734,checking how to recognize dependent libraries... pass_all
1701686307734,checking for dlltool... no
1701686307734,"checking how to associate runtime and link libraries... printf %s
"
1701686307754,checking for archiver @FILE support... @
1701686307754,checking for strip... strip
1701686307754,checking for ranlib... ranlib
1701686307803,checking command to parse /usr/bin/nm -B output from gcc object... ok
1701686307804,checking for sysroot... no
1701686307807,checking for a working dd... /usr/bin/dd
1701686307810,checking how to truncate binary pipes... /usr/bin/dd bs=4096 count=1
1701686307810,checking for mt... no
1701686307812,checking if : is a manifest tool... no
1701686307829,checking for stdio.h... yes
1701686307850,checking for stdlib.h... yes
1701686307872,checking for string.h... yes
1701686307894,checking for inttypes.h... yes
1701686307917,checking for stdint.h... yes
1701686307940,checking for strings.h... yes
1701686307962,checking for sys/stat.h... yes
1701686307984,checking for sys/types.h... yes
1701686308010,checking for unistd.h... yes
1701686308037,checking for dlfcn.h... yes
1701686308040,checking for objdir... .libs
1701686308098,checking if gcc supports -fno-rtti -fno-exceptions... no
1701686308098,checking for gcc option to produce PIC... -fPIC -DPIC
1701686308113,checking if gcc PIC flag -fPIC -DPIC works... yes
1701686308172,checking if gcc static flag -static works... yes
1701686308191,checking if gcc supports -c -o file.o... yes
1701686308191,checking if gcc supports -c -o file.o... (cached) yes
1701686308198,checking whether the gcc linker (/usr/bin/ld) supports shared libraries... yes
1701686308236,checking dynamic linker characteristics... GNU/Linux ld.so
1701686308236,checking how to hardcode library paths into programs... immediate
1701686308237,checking whether stripping libraries is possible... yes
1701686308237,checking if libtool supports shared libraries... yes
1701686308237,checking whether to build shared libraries... no
1701686308237,checking whether to build static libraries... yes
1701686308250,checking if gcc supports -Werror... yes
1701686308262,checking if gcc supports -std=c89 -pedantic -Wno-long-long -Wnested-externs -Wshadow -Wstrict-prototypes -Wundef... yes
1701686308274,checking if gcc supports -Wno-overlength-strings... yes
1701686308287,checking if gcc supports -Wall... yes
1701686308299,checking if gcc supports -Wno-unused-function... yes
1701686308312,checking if gcc supports -Wextra... yes
1701686308324,checking if gcc supports -Wcast-align... yes
1701686308337,checking if gcc supports -Wcast-align=strict... yes
1701686308356,checking if gcc supports -Wconditional-uninitialized... no
1701686308372,checking if gcc supports -Wreserved-identifier... no
1701686308383,checking if gcc supports -fvisibility=hidden... yes
1701686308397,checking for valgrind support... 
1701686308415,checking for x86_64 assembly availability... no
1701686308434,checking that generated files are newer than configure... done
1701686308434,configure: creating ./config.status
1701686308749,config.status: creating Makefile
1701686308758,config.status: creating libsecp256k1.pc
1701686308766,config.status: executing depfiles commands
1701686308777,config.status: executing libtool commands
1701686308784,Build Options:
1701686308784,  with external callbacks = no
1701686308784,  with benchmarks         = no
1701686308784,  with tests              = no
1701686308784,  with ctime tests        = no
1701686308784,  with coverage           = no
1701686308784,  with examples           = no
1701686308784,  module ecdh             = no
1701686308784,  module recovery         = yes
1701686308784,  module extrakeys        = yes
1701686308784,  module schnorrsig       = yes
1701686308784,  module ellswift         = yes
1701686308784,  asm                     = no
1701686308784,  ecmult window size      = 15
1701686308784,  ecmult gen prec. bits   = 4
1701686308784,  valgrind                = no
1701686308784,  CC                      = gcc
1701686308784,  CPPFLAGS                = 
1701686308784,  SECP_CFLAGS             = -O2  -std=c89 -pedantic -Wno-long-long -Wnested-externs -Wshadow -Wstrict-prototypes -Wundef -Wno-overlength-strings -Wall -Wno-unused-function -Wextra -Wcast-align -Wcast-align=strict -fvisibility=hidden 
1701686308784,  CFLAGS                  = -g -O2
1701686308784,  LDFLAGS                 = 
1701686308812,Options used to compile and link:
1701686308812,  external signer = yes
1701686308812,  multiprocess    = no
1701686308812,  with libs       = yes
1701686308812,  with wallet     = yes
1701686308812,    with sqlite   = yes
1701686308812,    with bdb      = yes
1701686308812,  with gui / qt   = no
1701686308812,  with zmq        = no
1701686308812,  with test       = no
1701686308812,  with fuzz binary = no
1701686308812,  with bench      = yes
1701686308812,  with upnp       = yes
1701686308812,  with natpmp     = no
1701686308812,  use asm         = yes
1701686308812,  USDT tracing    = no
1701686308812,  sanitizers      = 
1701686308812,  debug enabled   = no
1701686308812,  gprof enabled   = no
1701686308812,  werror          = no
1701686308812,  LTO             = no
1701686308812,  target os       = linux-gnu
1701686308812,  build os        = linux-gnu
1701686308812,  CC              = /usr/bin/ccache gcc
1701686308812,  CFLAGS          = -pthread -g -O2
1701686308812,  CPPFLAGS        =  -fmacro-prefix-map=$(abs_top_srcdir)=.  -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=3  -DHAVE_BUILD_INFO 
1701686308812,  CXX             = /usr/bin/ccache g++ -std=c++20
1701686308812,  CXXFLAGS        =   -fdebug-prefix-map=$(abs_top_srcdir)=.  -Wstack-protector -fstack-protector-all -fstack-clash-protection -mbranch-protection=bti  -Wall -Wextra -Wformat -Wformat-security -Wvla -Wredundant-decls -Wdate-time -Wduplicated-branches -Wduplicated-cond -Wlogical-op -Woverloaded-virtual -Wsuggest-override -Wimplicit-fallthrough  -Wno-unused-parameter    -fno-extended-identifiers -fstack-reuse=none -g -O2
1701686308812,"  LDFLAGS         =  -lpthread  -Wl,-z,relro -Wl,-z,now -Wl,-z,separate-code -pie   "
1701686308812,  AR              = /usr/bin/ar
1701686308812,  ARFLAGS         = cr

@maflcko
Copy link
Member

maflcko commented Dec 5, 2023

Same bench result here, compiling with clang, both master and this pull. My command:

$ grep 'make_bitcoin_core=' ~/.bashrc 
alias make_bitcoin_core="      ./autogen.sh && CC=clang CXX=clang++ ./configure -q --enable-c++20 --enable-experimental-util-chainstate --with-experimental-kernel-lib                                 && make clean && make -j $(nproc)"

@fanquake
Copy link
Member Author

fanquake commented Dec 6, 2023

Yes that is correct. Also the compiler installed is g++-11.

It would be useful to have compiler/version used/C{XX} flags/configure options (even if just mentioning it's the default) displayed next to the benchmarking information, as I assume this will always be one of the first questions asked in relation to benchmark output.

@fanquake
Copy link
Member Author

fanquake commented Dec 6, 2023

Rebased on master/the current C++20 PR, which should improve the CI here.

@aureleoules does the SonarCloud output always run on the latest changes? I've dropped all the inline usage from the constexpr functions in endian.h, but https://corecheck.dev/bitcoin/bitcoin/pulls/28674 still shows all the output about using inline with constexpr?

Maybe something was being cached, as this seems to have updated now.

@aureleoules
Copy link
Contributor

Maybe something was being cached, as this seems to have updated now.

Yes it may take some time for the sonarcloud worker to finish, as well as sonarcloud to refresh the results.
I'll improve the UX for that 😅

@theuni
Copy link
Member

theuni commented Dec 6, 2023

Hmm, playing around with this code on godbolt, it hardly ever compiles down to bswaps. I'm not sure what changed from when I was initially investigating. Will have a look.

@theuni
Copy link
Member

theuni commented Dec 6, 2023

Hmm, playing around with this code on godbolt, it hardly ever compiles down to bswaps. I'm not sure what changed from when I was initially investigating. Will have a look.

Ok, interesting, the problem is pretty easy to see here: https://godbolt.org/z/d5EGs8Ybs

GCC and MSVC both do the same thing: the optimization is done on the small static function, but it doesn't carry through when inlined. clang's output looks as expected. That explains why my test cases look good, but real world performance took a nosedive.

Also interesting, switching the function to a macro doesn't change anything.

@aureleoules How hard would it be to switch to using clang for a test-run to compare?

I wonder if this is something GCC would be interested in taking a look at. clang can obviously do the right thing.

@maflcko
Copy link
Member

maflcko commented Dec 6, 2023

clang can obviously do the right thing.

Are you sure? When I tried yesterday, it did not (#28674 (comment))

@theuni
Copy link
Member

theuni commented Dec 6, 2023

clang can obviously do the right thing.

Are you sure?

Absolutely not! I'm making this up as I go.
Edit: See the godbolt link for comparison though.

When I tried yesterday, it did not (#28674 (comment))

Thanks, I guess that's the test I just asked for.

Ok, I'll keep poking. But it seems like this approach likely won't work :(

@maflcko
Copy link
Member

maflcko commented Dec 6, 2023

Ok, I'll keep poking. But it seems like this approach likely won't work :(

What about hiding the function inside a compilation unit, where they are turned into bswap, and then rely on LTO to replace the call internal_bswap_64 with bswap? (Haven't tried this)

- use of std::endian means that we no longer have to rely on configure checks
and compiler macros for compile-time endianness checks.
- countl_zero replaces the need for __builtin_clz*
- trust the compiler to optimize bswaps and drop platform-dependent
  implementations
++ret;
}
return ret;
return (sizeof(x) * 8 )- std::countl_zero(x);
Copy link
Member

@maflcko maflcko Dec 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could split this up from the other changes?

nit: Also missing clang-format on this line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there are essentially 3 changes here:

  • using std::endian
  • using std::countl_zero
  • removing the non-standard byteswaps

The first two are useful regardless, though obviously much less so without doing the third.

I'll update this to break up the commits.

As an alternative to what's here, I'm going to try switching our byteswaps to compiler builtins for gcc/clang/msvc. That still leaves us with compiler-specific behavior, but that would at least be an improvement over the current required autoconf/cmake tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on what happens here, given we only use CountBits in two places, if this is going to basically become a std lib call, we could just use it directly where needed, and remove our CountBits unit/fuzz tests, so that we aren't nearly unit testing / fuzzing the standard library.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer to leave this with its comment. Note that we don't use std::counl_zero directly, we want kinda the opposite: (sizeof(x) * 8 )- std::countl_zero(x);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worked on a branch last week that left me very confused: https://github.com/theuni/bitcoin/commits/pr28674-rebase2/

It's the last commit (which I would expect to be a no-op) which causes the slowdown. For whatever reason, the built-in functions like be64toh are much faster than my hand-written internal ones which implement the same code that glibc does in its headers! I'm still scratching my head trying to work out what's happening.

But for now, now that we have c++20, I'll go ahead and PR the first two parts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer to leave this with its comment. Note that we don't use std::counl_zero directly, we want kinda the opposite: (sizeof(x) * 8 )- std::countl_zero(x);

Actually, as it turns out, std::bit_width is exactly what we want (and it's even described in terms of countl_zero). I'll see if it works to use your suggestion with that function instead.

@theuni
Copy link
Member

theuni commented Dec 8, 2023

Ok, I'll keep poking. But it seems like this approach likely won't work :(

What about hiding the function inside a compilation unit, where they are turned into bswap, and then rely on LTO to replace the call internal_bswap_64 with bswap? (Haven't tried this)

Hah, that's fun. Too fun in fact, so I'm not going to try because it would make things even more complicated if it worked :p
Because even if it worked, we couldn't possibly require LTO as a means of achieving basic performance.

@theuni
Copy link
Member

theuni commented Dec 8, 2023

Hmm. I have something working, but as usual, MSVC is the odd case. Looking at the current code, it's not clear to me how MSVC ends up doing byteswaps because we don't have any detection for it. But I also doubt that it's finding the functions in <byteswap.h>.

So I think it's possible that MSVC is always taking the slow path? If that's the case, fixing this here would give it a big speedup.

@hebasto could you check with MSVC? It should be easy to check with master with something like:

diff --git a/src/compat/byteswap.h b/src/compat/byteswap.h
index 9ee71ef267d..49e17c68828 100644
--- a/src/compat/byteswap.h
+++ b/src/compat/byteswap.h
@@ -43,6 +43,7 @@ inline uint32_t bswap_32(uint32_t x)
 #if HAVE_DECL_BSWAP_64 == 0
 inline uint64_t bswap_64(uint64_t x)
 {
+#error using slow path
      return (((x & 0xff00000000000000ull) >> 56)
           | ((x & 0x00ff000000000000ull) >> 40)
           | ((x & 0x0000ff0000000000ull) >> 24)

hebasto added a commit to hebasto/bitcoin that referenced this pull request Dec 8, 2023
@hebasto
Copy link
Member

hebasto commented Dec 8, 2023

@theuni

#define HAVE_DECL_BSWAP_64 0

@theuni
Copy link
Member

theuni commented Dec 8, 2023

@theuni

#define HAVE_DECL_BSWAP_64 0

Thanks! theuni@bbc7203 ready for PR if it indeed provides a speedup. I don't see how it couldn't :)

return x ? 8 * sizeof(unsigned long) - __builtin_clzl(x) : 0;
}
#endif
#if HAVE_BUILTIN_CLZLL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove this from configure.ac now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it. Will do in the updated PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be possible after we pull the changes from minisketch: bitcoin-core/minisketch#80.

@fanquake fanquake closed this Jan 17, 2024
fanquake added a commit that referenced this pull request Mar 1, 2024
86b7f28 serialization: use internal endian conversion functions (Cory Fields)
432b18c serialization: detect byteswap builtins without autoconf tests (Cory Fields)
297367b crypto: replace CountBits with std::bit_width (Cory Fields)
52f9bba crypto: replace non-standard CLZ builtins with c++20's bit_width (Cory Fields)

Pull request description:

  This replaces #28674, #29036, and #29057. Now ready for testing and review.

  Replaces platform-specific endian and byteswap functions. This is especially useful for kernel, as it means that our deep serialization code no longer requires bitcoin-config.h.

  I apologize for the size of the last commit, but it's hard to avoid making those changes at once.

  All platforms now use our internal functions rather than libc or platform-specific ones, with the exception of MSVC.

  Sadly, benchmarking showed that not all compilers are capable of detecting and optimizing byteswap functions, so compiler builtins are instead used where possible. However, they're now detected via macros rather than autoconf checks.

  This[ matches how libc++ implements std::byteswap for c++23](https://github.com/llvm/llvm-project/blob/main/libcxx/include/__bit/byteswap.h#L26).

  I suggest we move/rename `compat/endian.h`, but I left that out of this PR to avoid bikeshedding.

  #29057 pointed out some irregularities in benchmarks. After messing with various compilers and configs for a few weeks with these changes, I'm of the opinion that we can't win on every platform every time, so we should take the code that makes sense going forward. That said, if any real-world slowdowns are caused here, we should obviously investigate.

ACKs for top commit:
  maflcko:
    ACK 86b7f28 📘
  fanquake:
    ACK 86b7f28 - we can finish pruning out the __builtin_clz* checks/usage once the minisketch code has been updated. This is more good cleanup pre-CMake & for the kernal.

Tree-SHA512: 715a32ec190c70505ffbce70bfe81fc7b6aa33e376b60292e801f60cf17025aabfcab4e8c53ebb2e28ffc5cf4c20b74fe3dd8548371ad772085c13aec8b7970e
@fanquake fanquake deleted the cxx_20_endian branch March 8, 2024 10:17
@bitcoin bitcoin locked and limited conversation to collaborators Mar 8, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants