optimization: Optimize IsSpace function for common non-whitespace characters #29602

l0rinc · 2024-03-08T16:11:47Z

The IsSpace function has been optimized for the more common case where a character is not whitespace. Previously, the function checked each whitespace character individually, which was not efficient for non-whitespace inputs as it required multiple comparisons.
This method is often used for parsing various inputs more flexibly, see usages.
This results in an additional ~2% performance improvement for base conversions (see related hexadecimal and base58 optimizations).

The updated IsSpace function now first checks if the character is less than or equal to ' ' (the space character, which has the highest ASCII value among the whitespace characters). This single condition can quickly determine that most characters (i.e. letters, numbers) are not whitespace, thus short-circuiting the evaluation for the most common case.

Otherwise the function performs additional checks to see if it is either a space character or within the range of horizontal tab to carriage return ('\t' to '\r'), which are the remaining whitespace characters in the ASCII table.

This change assumes that most calls to IsSpace are for non-whitespace characters, as can be inferred from the usage patterns where IsSpace is often used in loops that parse strings until a non-whitespace character is encountered.

To make sure the changes keep the previous functionality, I've added an exhaustive unit test for it.

DrahtBot · 2024-03-08T16:11:49Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	laanwj

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

maflcko · 2024-03-15T12:18:42Z

Is this used in a hot path that is relevant, to warrant the changes?

l0rinc · 2024-03-15T12:29:34Z

Is this used in a hot path that is relevant, to warrant the changes?

It's used in base 58 and 16 conversions while iterating over each character of the input.
We're expecting most of them to not be spaces, it would make sense to optimize for that scenario and exit early.

maflcko · 2024-03-15T12:32:46Z

Yes, but base58 isn't used in a hot path, where optimizing it would result in a noticeable difference. (When sending an address over RPC, the remaining overhead is too large to notice a speedup of a few nanoseconds)

l0rinc · 2024-03-15T12:38:56Z

It's a small improvement, indeed, but it eliminates useless work, done frequently.
The exhaustive tests make sure the behavior is exactly the same as before, so while it may be considered low-reward, it should be low-risk as well.
Is there any area (testing, documentation, code) that you would like me to improve to increase the risk/reward ratio?

laanwj · 2024-04-09T07:35:32Z

Is this really a bottleneck? A benchmark would be useful to see if this isn't something the compiler already does.

ACK on the test.

l0rinc · 2024-04-09T10:37:43Z

Thanks for the review @laanwj. The gain is small, but measurable:

make && ./src/bench/bench_bitcoin --filter=HexParse --min-time=10000

Before:

|           ns/base16 |            base16/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|                0.68 |    1,461,990,359.30 |    0.0% |     10.83 | `HexParse`
|                0.68 |    1,461,442,745.57 |    0.1% |     10.83 | `HexParse`
|                0.68 |    1,461,262,276.18 |    0.1% |     10.83 | `HexParse`

After:

|           ns/base16 |            base16/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|                0.67 |    1,495,985,893.94 |    0.0% |     10.82 | `HexParse`
|                0.67 |    1,494,916,055.36 |    0.1% |     10.82 | `HexParse`
|                0.67 |    1,494,545,662.66 |    0.0% |     10.82 | `HexParse`

which results in a speedup of ~2% for 130 non-whitespace hexadecimal characters:

(1,495,985,893.94 / 1,461,990,359.30) = 2.3%

…coding 07f6417 Reduce memory copying operations in bech32 encode (Lőrinc) d5ece3c Reserve hrp memory in Decode and LocateErrors (Lőrinc) Pull request description: Started optimizing the base conversions in [TryParseHex](#29458), [Base58](#29473) and [IsSpace](#29602) - this is the next step. Part of this change was already merged in #30047, which made decoding `~26%` faster. Here I've reduced the memory reallocations and copying operations in bech32 encode, making it `~15%` faster. > make && ./src/bench/bench_bitcoin --filter='Bech32Encode' --min-time=1000 Before: ``` | ns/byte | byte/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 19.97 | 50,074,562.72 | 0.1% | 1.06 | `Bech32Encode` ``` After: ``` | ns/byte | byte/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 17.33 | 57,687,668.20 | 0.1% | 1.10 | `Bech32Encode` ``` ACKs for top commit: josibake: ACK 07f6417 sipa: utACK 07f6417 achow101: ACK 07f6417 Tree-SHA512: 511885217d044ad7ef2bdf9203b8e0b94eec8b279bc193bb7e63e29ab868df6d21e9e4c7a24390358e1f9c131447ee42039df72edcf1e2b11e1856eb2b3e10dd

The IsSpace function has been optimized for the more common case where a character is not whitespace. Previously, the function checked each whitespace character individually, which was not efficient for non-whitespace inputs as it required multiple comparisons. The updated IsSpace function now first checks if the character is less than or equal to ' ' (the space character, which has the highest ASCII value among the whitespace characters). This single condition can quickly determine that most characters (those with ASCII values greater than ' ') are not whitespace, thus short-circuiting the evaluation for the most common case. If the character is less than or equal to ' ', the function performs additional checks to see if it is either a space character or within the range of horizontal tab to carriage return ('\t' to '\r'), which are the remaining whitespace characters in the ASCII table. This change assumes that most calls to IsSpace are for non-whitespace characters, as can be inferred from the usage patterns where IsSpace is often used in loops that parse strings until a non-whitespace character is encountered.

Github-Pull: bitcoin#29602 Rebased-From: 783e839

The IsSpace function has been optimized for the more common case where a character is not whitespace. Previously, the function checked each whitespace character individually, which was not efficient for non-whitespace inputs as it required multiple comparisons. The updated IsSpace function now first checks if the character is less than or equal to ' ' (the space character, which has the highest ASCII value among the whitespace characters). This single condition can quickly determine that most characters (those with ASCII values greater than ' ') are not whitespace, thus short-circuiting the evaluation for the most common case. If the character is less than or equal to ' ', the function performs additional checks to see if it is either a space character or within the range of horizontal tab to carriage return ('\t' to '\r'), which are the remaining whitespace characters in the ASCII table. This change assumes that most calls to IsSpace are for non-whitespace characters, as can be inferred from the usage patterns where IsSpace is often used in loops that parse strings until a non-whitespace character is encountered. Github-Pull: bitcoin#29602 Rebased-From: f11c063

DrahtBot added the Refactoring label Mar 8, 2024

l0rinc marked this pull request as ready for review March 8, 2024 17:21

l0rinc mentioned this pull request Mar 8, 2024

optimization: Speed up Base58 encoding/decoding by 400%/200% via preliminary byte packing #29473

Closed

l0rinc force-pushed the paplorinc/is_space_optimization branch 3 times, most recently from c15b5e2 to 210a280 Compare March 9, 2024 13:40

This was referenced Mar 9, 2024

refactor: Reduce memory copying operations in bech32 encoding #29607

Merged

refactor: Preallocate result in TryParseHex to avoid resizing #29458

Merged

l0rinc force-pushed the paplorinc/is_space_optimization branch from 210a280 to 3a47a61 Compare April 9, 2024 10:38

l0rinc force-pushed the paplorinc/is_space_optimization branch from 3a47a61 to e4e4c36 Compare May 11, 2024 19:58

l0rinc force-pushed the paplorinc/is_space_optimization branch 3 times, most recently from e0435d3 to 68829a4 Compare May 29, 2024 08:36

DrahtBot added the CI failed label Jun 21, 2024

l0rinc added 2 commits June 21, 2024 18:35

Add exhaustive range test for IsSpace

783e839

l0rinc force-pushed the paplorinc/is_space_optimization branch from 68829a4 to f11c063 Compare June 21, 2024 16:36

DrahtBot removed the CI failed label Jun 21, 2024

l0rinc changed the title ~~refactor: Optimize IsSpace function for common non-whitespace characters~~ optimization: Optimize IsSpace function for common non-whitespace characters Jun 23, 2024

l0rinc closed this Aug 18, 2024

luke-jr pushed a commit to luke-jr/bitcoin that referenced this pull request Jun 6, 2025

Add exhaustive range test for IsSpace

7dbb256

Github-Pull: bitcoin#29602 Rebased-From: 783e839

bitcoin locked and limited conversation to collaborators Aug 18, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

optimization: Optimize IsSpace function for common non-whitespace characters #29602

optimization: Optimize IsSpace function for common non-whitespace characters #29602

Uh oh!

l0rinc commented Mar 8, 2024 •

edited

Loading

Uh oh!

DrahtBot commented Mar 8, 2024 •

edited

Loading

Uh oh!

maflcko commented Mar 15, 2024

Uh oh!

l0rinc commented Mar 15, 2024

Uh oh!

maflcko commented Mar 15, 2024

Uh oh!

l0rinc commented Mar 15, 2024 •

edited

Loading

Uh oh!

laanwj commented Apr 9, 2024

Uh oh!

l0rinc commented Apr 9, 2024

Uh oh!

Uh oh!

optimization: Optimize IsSpace function for common non-whitespace characters #29602

optimization: Optimize IsSpace function for common non-whitespace characters #29602

Uh oh!

Conversation

l0rinc commented Mar 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DrahtBot commented Mar 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Coverage

Reviews

Uh oh!

maflcko commented Mar 15, 2024

Uh oh!

l0rinc commented Mar 15, 2024

Uh oh!

maflcko commented Mar 15, 2024

Uh oh!

l0rinc commented Mar 15, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

laanwj commented Apr 9, 2024

Uh oh!

l0rinc commented Apr 9, 2024

Uh oh!

Uh oh!

l0rinc commented Mar 8, 2024 •

edited

Loading

DrahtBot commented Mar 8, 2024 •

edited

Loading

l0rinc commented Mar 15, 2024 •

edited

Loading