Skip to content

Conversation

pinheadmz
Copy link
Member

@pinheadmz pinheadmz commented Mar 13, 2025

This is a major component of removing libevent as a dependency of the project

It is based on #30988 but only in the sense that it copies the Sockman class introduced in that PR. The p2p / Connman refactor isn't needed for HTTP and therefore this branch can be reviewed and merged independently of the p2p changes.

Commit strategy:

  • Import sockman.{h,cpp} from Split CConnman #30988
  • Assert current behavior of HTTP with additional functional tests, including copying from libevent's tests
  • Implement a few helper functions for strings, timestamps, etc needed by HTTP protocol
  • Isolate the existing libevent-based HTTP server in a namespace http_libevent
  • Implement HTTP in a new namespace http_bitcoin (the namespace manages duplicate HTTPRequest definitions, etc)
  • Switch bitcoind from the libevent server to the new server
  • Clean up (delete http_libevent)

I am currently seeing about a 10% speed up in the functional tests on my own arm/macos machine.

Integration testing:

I am testing the new HTTP server by forking projects that integrate with bitcoin via HTTP and running their integration tests with bitcoind built from this branch (on Github actions). I will continue adding integrations over time, and re-running these CI tests as this branch gets rebased:

@DrahtBot
Copy link
Contributor

DrahtBot commented Mar 13, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32061.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK romanz
Concept ACK laanwj, fjahr, w0xlt
Approach ACK vasild

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #32862 (rpc: use CScheduler for relocking wallet and remove RPCTimer by pinheadmz)
  • #32747 (Introduce SockMan ("lite"): low-level socket handling for HTTP by pinheadmz)
  • #32522 (util: C++20 ToIntegral() Improvement by w0xlt)
  • #32297 (bitcoin-cli: Add -ipcconnect option by ryanofsky)
  • #31929 (http: Make server shutdown more robust by hodlinator)
  • #31672 (rpc: add cpu_load to getpeerinfo by vasild)
  • #30988 (Split CConnman by vasild)
  • #29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)
  • #27731 (Prevent file descriptor exhaustion from too many RPC calls by fjahr)
  • #26812 (test: add end-to-end tests for CConnman and PeerManager by vasild)
  • #26022 (Add util::ResultPtr class by ryanofsky)
  • #25665 (refactor: Add util::Result failure values, multiple error and warning messages by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

  • “an unique id” -> “a unique id” [‘unique’ begins with a consonant sound, so use “a” instead of “an.”]

drahtbot_id_4_m

@pinheadmz pinheadmz mentioned this pull request Mar 13, 2025
@pinheadmz pinheadmz marked this pull request as draft March 13, 2025 19:37
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Debug: https://github.com/bitcoin/bitcoin/runs/38735177073

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

@laanwj
Copy link
Member

laanwj commented Mar 14, 2025

Concept ACK, nice work

@vasild
Copy link
Contributor

vasild commented Mar 14, 2025

Concept ACK

@fjahr
Copy link
Contributor

fjahr commented Mar 14, 2025

Concept ACK

My understanding from the low-level networking discussion at CoreDev was that this wouldn't build on top of sockman. I guess the devil is in the details but can you address that in what sense the current approach follows what was discussed there? Thanks!

@pinheadmz
Copy link
Member Author

My understanding from the low-level networking discussion at CoreDev was that this wouldn't build on top of sockman. I guess the devil is in the details but can you address that in what sense the current approach follows what was discussed there? Thanks!

Sure, by coredev I had already written most of this implementation (based on sockman) but the performance was bad, and that was part of the motivation behind the deep-dive talk. However, by the end of the week I had reviewed that code in person with smart attendees and not only improved the performance of my code but started to improve performance vs master branch as well! Those updates came in the days just after the deep-dive discussion.

SOME kind of sockman is needed to replace libevent. The one @vasild wrote does actually seem to work well for this purpose as well as for p2p, and it would be "nice" to only have to maintain one I/O loop structure in bitcoind. @theuni is investigating how a sockman for http could be optimized if it had no other purpose, and I think that is the kind of feedback that will help us decide which path to take.

@vasild
Copy link
Contributor

vasild commented Mar 19, 2025

SOME kind of sockman is needed to replace libevent ... it would be "nice" to only have to maintain one I/O loop structure in bitcoind.

💯

pinheadmz and others added 13 commits June 21, 2025 17:13
See https://www.rfc-editor.org/rfc/rfc7230#section-6.3.2

> A server MAY process a sequence of pipelined requests in
  parallel if they all have safe methods (Section 4.2.1 of [RFC7231]),
  but it MUST send the corresponding responses in the same order that
  the requests were received.

We choose NOT to process requests in parallel. They are executed in
the order recevied as well as responded to in the order received.
This prevents race conditions where old state may get sent in response
to requests that are very quick to process but were requested later on
in the queue.
This is a refactor to prepare for matching the API of HTTPRequest
definitions in both namespaces http_bitcoin and http_libevent. In
particular, to provide a consistent return type for GetRequestMethod()
in both classes.
These methods are called by http_request_cb() and are present in the
original http_libevent::HTTPRequest.
The original function is passed to libevent as a callback when HTTP
requests are received and processed. It wrapped the libevent request
object in a http_libevent::HTTPRequest and then handed that off to
bitcoin for basic checks and finally dispatch to worker threads.

In this commit we split the function after the
http_libevent::HTTPRequest is created, and pass that object to a new
function that maintains the logic of checking and dispatching.

This will be the merge point for http_libevent and http_bitcoin,
where HTTPRequest objects from either namespace have the same
downstream lifecycle.
The original function was already naturally split into two chunks:
First, we parse and validate the users' RPC configuration for IPs and
ports. Next we bind libevent's http server to the appropriate
endpoints.

This commit splits these chunks into two separate functions, leaving
the argument parsing in the common space of the module and moving the
libevent-specific binding into the http_libevent namespace.

A future commit will implement http_bitcoin::HTTPBindAddresses to
bind the validate list of endpoints by the new HTTP server.
@pinheadmz pinheadmz force-pushed the http-rewrite-13march2025 branch from aa89c97 to cdb13c9 Compare June 22, 2025 00:13
@pinheadmz pinheadmz force-pushed the http-rewrite-13march2025 branch from cdb13c9 to e531a7c Compare June 22, 2025 21:46
Copy link
Member Author

@pinheadmz pinheadmz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased to address comments by @vasild.

Also switched from #30988 ("Split CConnman") to #32747 ("SockMan lite") for the back end.

That also included a rebase on master where #32408 was merged, cherry picking some commits from this PR.

conn = http.client.HTTPConnection(urlNode2.hostname, urlNode2.port)
conn.connect()
sock = conn.sock
sock.settimeout(1)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in #32408

# The server should not respond to the fast, second request
# until the (very) slow first request has been handled:
res = sock.recv(1024)
assert not res
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in #32408

Comment on lines 162 to 169
body_chunked = [
b'{"method": "submitblock", "params": ["',
b'0A' * 1000000,
b'0B' * 1000000,
b'0C' * 1000000,
b'0D' * 1000000,
b'"]}'
]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in #32408

headers=headers_chunked,
encode_chunked=True)
out1 = conn.getresponse().read()
assert out1 == b'{"result":"high-hash","error":null}\n'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in #32408

Comment on lines 211 to 217
# definitely closed
try:
conn.request('GET', '/')
conn.getresponse()
# macos/linux windows
except (ConnectionResetError, ConnectionAbortedError):
pass
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in #32408

Comment on lines 704 to 716
void HTTPServer::CloseConnectionInternal(std::shared_ptr<HTTPClient>& client)
{
event_free(ev);
if (CloseConnection(client->m_node_id)) {
LogDebug(BCLog::HTTP, "Disconnected HTTP client %s (id=%d)\n", client->m_origin, client->m_node_id);
} else {
LogDebug(BCLog::HTTP, "Failed to disconnect non-existent HTTP client %s (id=%d)\n", client->m_origin, client->m_node_id);
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes thanks

src/httpserver.h Outdated
Comment on lines 177 to 189
// Set to true when we receive request data and set to false once m_send_buffer is cleared.
// Checked during DisconnectClients(). All of these operations take place in the Sockman I/O loop,
// however it may get set my a worker thread during an "optimistic send".
std::atomic_bool m_prevent_disconnect{false};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

src/httpserver.h Outdated
std::string Stringify() const;

private:
std::map<std::string, std::string, util::CaseInsensitiveComparator> m_map;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great feedback thanks. I made this change by first editing the "string: add CaseInsensitiveComparator" commit which is now "string: add CaseInsensitive{KeyEqual, Hash} for unordered map" and implements the two operators needed for the unordered map.

Comment on lines 314 to 318
std::optional<std::string_view> HTTPHeaders::Find(const std::string key) const
{
if (!InitHTTPAllowList())
return false;
const auto it = m_map.find(key);
if (it == m_map.end()) return std::nullopt;
return std::string_view(it->second);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that is another great catch thank you. Glad I wrote that as an extra commit! I'll pop it off, it was added in response to this review comment: #32061 (comment)

Comment on lines 453 to 462
auto content_length_value{m_headers.Find("Content-Length")};
if (!content_length_value) return true;

uint64_t content_length;
if (!ParseUInt64(content_length_value.value(), &content_length)) throw std::runtime_error("Cannot parse Content-Length value");

// Not enough data in buffer for expected body
if (reader.Left() < content_length) return false;

m_body = reader.ReadLength(content_length);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah its true, a client could send a request with a huge list of HTTP headers and an incomplete body and we would parse the headers over and over again from m_recv_buffer until the body was finally completed, when we finally pinch off a HTTPRequest and erase the data from the buffer.

There certainly is an optimization there, keeping an in-progress request attached to HTTPClient with a "where we left off" pointer. It sounds like you would be ok with seeing that in a follow-up PR but let me know if its blocking for you here.

@romanz
Copy link
Contributor

romanz commented Jun 29, 2025

Low-latency HTTP request benchmark improvement - compared with #32541 (comment) taking ~130ms per request:

$ ab -k -c 1 -n 100000 http://localhost:8332/rest/txfromblock/000000000000000000017bfd05b5fa367a424c4a565a4baf7950d9e8605df8ec-5000.bin

Document Path:          /rest/txfromblock/000000000000000000017bfd05b5fa367a424c4a565a4baf7950d9e8605df8ec-5000.bin
Document Length:        234 bytes

Concurrency Level:      1
Time taken for tests:   11.554 seconds
Complete requests:      100000
Failed requests:        0
Keep-Alive requests:    100000
Total transferred:      31700000 bytes
HTML transferred:       23400000 bytes
Requests per second:    8655.35 [#/sec] (mean)
Time per request:       0.116 [ms] (mean)
Time per request:       0.116 [ms] (mean, across all concurrent requests)
Transfer rate:          2679.44 [Kbytes/sec] received

@romanz
Copy link
Contributor

romanz commented Jun 29, 2025

The latency of 404 response has improved too:

$ ab -k -c 1 -n 100000 http://localhost:8332/rest/
...
Time taken for tests:   2.876 seconds
Complete requests:      100000
Failed requests:        0
Non-2xx responses:      100000
Keep-Alive requests:    100000
Total transferred:      9500000 bytes
HTML transferred:       0 bytes
Requests per second:    34774.44 [#/sec] (mean)
Time per request:       0.029 [ms] (mean)
Transfer rate:          3226.14 [Kbytes/sec] received

master branch:

$ ab -k -c 1 -n 100000 http://localhost:8332/rest/
...
Time taken for tests:   3.805 seconds
Complete requests:      100000
Failed requests:        0
Non-2xx responses:      100000
Keep-Alive requests:    100000
Total transferred:      11400000 bytes
HTML transferred:       0 bytes
Requests per second:    26277.76 [#/sec] (mean)
Time per request:       0.038 [ms] (mean)
Transfer rate:          2925.45 [Kbytes/sec] received

Copy link
Contributor

@romanz romanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested ACK e531a7c

@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 8, 2025

🐙 This pull request conflicts with the target branch and needs rebase.


/** Change logging level for libevent. */
void UpdateHTTPServerLogging(bool enable);
namespace http_bitcoin {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an idea, and not necessarily a suggestion for this PR since it could be a followup, but I was thinking since httpserver.h and httpserver.cpp files are basically rewritten here it could also be good to rename them to rpc/http/server.h and rpc/http/server.cpp and put them in a matching rpc::http namespace. Reasons for suggesting this:

  • To take some files out of the top level directory.
  • To choose a namespace name rpc::http that matches the directory path rpc/http.
  • To make space for a client library to be located at rpc/http/client.h rpc/http/client.cpp (for tools that may that want to use some bitcoin-cli functionality without shelling out to bitcoin-cli)
  • To make space for other RPC protocols to be implemented places like rpc/grpc/.
  • To make rpc/ and ipc/ layouts more consistent (since latter uses src/ipc/<protocol> structure).

One reason to not like this suggestion would be if you wouldn't consider other uses of the HTTP server to be RPC. But I would interpret "rpc" liberally and probably move the related files as well:

  • rest.h -> rpc/http/rest.h
  • rest.cpp -> rpc/http/rest.cpp
  • httprpc.h -> rpc/http/jsonrpc.h
  • httprpc.cpp -> rpc/http/jsonrpc.cpp

Again not pushing for this, just wanted to mention the idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants