Skip to content

Conversation

MSP-Greg
Copy link
Member

@MSP-Greg MSP-Greg commented Dec 22, 2024

Description

Currently, code that processes the request env is contained in both client.rb and request.rb. Move the code to client_env.rb, and add an additional test file test_request.rb. This doesn't require creating a server.

Will move more tests to the file, some from test_puma_server.rb and some from test_request_invalid.rb.

Closes #3540

Your checklist for this pull request

  • I have reviewed the guidelines for contributing to this repository.
  • I have added (or updated) appropriate tests if this PR fixes a bug or adds a feature.
  • My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • I have updated the documentation accordingly.
  • All new and existing tests passed, including Rubocop.

@github-actions github-actions bot added the waiting-for-review Waiting on review from anyone label Dec 23, 2024
@@ -576,6 +576,7 @@ def lowlevel_error(e, env, status=500)

def response_to_error(client, requests, err, status_code)
status, headers, res_body = lowlevel_error(err, client.env, status_code)
res_body = ["Payload Too Large"] if status == 413
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double checking my understanding: So with this refactoring, the lowlevel_error is called (and potentially a user specified lowlevel_error_handler), which didn't happen before? In order to preserve the existing behaviour – the hard-coded response body Payload Too Large we set the body here

Question: Should we even call lowlevel_error at all when status_code is 413? If we didn't run the user specified lowlevel_error_handler before this refactoring, we probably shouldn't start doing it now.

A future (major?) Puma version could let the user specified lowlevel_error_handler to control the full response for the "Payload Too Large" scenario.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we introduce HttpParserError413 and raise that instead of HttpParserError (we already have HttpParserError501)? Then we can, in client_error, call prepare_response directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is the HttpParserError raise at https://github.com/puma/puma/pull/3582/files#r1897406430 problematic, in that either "Payload Too Large" or some other error can happen?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is @error_status_code important or can we get by without it? I mean, can we pass the status code when we raise HttpParserError? I think that is a clearer interface.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re adding HttpParserError413, I'm not sure about the idea of having error classes for every possible request error. That's why I added @error_status_code. We might need to add something like @error_status_message or something like it.

I didn't want this PR to touch error handling, but once I got into it, it couldn't be avoided.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about changing the code to:

def response_to_error(client, requests, err, status_code)
  # @todo remove sometime later
  if status_code == 413
    status = 413
    res_body = ["Payload Too Large"]
    headers = {}
  else
    status, headers, res_body = lowlevel_error(err, client.env, status_code)
  end
  prepare_response(status, headers, res_body, requests, client)
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that will do it


But we could also pass information via the error object? Instead of adding @error_status_message, HttpParserError could hold status code and the message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we could also pass information via the error object?

I didn't state one goal, of which this is the start. I think 'ruby' error handling can be totally contained in the new method Client#process_env_body. I'd like to contain it all within Client, except it will need the call to generate the response.

@dentarg
Copy link
Member

dentarg commented Dec 25, 2024

Re: 1f7a775 and 4f6b798, if done in one commit, git (on the command line, not github.com) can detect moved lines. That's really useful IMHO.

Comment on lines 291 to 334
def process_env_body
if above_http_content_limit(@parser.body.bytesize)
@http_content_length_limit_exceeded = true
@error_status_code = 413
end
temp = setup_body
normalize_env
req_env_post_parse
if @error_status_code
# @env[HTTP_CONNECTION] = 'close'
raise HttpParserError
end
temp
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can @error_status_code be anything other than 413 here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not with this PR. Future PR's that consolidate the request error handling will probably add more values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that will do it

I added it and rebased.

normalize_env
req_env_post_parse
if @error_status_code
# @env[HTTP_CONNECTION] = 'close'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this, or is it a future TODO so you want to keep the reminder for it?

@MSP-Greg
Copy link
Member Author

Questions about exception/error handling during request processing and up to and including calling the app.

I believe there are four types of exceptions:

A. A socket error.

B. An invalid request is submitted which does not meet the specs for a proper request.

C. A request is submitted which fails due to user defined constraints like http_content_length_limit or supported_http_methods.

D. The app raises an exception.

Questions:

1. When is/should low_level_errorcalled? The comment seems to state that it is only called when the app raises an exception. So, that would mean that only ‘D’ from above would trigger a low_level_error call? At present, I think some type ‘C’ errors are also calling low_level_error.

2. Previous discussions have been concerned with ‘finger printing’ that Puma is the web server. If so, should all ‘production’ error responses not include a response body/content?

3. Should users have control over which error types are logged? Or, should all be logged?

4. Similar to above, should users have control over which error types generate a response via the lowlevel_error_handler?

@nateberkopec
Copy link
Member

  1. woof, I always thought it was cases A/B/C 😆 I sort of always thought of it as puma_error_handler but maybe I'm wrong...
  2. I swear that was already an issue in the past... can't find it now. Yes, probably?
  3. I think we should allow the low_level_error handler to return false, people can filter stuff out in there if they want.
  4. What's the current contract for the return value of low_level_error?

@MSP-Greg
Copy link
Member Author

MSP-Greg commented Aug 12, 2025

I'm working on a significant update to this, currently cleaning up tests.

When I'm finished, I'll help review PR's.

@MSP-Greg MSP-Greg force-pushed the 00-client-reorg branch 4 times, most recently from d018dc6 to b1827cb Compare August 17, 2025 21:31
@MSP-Greg
Copy link
Member Author

Today's Puma has grown quite a bit from its origins. Over time, server.rb (and also test_puma_server.rb) have often been the 'catch all' files for new features.

Back in October of 2020 (PR #2419), request.rb was added, which extracted most of the processing of the request and the generation of the response from server.rb. The file was included in Server, but it made it easier to keep track of request/response processing, leaving the code in server.rb to handle starts/restarts/stops, event loop, interact with the ThreadPool, etc.

Client contains the HttpParser, which reads and processes the request line and headers into the env info passed to the app. If the request contains content/body, Client reads it, and decodes it if needed.

Much of the code to modify and validate the env passed to the app was scattered between Client and Request. The code also raised errors when validation failed.

Since the HttpParser is contained in Client, this PR moves all the request related code to Client. Much of it is contained in the client_env.rb file, which is included in Client.

By encapsulating all the request code in Client, it becomes easier to consider changes. Also, since we can pass an IO like object (with a single request) to Client.new, the testing for single requests doesn't require creating a Server instance. See test/test_request_single.rb for examples. This removed about 100 lines of code from test/test_puma_server.rb.

There many tests that create an app to return a response of request env properties, send a single request, and then check the response. These can moved to test/test_request_single.rb. This PR moves some of them, but more remain.

Although not contained in this PR, renaming request.rb to response.rb would clear up its purpose. Best left for a time immediately before a new release.

@MSP-Greg
Copy link
Member Author

Not sure whether to open an issue regarding this.

The issue is when should lowlevel_error_handler be called. I believe there are five cases:

  1. The request is invalid based on RFC's.
  2. The request exceeds hard-coded length/qty limits.
  3. The request exceeds user defined length/qty limits (example - request content/body max size).
  4. The app raises an error.
  5. The return of the app is invalid.

Note that requests that are invalid based on 2 & 3 may be 'valid' requests, the limits may be set because they could be malicious requests.

This PR isn't tagged v7, but the refactoring is extensive...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor waiting-for-review Waiting on review from anyone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

http_content_length_limit causes Puma::HttpParserError
3 participants