Skip to content

HTTP/1 proxying: preserve header cases #1194

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Feb 22, 2017

Conversation

deweerdt
Copy link
Member

@deweerdt deweerdt commented Feb 9, 2017

This patch adds a new parameter: proxy.preserve-original-case. When
this parameter is set, we record the casing of the received headers
(both in the reponse and request), and when forwarding the request (or
the response), we apply the original casing.

@deweerdt
Copy link
Member Author

deweerdt commented Feb 9, 2017

@kazuho This PR isn't ready for merging: it's missing more testing, and it doesn't handle special headers like Connection: or Content-Length:. However I thought close enough to start to discuss the implementation.
My first intention was to keep the original buffer around and keep pointers to it, but on a second thought, it seemed it would be less memory consuming to keep the original casing in a bitfield as done here.
I don't see a good way to avoid the allocation in http1client.c, even if the feature is not in use, but other than that there should be no change of behavior if the option is off.
Any thoughts?

@deweerdt deweerdt force-pushed the proxy-preserve-original-header-casing branch 3 times, most recently from 2207a25 to 3cbf88d Compare February 9, 2017 03:51
This patch adds a new parameter: `proxy.preserve-original-case`. When
this parameter is set, we record the casing of the received headers
(both in the reponse and request), and when forwarding the request (or
the response), we apply the original casing.
- fix h2o_init_request to copy the casing information correctly
- simplify init_headers
@deweerdt deweerdt force-pushed the proxy-preserve-original-header-casing branch from c163cea to ec0eeb9 Compare February 9, 2017 16:16
Copy link
Member

@kazuho kazuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. I think this is a good work that improves interoperability.

I've left my comments in-line. Please let me know what you think.

@@ -46,7 +46,7 @@ typedef struct st_h2o_http1client_header_t {
typedef int (*h2o_http1client_body_cb)(h2o_http1client_t *client, const char *errstr);
typedef h2o_http1client_body_cb (*h2o_http1client_head_cb)(h2o_http1client_t *client, const char *errstr, int minor_version,
int status, h2o_iovec_t msg, h2o_http1client_header_t *headers,
size_t num_headers);
h2o_str_case_t **orig_hname_case, size_t num_headers);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about changing the argument: h2o_http1client_header_t *headers to h2o_header_t *?

Doing so, and tokenizing the names right after calling h2o_strtolower seems like the way we should go.

By doing so, costly comparisons of iterating through the headers in the later moments in the on_head function of lib/common/http1client.c can be optimized to use token comparisons (e.g. change h2o_memis(headers[i].name, headers[i].name_len, H2O_STRLIT("connection")) to headers[i].name == H2O_TOKEN_CONNECTION).

I do not think the additional amount of stack used by the list of h2o_header_t would be a issue, if we could leave the scope that allocates struct phr_header headers[100] before calling on_head.

include/h2o.h Outdated
/**
* If non-NULL, contains a bitfield indicating whether a character in @name was upper case
*/
h2o_str_case_t *orig_hname_case;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure if we want to save memory at the cost of spending extra cycles for building the header in the original case. Unless you have a need, lets just store a char * that points to the characters in the original case.

include/h2o.h Outdated
* a boolean flag if set to true, instructs the proxy to forward headers in the same case they were received
* (only active in the HTTP/1 case)
*/
int preserve_original_case;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's forget about making this a configuration directive.

I think always preserving the case of headers for requests originating from HTTP/1 protocol would be fine.

  storing the original case
- Pass h2o_header_t instead of h2o_http1client_header_t to on_head and
  informational_cb. This allows to get rid of h2o_http1client_header_t
  entirely, and perform the tokenization in http1client
- Free the original case in http1client.c:on_head
- Fix test labels
@deweerdt
Copy link
Member Author

deweerdt commented Feb 13, 2017

@kazuho thank you for your comments. I've addressed them in 486ec19, I believe. It seemed natural to remove h2o_http1client_header_t entirely, after doing the tokenization in http1client.c, let me know if that works for you.

Copy link
Member

@kazuho kazuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes! Adding a field to a struct that we use at so many places is a non-trivial amount of work, thank you very much.

I think we are on the right direction. I have left comments in-line.

include/h2o.h Outdated
@@ -1088,12 +1092,13 @@ ssize_t h2o_find_header_by_str(const h2o_headers_t *headers, const char *name, s
/**
* adds a header to list
*/
void h2o_add_header(h2o_mem_pool_t *pool, h2o_headers_t *headers, const h2o_token_t *token, const char *value, size_t value_len);
h2o_header_t *h2o_add_header(h2o_mem_pool_t *pool, h2o_headers_t *headers, const h2o_token_t *token, const char *value,
size_t value_len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding orig_hname as an argument instead of returning h2o_header_t * so that you could set the value in the caller?

The same goes for h2o_add_header_by_str as well.

include/h2o.h Outdated
/**
* The name of the header as originally received from the client
*/
const char *orig_hname;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the order of the fields should be name -> orig_hname -> value?

I think we should consider orig_hname as a first-class member of h2o_header_t (we need to always take care of the value!). In this respect, having fields for the name first makes more sense to me.

Also, it might be better to rename the field to orig_name. We do not need h since this struct represents a header.

@deweerdt deweerdt force-pushed the proxy-preserve-original-header-casing branch 2 times, most recently from 59c6951 to f3a907e Compare February 14, 2017 15:13
@deweerdt deweerdt force-pushed the proxy-preserve-original-header-casing branch from f3a907e to 130b457 Compare February 14, 2017 19:17
@deweerdt
Copy link
Member Author

@kazuho thank you for your comments. I believe i've addressed all your comments so far in 130b457

Copy link
Member

@kazuho kazuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the modification! I think we have become more error-prone thanks to the changes.

May I ask one more change before merging this to master?

const h2o_token_t *token;
char *orig_name;

orig_name = h2o_strdup(NULL, src_headers[i].name, src_headers[i].name_len).base;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about creating a memory pool as a local variable of the function (i.e. h2o_mem_pool_t pool; h2o_mem_init_pool(&pool) at the top scope of the function), and allocate memory from that pool instead of calling h2o_strdup (and free) for every header name?

I believe that that would be a good optimization considering the fact that h2o_mem_alloc_pool is super-fast for small objects.

We could also allocate headers of type h2o_headers_t using the pool (instead of h2o_headers_t *headers[100]).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, will do. Thanks!

@deweerdt
Copy link
Member Author

@kazuho e843a04 now uses a pool allocator in http1client.c::on_head. Thank you.

@kazuho kazuho merged commit 7912487 into h2o:master Feb 22, 2017
@kazuho
Copy link
Member

kazuho commented Feb 22, 2017

Thank you for the changes. Merged to master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants