
Conversation

soya3129 (Contributor)

Description: This change enables consuming, proxying, and inserting request metadata. The detailed implementation is described in source/docs/h2_metadata.md.

Risk Level: Low. Not run in production.
Testing: Integration test.
Docs Changes: source/docs/h2_metadata.md
#2394
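For illustration only (not part of the diff): a rough sketch of what a decoder filter that consumes and inserts request metadata might look like. The callback and type names below (decodeMetadata, addDecodedMetadata, MetadataMap, PassThroughDecoderFilter) are assumptions about the API shape, not verbatim from this PR; see source/docs/h2_metadata.md for the authoritative interface.

// Illustrative sketch only -- names are assumptions, not taken from this PR.
class MetadataSketchFilter : public PassThroughDecoderFilter {
public:
  // Consume incoming request metadata: drop a key so it is not proxied on.
  Http::FilterMetadataStatus decodeMetadata(Http::MetadataMap& metadata_map) override {
    metadata_map.erase("consumed-key");
    return Http::FilterMetadataStatus::Continue;
  }

  // Insert new request metadata to be forwarded toward the upstream.
  Http::FilterHeadersStatus decodeHeaders(Http::HeaderMap&, bool) override {
    auto new_map = std::make_unique<Http::MetadataMap>(
        Http::MetadataMap{{"injected-key", "injected-value"}});
    decoder_callbacks_->addDecodedMetadata().emplace_back(std::move(new_map));
    return Http::FilterHeadersStatus::Continue;
  }
};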

Yang Song added 20 commits December 3, 2018 20:13
Signed-off-by: Yang Song <yasong@google.com>

soya3129 commented Jan 18, 2019

Cc @birenroy

@@ -893,6 +906,13 @@ void ConnectionManagerImpl::ActiveStream::decodeData(ActiveStreamDecoderFilter*
ENVOY_STREAM_LOG(trace, "decode data called: filter={} status={}", *this,
static_cast<const void*>((*entry).get()), static_cast<uint64_t>(status));

if (!request_metadata_map_vector_.empty()) {
for (const auto& metadata_map : request_metadata_map_vector_) {
decodeMetadata((*entry).get(), *metadata_map);
Contributor Author:

In the request direction, it's more straightforward to pass the metadata to downstream filters only. But to be consistent with the response direction, shall I pass metadata through all filters? I think we may be able to do something similar in the response direction to enable a downstream-only filter pass. What do you think?

Contributor:

Yeah, I think I'd prefer consistency over a one-off.

Signed-off-by: Yang Song <yasong@google.com>
@alyssawilk (Contributor) left a comment:

Cool, this is a great start! I have some high-level comments - up to you which you want to address here and which you want to do in a follow-up. I'll do another full pass once you make a call either way :-)

// indicate the end of the stream.
if ((*entry)->end_stream_) {
Buffer::OwnedImpl empty_data("");
addDecodedData(*((*entry).get()), empty_data, true);
Contributor:
Mind adding a trace log here, to make it clear what we're doing?

I'm not sure if this plays well with ContinueAndEndStream where we set decoding_headers_only_ true. Can we do some tests with GrpcHttp1ReverseBridge to make sure we don't get a body payload but we do get an end stream correctly?

Contributor Author:

Apologies for the long response!

Added a trace log.

Great catch!! It wasn't working with decoding_headers_only_! To solve this problem, I tried to pass the empty data frame (added by addDecodedData) through the filters when decoding_headers_only_ is true, but failed. The main problem is that it's hard to distinguish this empty data frame from the data frames received from downstream (which we ignore because of decoding_headers_only_). We could add more conditions to allow the empty data frame to go through the filters here:

if (decoding_headers_only_) {

But then I can't think of a way to set end_stream to true in the empty data frame. I think complete() has to be true in order to set end_stream in the empty data frame (doData(complete() && !trailers());). However, if I set complete() to true, the newly received data frames from downstream will cause an assert failure (complete() being true means we are done with downstream and shouldn't receive more data).

A very hacky solution I used is to add an empty data frame in the router filter when end_stream in the headers is true and we still have metadata to send. The downside is that the data frame won't go through the filters (since the filters decided to process headers only, that might be OK). The upside is that it's simple.

But now the code is messy: if metadata is added in a headers-only request, the added empty data frame will go through the filters; if metadata is added with decoding_headers_only_ true, the added empty data will not. I could merge those two cases into one: only add the new empty data frame in the router filter. What do you think?
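Roughly, the router-filter version of the hack would look something like this (member and variable names are illustrative, not the exact diff):

// Illustrative sketch: if the request headers already signalled end_stream but
// metadata still has to be sent, encode the headers without end_stream, send
// the metadata, then append an empty data frame whose only purpose is to carry
// end_stream to the upstream codec.
if (headers_end_stream && !downstream_metadata_map_vector_.empty()) {
  request_encoder_->encodeHeaders(headers, /*end_stream=*/false);
  request_encoder_->encodeMetadata(downstream_metadata_map_vector_);
  downstream_metadata_map_vector_.clear();
  Buffer::OwnedImpl empty_data;
  request_encoder_->encodeData(empty_data, /*end_stream=*/true);
}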

About the test: GrpcHttp1ReverseBridge returns ContinueAndEndStream in encodeHeaders(), so I used decode-headers-only to test, which returns ContinueAndEndStream in decodeHeaders(). Did I miss anything?

Please let me know any suggestions/thoughts. Thanks!!

Contributor:

Hmmm. Could we bypass this problem if we allowed metadata to end the stream, or do you think it would just be shifting complexity around?

I'd be interested in the opinion of @lizan and @mattklein123 - the decoding_headers_only_ path was added for gRPC translations where we want to move the body into a header-only response, but now for metadata we're adding back a 0-byte body to pass on the end-stream, and we want to make sure we don't reintroduce the actual payload.

Contributor Author:

Metadata is sent as h2 extension frames. Some implementations may ignore or mishandle this frame type, so it could potentially cause problems if we allow metadata to carry end_stream. But if we think that is not a concern, I can change it to allow end_stream. If we do so, we may differ from the GFE implementation.

@@ -893,6 +906,13 @@ void ConnectionManagerImpl::ActiveStream::decodeData(ActiveStreamDecoderFilter*
ENVOY_STREAM_LOG(trace, "decode data called: filter={} status={}", *this,
static_cast<const void*>((*entry).get()), static_cast<uint64_t>(status));

if (!request_metadata_map_vector_.empty()) {
Contributor:
It's a bit of a bummer that we need these code blocks draining metadata in all these locations. I can't think of anything more elegant, but maybe @mattklein123 can?

Meanwhile, if we leave it in, it might be worth a utility function (moving the "add 0-byte data frame" logic out of the block above) just to minimize code in the already long functions.

Contributor Author:

Added a utility function. Thanks!

@@ -423,6 +426,7 @@ class ConnectionManagerImpl : Logger::Loggable<Logger::Id::http>,
absl::optional<Router::RouteConstSharedPtr> cached_route_;
absl::optional<Upstream::ClusterInfoConstSharedPtr> cached_cluster_info_;
DownstreamWatermarkCallbacks* watermark_callbacks_{nullptr};
MetadataMapVector request_metadata_map_vector_;
Contributor:

Mind commenting on what metadata ends up here and why?

Contributor Author:

Fixed. Thanks!

Member:

2 things for thought:

  1. Related to some of my other comments, can we potentially just be directly storing pointers to metadata maps so that we can avoid copies?
  2. I'm wondering if we want to actually have some type of metadata storage structure that is created on demand? So few people will be using this feature that I'm wondering if it's worth the memory and small init cost in the common case? Thoughts?

Contributor Author:

To answer 1: Storing metadata as a single unique_ptr instead of a MetadataMapVector (aka std::vector<unique_ptr>) would merge two metadata entries with the same key into one. This may cause problems when multiple copies of the same metadata are expected, for example for logging purposes.
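A tiny self-contained illustration of the difference, using stand-in aliases for the Envoy types (the aliases below are assumptions for the example, not the real headers):

#include <map>
#include <memory>
#include <string>
#include <vector>

// Stand-ins for the Envoy aliases discussed above.
using MetadataMap = std::map<std::string, std::string>;
using MetadataMapPtr = std::unique_ptr<MetadataMap>;
using MetadataMapVector = std::vector<MetadataMapPtr>;

int main() {
  // A single map merges duplicate keys: the second emplace of "trace" is dropped.
  MetadataMap merged;
  merged.emplace("trace", "span-1");
  merged.emplace("trace", "span-2"); // ignored, the key already exists

  // A vector of maps keeps both copies, e.g. so each one can be logged.
  MetadataMapVector vector;
  vector.push_back(std::make_unique<MetadataMap>(MetadataMap{{"trace", "span-1"}}));
  vector.push_back(std::make_unique<MetadataMap>(MetadataMap{{"trace", "span-2"}}));
  return 0;
}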

To answer 2: I think that's a good suggestion. Do you mind if I make that change in a follow-up PR? Added a TODO.

Member:

> To answer 2: I think that's a good suggestion. Do you mind if I make that change in a follow-up PR? Added a TODO.

I think that's OK, but will you do it relatively quickly? I think we need to be mindful of the performance/memory impact of rarely used code in the hottest paths.

Contributor Author:

I see. Fixed in this PR.
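Roughly the shape of the on-demand storage (names illustrative, not the exact diff):

// Illustrative sketch: the metadata vector is allocated lazily, so streams that
// never see metadata pay no extra memory or init cost.
MetadataMapVector& getRequestMetadataMapVector() {
  if (request_metadata_map_vector_ == nullptr) {
    request_metadata_map_vector_ = std::make_unique<MetadataMapVector>();
  }
  return *request_metadata_map_vector_;
}

// Member declaration, replacing the eagerly constructed vector:
std::unique_ptr<MetadataMapVector> request_metadata_map_vector_;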

Http::MetadataMapPtr metadata_map_ptr = std::make_unique<Http::MetadataMap>(metadata);

if (upstream_request_ == nullptr) {
ENVOY_STREAM_LOG(trace, "upstream_request_ not ready. Store metadata_map to encode later: {}",
Contributor:

Store -> Storing

Contributor Author:

Fixed. Thanks!

ENVOY_STREAM_LOG(trace, "Send metadata before sending data. {}", *parent_.callbacks_,
parent_.downstream_metadata_map_vector_);
request_encoder_->encodeMetadata(parent_.downstream_metadata_map_vector_);
parent_.downstream_metadata_map_vector_.clear();
Contributor:

Can we have a test for metadata and upstream retries? I think we're going to have to store this somewhere. Ditto for shadowing and internal redirects, where we call recreateStream.

I'm also OK TODOing either or both since both are complicated and this PR is already complicated. It's possible designing for them may inform how and where we store and clear frames so it's worth at least thinking about it now.

Happy to give you pointers to code or tests if you have trouble finding them.

Contributor Author:

Great point!! Thanks! I didn't think of those cases at all.

I am still working on this part. But I agree we should have a design for them before submitting this PR in case there are changes. Will update this thread when I make progress.

Contributor Author:

I have only looked at the internal redirect case so far. Since those cases all call recreateStream(), I think they can be solved similarly.

recreateStream() calls decodeHeaders() at the end of the function to go through all filters again:

new_stream.decodeHeaders(std::move(request_headers), true);

This allows us to add metadata again for the redirected request, so I think we may not need extra handling for the redirect case in Envoy.

The issue is in nghttp2. In the redirect case, metadata ends up in session->ob_reg: https://github.com/nghttp2/nghttp2/blob/d93842db3eb481fc6953ec70a81ed2f7863ffb45/lib/nghttp2_session.c#L2352, and the redirect headers end up in session->ob_syn: https://github.com/nghttp2/nghttp2/blob/d93842db3eb481fc6953ec70a81ed2f7863ffb45/lib/nghttp2_session.c#L2360. Because session->ob_reg is processed before session->ob_syn, metadata is received before the headers (no new stream has been created yet), which causes a failure.

I switched the positions of the code pointed to above, and I don't see any new test failures in nghttp2. A simple internal redirect test case with metadata also passed in Envoy (cl/231599606; if others are interested in the change, I can make a publicly available PR).

What do you think about making the nghttp2 change? If we go with this, we can submit this PR first and deal with shadowing, internal redirects, and retries after the change is pushed to nghttp2. But that may take longer; I notice Envoy still uses an old version of nghttp2.

Contributor Author:

To add more detail about why the nghttp2 fix (here) is needed in the redirect, etc. cases, but not in the regular case:

In the regular case, when we call encodeHeaders() in onPoolReady(), we submit the headers by calling sendPendingFrames() before encoding any other frames. Calling sendPendingFrames() makes sure the headers (session->ob_syn) are popped for processing first, because that's the only frame available. In the redirect case, sendPendingFrames() in encodeHeaders() is skipped because we are still dispatching (dispatching_ is true here), so we accumulate both headers and metadata. When sendPendingFrames() is executed at the end of dispatch(), metadata shows up first because it is in session->ob_reg (here), which is processed before session->ob_syn (here).

It's easiest, and I think it also makes sense, to fix this in nghttp2 (the syn frame should always be processed before regular frames?). If it takes too long, I will try to find some hacky way to fix it in Envoy.

Contributor:

As said, I'm totally fine dealing with this later; I'd just like a tracking TODO :-)

Contributor Author:

Fixed. Thanks!

Signed-off-by: Yang Song <yasong@google.com>
@mattklein123 (Member)

@alyssawilk is it your turn or mine? :)

@alyssawilk (Contributor)

alyssawilk commented Jun 12, 2019 via email

@mattklein123 (Member)

ACK. Will review. Sorry for the delay @soya3129.

@mattklein123 added the "no stalebot" label (disables stalebot from closing an issue) on Jun 13, 2019
@mattklein123 (Member)

@soya3129 sorry for the huge delay. Now that you are back from vacation can you merge master and I will review once done?

/wait

Yang Song and others added 5 commits June 25, 2019 00:24
Signed-off-by: Yang Song <yasong@yasong-macbookair.roam.corp.google.com>
Signed-off-by: Yang Song <yasong@google.com>
@mattklein123 (Member) left a comment:

Sorry for the super long delay on review. Very nice work. This LGTM and I'm really happy with how much we simplified this PR along the way. @alyssawilk do you want to take a final pass? I have only small comments at this point. Thank you!

/wait


for (; entry != decoder_filters_.end(); entry++) {
// If the filter pointed by entry has stopped for all frame type, stores metadata and return.
// If the filter pointed by entry hasn't returned from decodeHeaders, stores newly added
Member:

Sorry, when does this happen? Does this somehow happen if the filter adds metadata inside the headers callback? If so, can you clarify that in the comment?

Contributor Author:

That's right. Added comments to clarify.

@@ -203,13 +227,25 @@ class ConnectionManagerImpl : Logger::Loggable<Logger::Id::http>,
parent_.decodeData(this, *parent_.buffered_request_data_, end_stream,
ActiveStream::FilterIterationStartState::CanStartFromCurrent);
}
void doMetadata() override { drainSavedRequestMetadata(); }
MetadataMapVector& saved_metadata() override { return *getSavedRequestMetadata(); }
Member:

nit: savedMetadata()

Contributor Author:

Fixed. Thanks!

void doTrailers() override { parent_.decodeTrailers(this, *parent_.request_trailers_); }
const HeaderMapPtr& trailers() override { return parent_.request_trailers_; }

void drainSavedRequestMetadata() {
for (auto& metadata_map : *getSavedRequestMetadata()) {
Member:

Perhaps ASSERT that the storage vector pointer is not nullptr so we don't end up creating an empty vector anyway? Can you audit other calls to make sure we aren't accidentally creating a vector just to check that it's empty?
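Something along these lines (the member name saved_request_metadata_ is an assumption, not the exact code):

void drainSavedRequestMetadata() {
  // Only drain when metadata was actually saved; don't lazily create an empty
  // vector just to iterate over nothing.
  ASSERT(saved_request_metadata_ != nullptr);
  for (auto& metadata_map : *getSavedRequestMetadata()) {
    // ... pass each metadata_map on to the next filter ...
  }
  getSavedRequestMetadata()->clear();
}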

Contributor Author:

Good point! Fixed in both request and response direction.

Contributor:

Should we have a parallel assert in drainSavedResponseMetadata?

Contributor Author:

Fixed. Thanks!

@@ -521,6 +577,12 @@ class ConnectionManagerImpl : Logger::Loggable<Logger::Id::http>,
absl::optional<Router::RouteConstSharedPtr> cached_route_;
absl::optional<Upstream::ClusterInfoConstSharedPtr> cached_cluster_info_;
std::list<DownstreamWatermarkCallbacks*> watermark_callbacks_{};
// Stores metadata added in the decoding filter that is being processed. Will be cleared before
Member:

I notice that we have similar storage on a per-filter basis as well as on a connection basis. Is it possible to collapse that somehow? If not can you add some clarifying comments on why we need both types of storage? It's not immediately clear.

Contributor Author:

Added comments and removed unused variables.

@@ -1477,8 +1501,13 @@ void Filter::UpstreamRequest::onPoolReady(Http::StreamEncoder& request_encoder,
}

upstream_timing_.onFirstUpstreamTxByteSent(parent_.callbacks_->dispatcher().timeSource());

bool end_stream = !buffered_request_body_ && encode_complete_ && !encode_trailers_;
Member:

nit: const

Contributor Author:

Fixed. Thanks!

Signed-off-by: Yang Song <yasong@google.com>
@soya3129 (Contributor Author) left a comment:

Thanks for the review!!

Signed-off-by: Yang Song <yasong@google.com>
@alyssawilk (Contributor) left a comment:

Only minor nits from me. If we get all the nits addressed, I have a minor preference for merging next Monday over today, just because so many maintainers are out over the (for-the-States) holiday weekend.

commonDecodePrefix(filter, FilterIterationStartState::CanStartFromCurrent);

for (; entry != decoder_filters_.end(); entry++) {
// If the filter pointed by entry has stopped for all frame type, stores metadata and return.
Contributor:

return -> returns

Contributor Author:

Fixed.


mattklein123 previously approved these changes Jul 3, 2019
@mattklein123 (Member) left a comment:

LGTM modulo @alyssawilk comments. Will defer to her for final review/merge. Great work!

Signed-off-by: Yang Song <yasong@google.com>

soya3129 commented Jul 4, 2019

Thanks for the reviews!

@mattklein123 (Member)

Assuming we cut 1.11.0 next week, I'm tempted to hold this until after that. @soya3129 any objections?


soya3129 commented Jul 6, 2019

Nope! Thanks for letting me know.

@mattklein123 mattklein123 merged commit c41bfbd into envoyproxy:master Jul 11, 2019