Skip to content

Conversation

stevenzzzz
Copy link
Contributor

replace FactoryContext with ServerFactoryContext in router/..., So that SRDS/RDS/VHDS subscription could outlive their creator listener.

Signed-off-by: Xin Zhuang stevenzzz@google.com

Description:replace FactoryContext with ServerFactoryContext in router/..., So that SRDS/RDS/VHDS subscription could outlive their creator listener.
This PR:

  • Let ListenerImpl only pass down the dynamic_init_manager_ it owns to its children entities (HCMConfig, SRDS-Subscription, RDS-Subscription etc).
  • Construct a tree like init-manager & target dependency graph, by adding a init-target at each listener-impl, srds-subscription, rds-subscription.
  • Replace FactoryContext with ServerFactoryContext, a SRDS/RDS subscription can out live its creator listener impl for real now.

Risk Level: Med
There is one behavior change, that previously during a listenerImpl's initManager initializing [which could be either Server::initManager or Listener::initManager], if there are multiple SRDS or RDS responses that contains sub-resource, their targets are added to the listener::initManager() and will bar on the listerner::initManager() ready. With this modification, the sub-resources contains in the second response of a SRDS/RDS may not be able to bar the local-init-manager's ready(), if the first response's sub-resources are initialized and called ready on the SRDS's local init-manger ready() aheader of the second's response's arrival.
before==> SRDS(A, B, C) bar Listener::initManager(),
now ===> SRDS::local_init_manager bars Listener::initManager(), and (A, B) bars on local_init_manager, (C) in a second response may or may not bar on Listener::initManager() depending on if (A,B) finish initialization before (C)'s arrival.
Testing: unit test
Docs Changes:
Release Notes:
[Optional Fixes #Issue]
[Optional Deprecated:]

…t SRDS/RDS/VHDS subscription could outlive their creator listener

Signed-off-by: Xin Zhuang <stevenzzz@google.com>
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General approach looks good, thanks, but I think there's still some issues around validation visitor inference with this PR.
/wait

@@ -21,6 +21,7 @@ ImmutableConfigProviderBase::~ImmutableConfigProviderBase() {

ConfigSubscriptionCommonBase::~ConfigSubscriptionCommonBase() {
init_target_.ready();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: please avoid gratuitous whitespace changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

: config_(new ConfigImpl(config, factory_context.getServerFactoryContext(),
factory_context.messageValidationVisitor(), true)),
: config_(new ConfigImpl(config, factory_context,
factory_context.messageValidationContext().staticValidationVisitor(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this still might not be correct; what if you have an inline route in an HCM that is delivered via a Listener in a LDS message? The validation should be dynamic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert it. This PR is supposed to be a child PR of #9621.

But we can also work on this PR to solve the problem in one shot. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we don't want FactoryContext to contain a messageValidationVisitor() method, I can do another cleanup PR which will pipe the validator from ListenerImpl --> ... --> HCM::Config.

parent_init_target_(fmt::format("RdsRouteConfigSubscription init {}", route_config_name_),
[this]() { local_init_manager_.initialize(init_watcher_); }),
init_watcher_(fmt::format("RDS local-init-watcher {}", rds.route_config_name()),
[this]() { parent_init_target_.ready(); }),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me and is roughly how I was envisaging transitive init managers working. @mergeconflict @lambdai did you have any thoughts here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Based on my understanding this field initialization order doesn't introduce initialized-after-detroy

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine, if it's needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

@htuch htuch self-assigned this Jan 24, 2020
Comment on lines 230 to 232
std::unique_ptr<Init::WatcherImpl> init_watcher_;
// A target is added to Server's InitManager if workers_started_ is false.
std::unique_ptr<Init::TargetImpl> listener_init_target_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the ListenerImpl is in destroyed, listener_init_target_ is destroyed prior to init_watcher_
Is it possible it triggered init_watcher_ in between?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ListenerImpl destruction is supposed to be an "atomic" operation in master thread, same as listener_init_target_'s triggering. I think it's fine here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the churn. I agree it's fine. Although for different reason :)
On top of ~ListenerImpl() there might be ListenerManagerImpl::startWorkers() if this listener is the last uninitialized listener. It's fine because this destroying listener won't in the start list

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK.

mergeconflict
mergeconflict previously approved these changes Jan 24, 2020
Copy link
Member

@mergeconflict mergeconflict left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just left a few comments here regarding init manager / target / handle usage (I didn't pay attention to anything else).

@@ -224,13 +225,15 @@ class ListenerImpl : public Network::ListenerConfig,
const uint64_t hash_;
ProtobufMessage::ValidationVisitor& validation_visitor_;

// This init watcher, if available, notifies the "parent" listener manager when listener
// initialization is complete. It may be reset to cancel interest.
std::unique_ptr<Init::WatcherImpl> init_watcher_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it was already like this, but I don't think this needs to be a unique_ptr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// initialization is complete. It may be reset to cancel interest.
std::unique_ptr<Init::WatcherImpl> init_watcher_;
// A target is added to Server's InitManager if workers_started_ is false.
std::unique_ptr<Init::TargetImpl> listener_init_target_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really need to be a unique_ptr either. I'd recommend always creating the target, so that you don't have to worry about ASSERT(listener_init_target_ != nullptr), and just add it to the parent's server's init manager if workers haven't been started.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Init::SharedTargetImpl parent_init_target_;

// Init watcher on RDS and VHDS ready event. This watcher marks parent_init_target_ ready.
Init::WatcherImpl init_watcher_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to call this parent_init_watcher_ or local_init_watcher_ (I'm not sure which) for consistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to config-provider-framework now. named to local_init_watcher_.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

latest init manager related changes still lgtm

Thanks!

parent_init_target_(fmt::format("RdsRouteConfigSubscription init {}", route_config_name_),
[this]() { local_init_manager_.initialize(init_watcher_); }),
init_watcher_(fmt::format("RDS local-init-watcher {}", rds.route_config_name()),
[this]() { parent_init_target_.ready(); }),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine, if it's needed.

Signed-off-by: Xin Zhuang <stevenzzz@google.com>
@stevenzzzz stevenzzzz force-pushed the final-look-longlive-srds branch from abc80ba to e8de6c1 Compare January 27, 2020 16:56
@stevenzzzz stevenzzzz changed the title WIP[listener, srds, rds, vhds]: replace FactoryContext with ServerFactoryContext in router/... listener, srds, rds, vhds: replace FactoryContext with ServerFactoryContext in router/... Jan 27, 2020
Signed-off-by: Xin Zhuang <stevenzzz@google.com>
mergeconflict
mergeconflict previously approved these changes Jan 27, 2020
Copy link
Member

@mergeconflict mergeconflict left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

init manager related changes lgtm, thanks!

Signed-off-by: Xin Zhuang <stevenzzz@google.com>
Signed-off-by: Xin Zhuang <stevenzzz@google.com>
Signed-off-by: Xin Zhuang <stevenzzz@google.com>
mergeconflict
mergeconflict previously approved these changes Jan 31, 2020
Copy link
Member

@mergeconflict mergeconflict left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

latest init manager related changes still lgtm

Signed-off-by: Xin Zhuang <stevenzzz@google.com>
Copy link
Contributor Author

@stevenzzzz stevenzzzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@junr03
Copy link
Member

junr03 commented Feb 6, 2020

@stevenzzzz sorry for the delay here. I have been a bit swamped these days. Will prioritize taking a look tomorrow.

Copy link
Member

@junr03 junr03 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for working on this. A few comments

@@ -55,6 +55,12 @@ class CommonFactoryContext {
*/
virtual const LocalInfo::LocalInfo& localInfo() const PURE;

/**
* @return ProtobufMessage::ValidationContext& validation visitor for xDS&static configuration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @return ProtobufMessage::ValidationContext& validation visitor for xDS&static configuration
* @return ProtobufMessage::ValidationContext& validation visitor for xDS and static configuration

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

*/
virtual RouteConfigProviderPtr
createStaticRouteConfigProvider(const envoy::config::route::v3::RouteConfiguration& route_config,
Server::Configuration::FactoryContext& factory_context) PURE;
Server::Configuration::ServerFactoryContext& factory_context,
ProtobufMessage::ValidationVisitor& validator) PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to read the rest of the PR, but given the change below which add messageValidationVisitor() to the CommonFactoryContext, why is this function also taking a validator as a parameter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevenzzzz I am still wondering about this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, missed this comment. I think you mean "add a validation visitor context"?
It's because we can't tell from the current ServerFactoryContext if the static validator or the dynamic validator should be used.
For example, a static Route config provider could be put into a dynamic(LDS) listener's config, in which case we should use the dynamic validator instead (per Harvey).
For that, we will have to pass validator alongside the Serverfatorycontext.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so only the code that is building the config provider knows if it should be using the static or dynamic visitor, but the constructor itself does not. Hence why the caller of createStaticRouteConfigProvider needs to pass the intended visitor.

Not to bike shed, but it seems that we have gone in one direction here, and the opposite in the ListernerImpl constructor where we used to pass the validator reference but now we decide inside the constructor. Should we pass an added_via_api boolean here and have the constructor decide?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, I'd like the validator choice made as close to the consumer entity(in this case, the listenerImpl) as possible, which provides better modularity.

The listener could make the decision by itself as there is a somewhat hacky paramter add_via_api in its constructor, the added_via_api information could be deducted by any RDS-config-provider or any subscription-initiated config-provider,
but not in static config provider: its context doesn't provide enough information for making the choice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we pass an added_via_api boolean here and have the constructor decide?

Right, that is why I was suggesting if we should hint the static config provider as well (albeit I have not looked if we have access to the same hint when we create this config provider, but my intuition tells me we do). I am not sure what the right answer here is. My point is that we seem to have gone in opposite directions where we could have the same setup.

Perhaps the next reviewer can state their preference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed this w/ Harvey before, passing the correct validator is preferred over hinting (passing the added_via_api signal). Agreed it's a preference thing as to if we want to keep all the places have the same interface.

Init::TargetImpl parent_init_target_;
// Watcher that marks parent_init_target_ ready when the local init manager is ready.
Init::WatcherImpl local_init_watcher_;
// Local manager that tracks the subscription's sub-resources(RDS subscriptions) initialization.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Local manager that tracks the subscription's sub-resources(RDS subscriptions) initialization.
// Local manager that tracks the subscription's sub-resources initialization. For example, RDS in LDS.

Because RDS is just one example, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -170,13 +171,14 @@ class ScopedRdsConfigSubscription : public Envoy::Config::DeltaConfigSubscriptio

// ScopedRouteInfo by scope name.
ScopedRouteMap scoped_route_map_;
// For creating RDS subscriptions.
Server::Configuration::ServerFactoryContext& factory_context_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did this move in order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert it. It's unnecessary now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping on the revert

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pong.
I think I updated this in another backup branch, oops.

@@ -392,18 +411,13 @@ void ListenerImpl::initialize() {
// per listener init manager. See ~ListenerImpl() for why we gate the onListenerWarmed() call
// by resetting the watcher.
if (workers_started_) {
dynamic_init_manager_.initialize(*init_watcher_);
// If workers_started_ is true, dynamic_init_manager_ should be initialized by listener manager
// directly..
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// directly..
// directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

@@ -309,7 +326,6 @@ ListenerImpl::~ListenerImpl() {
// a local init manager we should block the notification from trying to move us from warming to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I follow what happened here in terms of changing having different init managers and resetting this watch, vs having the new target and arrangement of firings.

However, there are some outdated comments throughout this file. The above for instance, as this destructor does not any longer reset the init_watcher_. Do you mind going through the file and deleting and updating comments. I think the process here is tricky and being extra crisp on what is going on would make the next person reading this have an easier time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, this is back!
I have a major git sync problem with master and had to resolve w/ upstream master one by one. I will do a around of comment cleanup.

Signed-off-by: Xin Zhuang <stevenzzz@google.com>
Signed-off-by: Xin Zhuang <stevenzzz@google.com>
Signed-off-by: Xin Zhuang <stevenzzz@google.com>
@stevenzzzz
Copy link
Contributor Author

At a high level this LGTM. @wgallagher do you mind reviewing just the xDS plumbing part of the change?

@stevenzzzz given how complicated this code is, can we potentially get an integration test that covers a very complex init case? IMO this shouldn't be that hard given all the plumbing we have for writing xDS integration tests?

/wait

PTAL
Sorry for the delayed reply, lots going on recently. I added a integration test w/ name listener_lds_integration_test.cc.

Signed-off-by: Xin Zhuang <stevenzzz@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for a ton for adding the integration test, nice work. Small comment nit and needs a format fix.

/wait

INSTANTIATE_TEST_SUITE_P(IpVersionsAndGrpcTypes, ListenerIntegrationTest,
GRPC_CLIENT_INTEGRATION_PARAMS);

// Tests that a LDS deletion before Server initManager been initialized will block the Server from
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/will block/will not block ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aah, that's the whole point of this test. :0

Signed-off-by: Xin Zhuang <stevenzzz@google.com>
@stevenzzzz
Copy link
Contributor Author

Thanks!

@mattklein123
Copy link
Member

CI issue looks legit. Can you merge master and take a look?

/wait

…al-look-longlive-srds

Signed-off-by: Xin Zhuang <stevenzzz@google.com>
Signed-off-by: Xin Zhuang <stevenzzz@google.com>
Signed-off-by: Xin Zhuang <stevenzzz@google.com>
@stevenzzzz
Copy link
Contributor Author

probably because of difference between libc++ and libstdc++, can't reproduce the error on my local box.

@stevenzzzz
Copy link
Contributor Author

/retest
not sure if the failing test is related to this PR.

@repokitteh-read-only
Copy link

🤷‍♀️ nothing to rebuild.

🐱

Caused by: a #9779 (comment) was created by @stevenzzzz.

see: more, trace.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mattklein123 mattklein123 merged commit 0957bfe into envoyproxy:master Mar 4, 2020
@stevenzzzz
Copy link
Contributor Author

stevenzzzz commented Mar 4, 2020 via email

@stevenzzzz stevenzzzz deleted the final-look-longlive-srds branch March 4, 2020 02:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants