Skip to content

Conversation

mrice32
Copy link
Member

@mrice32 mrice32 commented Sep 26, 2017

Add configurable tags for stats objects. As a part of this process, a few new concepts were added:

TagExtractor - an object that is used to extract a particular tag from stat names.
Metric - a general interface for metrics that provides identifying information.
Histograms - these existed in some form, but now there is an explicit class for them. Note: I didn't add a macro for the creation of these. This is mainly because it would be unused, but I am open to adding one if there's demand.

As a part of this addition, there is a set of default regexes. These have been tested, but I am especially interested in comments on the fragility of these. I'm somewhat new to writing regex, so please let me know if you find any issues.

This is still missing tests for the configuration and reading via a test sink. I will be adding these, but wanted to go ahead and get this out for review to get a head start.

@mrice32 mrice32 force-pushed the stats_tags branch 2 times, most recently from e4fac1f to a926107 Compare September 26, 2017 22:12
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just starting to look at this, it's a big PR. At a high level it looks good. Will have some more concrete feedback tomorrow.

* Updates the tag extracted name and the set of tags by extracting the tag represented by this
* TagExtractor. If the tag is not represented in the current tag_extracted_name, nothing will be
* modified.
* @param[in,out] tag_extracted_name name from which the tag will be removed if found to be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side comment: I don't think I've seen this @param syntax used before in Envoy, but as long as it's valid Doxygen then seems useful. I think to some extent this is implied by argument constness already though.

Copy link
Member Author

@mrice32 mrice32 Sep 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note on that (may be irrelevant considering we're going to make this the return value). A lack of const only tells you that the object will be written to. That's generally the default for non-const inputs (at least for simple stdlib containers and primitives). However, the tag_extracted_name's current state will change the logical behavior of the function (whether a tag is actually extracted) and the state will be modified, so it's [in,out].

* Updates the tag extracted name and the set of tags by extracting the tag represented by this
* TagExtractor. If the tag is not represented in the current tag_extracted_name, nothing will be
* modified.
* @param[in,out] tag_extracted_name name from which the tag will be removed if found to be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably find it cleaner to make the argument immutable and just generate a new tag extracted name. I don't think this is on any hot path so no performance issues from an additional allocation.

@@ -207,6 +278,11 @@ class StoreRoot : public Store {
virtual void addSink(Sink& sink) PURE;

/**
* Add an extractor to extract a portion of stats names as a tag.
*/
virtual void addTagExtractor(TagExtractor& tag_extractor) PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be const TagExtractor& since updateTags is const?

Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
* Updates the tag extracted name and the set of tags by extracting the tag represented by this
* TagExtractor. If the tag is not represented in the current tag_extracted_name, nothing will be
* modified.
* @param tag_extracted_name name from which the tag will be removed if found to be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename to name_with_tag_included or something, now that it's no longer the extracted name in the arg?

std::unordered_map<std::string, std::string> regex_mapping;

// cluster.<cluster_name>.
regex_mapping[CLUSTER_NAME] = "^cluster\\.((.*?)\\.)";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of ? here, given it follows a pattern that already matches anything? Also, .* is generally too powerful, since regexes will do longest match, and this will walk all over the next .. What you need is probably something like ^cluster\\.(([^.]*))\\.. Same comment applies below to other regexes.

Copy link
Member Author

@mrice32 mrice32 Sep 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ? is meant to make the quantifier ungreedy. I'm not sure if this regex syntax is different from others, but see http://www.cplusplus.com/reference/regex/ECMAScript/ . Your suggestion seems to work, but I believe the above works too. I'll move to the ^. way of doing it since that seems simpler.

"^http(?=\\.).*?\\.dynamodb\\..+?(\\.__partition_id=(\\w{7}))$";

// cluster.<route target cluster>.grpc.<grpc service>.
regex_mapping[GRPC_BRIDGE_SERVICE] = "^cluster(?=\\.).*?\\.grpc\\.((.+?)\\.)";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you trying to do this the (?= stuff here?

Copy link
Member Author

@mrice32 mrice32 Sep 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a look forward. Because it may be cluster.cluster_name.grpc or cluster.grpc depending on the order in which these are processed (or not enabled, etc), this ensures that there is a . after cluster, but does not "consume" the . in case there is just a single . between cluster and grpc.

@@ -70,6 +70,7 @@ void DynamoFilter::onEncodeComplete(const Buffer::Instance& data) {

std::string body = buildBody(encoder_callbacks_->encodingBuffer(), data);
if (!body.empty()) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

std::string ThreadLocalStoreImpl::getTagsForName(const std::string& name, std::vector<Tag>& tags) {
std::string tag_extracted_name = name;
for (const TagExtractor& tag_extractor : tag_extractors_) {
tag_extracted_name = tag_extractor.updateTags(tag_extracted_name, tags);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by this.. I would have naively thought we scan along the list until we find a match (either forward or backward), then stop with the tag extracted name and tag set we want to use. The loop here is a bit different, it seems to allow interaction between multiple tag rules, chaining. Can you elaborate on what the rationale is?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand what you're getting at, and I think that's a viable way of doing it, but the current design allows tags to be somewhat independent of one another, meaning each regex represents exactly one tag. If I were to add a new tag that would come later in the string after cluster.<cluster_name>. in the existing setup, I don't need to capture the cluster name as a part of that regex. Let's assume I want to capture every string that ends with _rq_\d{3} as a tag, which is similar to one of the existing tags. With your suggestion, I would need to write a different regex for each of the possible strings that precedes that (or make a single regex that captures all possibilities). It's obviously not combinatorial because most tags only show up in one place, but it seems less flexible overall, especially if you're trying to capture a common request qualifier (like response code) that comes at the end of a few differently prefixed metrics. I also think this would lead to a more complex regex grammar (more capture groups, possibly named). However, I do agree that the current approach can lead to unexpected ordering interactions. The existing tests ensure that the defaults work as expected both in forward iteration and back, and any custom additions have a defined ordering according to the config so that users can depend on a specified ordering.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, OK, this makes sense. You might want to comment on this somewhere if not already.

@@ -168,6 +177,7 @@ class TestIsolatedStoreImpl : public StoreRoot {

// Stats::StoreRoot
void addSink(Sink&) override {}
void addTagExtractor(const TagExtractor&) override {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the list of tag extractors just be a const as well and so we could share that directly?

Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very neat. Thanks. Some high level questions from a quick read through. Need to go more in depth.

bool used() const override { return data_.flags_ & RawStatData::Flags::Used; }
uint64_t value() const override { return data_.value_; }

// Stats::Metric
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably avoid duplication here by having a MetricImpl and then doing virtual inheritance. I don't feel that strongly either way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that makes sense.

};

typedef std::shared_ptr<Timer> TimerSharedPtr;

/**
* A histogram that captures values one at a time.
*/
class Histogram : public Metric {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Is it worth creating a full on histogram interface vs. just having Timer take some kind of flag as an indication to the stats sink? They are basically the same thing.

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to approach this might be to order the matching and match the most specific first. TBH, I find it hard to parse these regexes with the use of lookahead and non-greedy matching like this, but maybe this is just a subjective take and others don't mind. I'm cool either way, but I feel that there should be lots of tests covering the corner cases to make sure you get these right (I haven't looked through all tests yet).

This is a look forward. Because it may be cluster.cluster_name.grpc or cluster.grpc depending on the order in which these are processed (or not enabled, etc), this ensures that there is a . after cluster, but does not "consume" the . in case there is just a single . between cluster and grpc.


/**
* Updates the tag extracted name and the set of tags by extracting the tag represented by this
* TagExtractor. If the tag is not represented in the current tag_extracted_name, nothing will be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The literal tag_extracted_name now no longer is a thing in this comment, so all references to it should be updated. Also, it's not really an "update" operation anymore.

std::unordered_map<std::string, std::string> regex_mapping;

// cluster.<cluster_name>.
regex_mapping[CLUSTER_NAME] = "^cluster\\.(([^.]*)\\.)";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the non-greedy operator here might be fine as well, I wasn't aware of it until you pointed it out. If you have sufficient unit tests for the tag extraction, it should be pretty clear how it's behaving. It's probably clearer to write as a non-greedy regex if it works in this scenario.

* modified.
* @param tag_extracted_name name from which the tag will be removed if found to be
* represented in the name.
* @param tags list of tags updated with the tag name and value if found in the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to be extra clear here that you are concatenating and preserving existing list.

std::string ThreadLocalStoreImpl::getTagsForName(const std::string& name, std::vector<Tag>& tags) {
std::string tag_extracted_name = name;
for (const TagExtractor& tag_extractor : tag_extractors_) {
tag_extracted_name = tag_extractor.updateTags(tag_extracted_name, tags);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, OK, this makes sense. You might want to comment on this somewhere if not already.

// Add custom tags.
for (const envoy::api::v2::TagSpecifier& tag_specifier : bootstrap.stats_config().stats_tags()) {
add_tag(tag_specifier.tag_name(), tag_specifier.regex());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this duplication in the config validation implementation? I'm hoping we can avoid rewriting logic like this where possible. Can it be refactored if it is needed?

Property(&Stats::Metric::name, "prefix.dynamodb.operation.Get.upstream_rq_time_2xx"), _));
EXPECT_CALL(
stats_,
deliverTimingToSinks(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might help make the review easier if you can split out the timer/histogram changes from the tag extraction. LMK if that sounds (un)reasonable (I can share some git tips for doing surgery like this quickly).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to splitting that part out. Also, related to my other question as to whether we want to both with a full class hierarchy for histograms and timers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to split out the changes not directly related to tags (Metric, class hierarchy, and removing support for flushing histograms and timers without an explicit object) as the "first stage". Once that gets merged, we can review the regex and tags portion that relies on the first stage. Does that plan seem reasonable to y'all?

And in reference to your comment @mattklein123, I agree that we should condense them into a single class in the interest of code bloat. Still thinking about how to do this. Conceptually it seems like histogram is really the more general of the two types. Either way, it makes sense that a simple flag or enum could differentiate between them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was just thinking about this a bit and it gets ugly across a few different dimensions:

  • should histograms/timers be in different stat namespaces (What if you have "foo" timer and "foo" histogram).
  • the existence of the timespan class and allocateSpan() method

It might be better to just treat them as different things and do what you did. :(

The only other thing I can think of would be to:

  • Rename Timer -> Histogram
  • Remove Timespan class / allocateSpan() (this could be made into a helper class to snap a time and deliver it to a histogram when complete() is called).
  • Have some method on a Histogram which tells what the type is (time, raw value, etc.)

I kind of like ^ but it would be a bunch of work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was running into a little bit of both of those. I was going in the direction of your suggestion. I think it's the cleanest implementation. It might not be too bad. Timespan actually wouldn't have to change much (or at all). However, users would just have to directly construct a TimespanImpl, which adds the dependence on stats_impl for them instead of just using the stats interface classes. Not sure if that matters too much, though.

As for the namespace issue, I think that's messier. On one hand, we won't be sparing ourselves as much code if we still have to pipe them around separately in all of the stats management classes because they're considered to be distinct. On the other hand, if they're in the same namespace, we would probably still make separate macros for them as to set the interpretation flag correctly. This could definitely cause confusion in that that they would be the only two macros that can collide, and a collision would, AFAIU, happen silently and create a totally nonsensical combined output.

I'll give it a little more thought.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, users would just have to directly construct a TimespanImpl, which adds the dependence on stats_impl for them instead of just using the stats interface classes. Not sure if that matters too much, though.

I think you could stick this in its own file, or even put it at the interface level since it's so tiny and just depends on the interface.

re: namespacing, IMO it's fine to have them in the same namespace. I agree it's confusing. I think you could likely treat a mismatch as a fatal error as its a programming error IMO that should be caught during tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

@mrice32
Copy link
Member Author

mrice32 commented Oct 2, 2017

Closing. Will introduce as two separate PRs as @htuch and @mattklein123 have suggested.

@mrice32 mrice32 closed this Oct 2, 2017
htuch pushed a commit that referenced this pull request Oct 6, 2017
Tags will require all stats have non-trivial state associated with them. This enforces that all stats are represented as objects before being exported. As a part of this addition, there are a few structural changes to the way stats work in Envoy:

* The Metric calss has been added as a base class for all stats objects. This will contain identifying information.
* The Histogram class has replaced the Timer class. This object represents histograms of any data that can be represented by uint64_t. It also contains a method type() to identify the type of data represented (this includes Duration).
* Timespan has been moved into an independent class, which will be created with a reference Histogram.
* HISTOGRAM macros have been added to create Integer histogram objects.
This is the first stage of what was previously #1751 with some modifications based on the discussion in that PR.

Signed-off-by: Matt Rice <mattrice@google.com>
@mrice32 mrice32 deleted the stats_tags branch May 23, 2018 01:30
rshriram pushed a commit to rshriram/envoy that referenced this pull request Oct 30, 2018
* Optimize Referenced::Signature function

* Update per comment

* Add a string map sub key test

* rename CheckAbsenceKeys to CheckAbsentKeys
jpsim pushed a commit that referenced this pull request Nov 28, 2022
…1751)

Description: enable tests ensuring that the Connection header is returned with the response header
Risk Level: N/A
Testing: enables tests
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Charles Le Borgne <cleborgne@google.com>
Signed-off-by: JP Simard <jp@jpsim.com>
jpsim pushed a commit that referenced this pull request Nov 29, 2022
…1751)

Description: enable tests ensuring that the Connection header is returned with the response header
Risk Level: N/A
Testing: enables tests
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Charles Le Borgne <cleborgne@google.com>
Signed-off-by: JP Simard <jp@jpsim.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants