Add tags for stats extracted by configurable regexes. #1751

mrice32 · 2017-09-26T21:56:15Z

Add configurable tags for stats objects. As a part of this process, a few new concepts were added:

TagExtractor - an object that is used to extract a particular tag from stat names.
Metric - a general interface for metrics that provides identifying information.
Histograms - these existed in some form, but now there is an explicit class for them. Note: I didn't add a macro for the creation of these. This is mainly because it would be unused, but I am open to adding one if there's demand.

As a part of this addition, there is a set of default regexes. These have been tested, but I am especially interested in comments on the fragility of these. I'm somewhat new to writing regex, so please let me know if you find any issues.

This is still missing tests for the configuration and reading via a test sink. I will be adding these, but wanted to go ahead and get this out for review to get a head start.

htuch

Just starting to look at this, it's a big PR. At a high level it looks good. Will have some more concrete feedback tomorrow.

htuch · 2017-09-28T03:08:07Z

include/envoy/stats/stats.h

+   * Updates the tag extracted name and the set of tags by extracting the tag represented by this
+   * TagExtractor. If the tag is not represented in the current tag_extracted_name, nothing will be
+   * modified.
+   * @param[in,out] tag_extracted_name name from which the tag will be removed if found to be


Side comment: I don't think I've seen this @param syntax used before in Envoy, but as long as it's valid Doxygen then seems useful. I think to some extent this is implied by argument constness already though.

Just a note on that (may be irrelevant considering we're going to make this the return value). A lack of const only tells you that the object will be written to. That's generally the default for non-const inputs (at least for simple stdlib containers and primitives). However, the tag_extracted_name's current state will change the logical behavior of the function (whether a tag is actually extracted) and the state will be modified, so it's [in,out].

htuch · 2017-09-28T03:10:39Z

include/envoy/stats/stats.h

+   * Updates the tag extracted name and the set of tags by extracting the tag represented by this
+   * TagExtractor. If the tag is not represented in the current tag_extracted_name, nothing will be
+   * modified.
+   * @param[in,out] tag_extracted_name name from which the tag will be removed if found to be


I'd probably find it cleaner to make the argument immutable and just generate a new tag extracted name. I don't think this is on any hot path so no performance issues from an additional allocation.

htuch · 2017-09-28T03:15:25Z

include/envoy/stats/stats.h

@@ -207,6 +278,11 @@ class StoreRoot : public Store {
  virtual void addSink(Sink& sink) PURE;

  /**
+   * Add an extractor to extract a portion of stats names as a tag.
+   */
+  virtual void addTagExtractor(TagExtractor& tag_extractor) PURE;


Can this be const TagExtractor& since updateTags is const?

Signed-off-by: Matt Rice <mattrice@google.com>

htuch · 2017-09-28T21:55:37Z

include/envoy/stats/stats.h

+   * Updates the tag extracted name and the set of tags by extracting the tag represented by this
+   * TagExtractor. If the tag is not represented in the current tag_extracted_name, nothing will be
+   * modified.
+   * @param tag_extracted_name name from which the tag will be removed if found to be


Maybe rename to name_with_tag_included or something, now that it's no longer the extracted name in the arg?

htuch · 2017-09-28T22:09:33Z

source/common/config/well_known_names.cc

+  std::unordered_map<std::string, std::string> regex_mapping;
+
+  // cluster.<cluster_name>.
+  regex_mapping[CLUSTER_NAME] = "^cluster\\.((.*?)\\.)";


What is the purpose of ? here, given it follows a pattern that already matches anything? Also, .* is generally too powerful, since regexes will do longest match, and this will walk all over the next .. What you need is probably something like ^cluster\\.(([^.]*))\\.. Same comment applies below to other regexes.

The ? is meant to make the quantifier ungreedy. I'm not sure if this regex syntax is different from others, but see http://www.cplusplus.com/reference/regex/ECMAScript/ . Your suggestion seems to work, but I believe the above works too. I'll move to the ^. way of doing it since that seems simpler.

htuch · 2017-09-28T22:15:58Z

source/common/config/well_known_names.cc

+      "^http(?=\\.).*?\\.dynamodb\\..+?(\\.__partition_id=(\\w{7}))$";
+
+  // cluster.<route target cluster>.grpc.<grpc service>.
+  regex_mapping[GRPC_BRIDGE_SERVICE] = "^cluster(?=\\.).*?\\.grpc\\.((.+?)\\.)";


What are you trying to do this the (?= stuff here?

This is a look forward. Because it may be cluster.cluster_name.grpc or cluster.grpc depending on the order in which these are processed (or not enabled, etc), this ensures that there is a . after cluster, but does not "consume" the . in case there is just a single . between cluster and grpc.

htuch · 2017-09-28T22:17:32Z

source/common/dynamo/dynamo_filter.cc

@@ -70,6 +70,7 @@ void DynamoFilter::onEncodeComplete(const Buffer::Instance& data) {

  std::string body = buildBody(encoder_callbacks_->encodingBuffer(), data);
  if (!body.empty()) {
+


htuch · 2017-09-28T22:28:31Z

source/common/stats/thread_local_store.cc

+std::string ThreadLocalStoreImpl::getTagsForName(const std::string& name, std::vector<Tag>& tags) {
+  std::string tag_extracted_name = name;
+  for (const TagExtractor& tag_extractor : tag_extractors_) {
+    tag_extracted_name = tag_extractor.updateTags(tag_extracted_name, tags);


I'm a bit confused by this.. I would have naively thought we scan along the list until we find a match (either forward or backward), then stop with the tag extracted name and tag set we want to use. The loop here is a bit different, it seems to allow interaction between multiple tag rules, chaining. Can you elaborate on what the rationale is?

I think I understand what you're getting at, and I think that's a viable way of doing it, but the current design allows tags to be somewhat independent of one another, meaning each regex represents exactly one tag. If I were to add a new tag that would come later in the string after cluster.<cluster_name>. in the existing setup, I don't need to capture the cluster name as a part of that regex. Let's assume I want to capture every string that ends with _rq_\d{3} as a tag, which is similar to one of the existing tags. With your suggestion, I would need to write a different regex for each of the possible strings that precedes that (or make a single regex that captures all possibilities). It's obviously not combinatorial because most tags only show up in one place, but it seems less flexible overall, especially if you're trying to capture a common request qualifier (like response code) that comes at the end of a few differently prefixed metrics. I also think this would lead to a more complex regex grammar (more capture groups, possibly named). However, I do agree that the current approach can lead to unexpected ordering interactions. The existing tests ensure that the defaults work as expected both in forward iteration and back, and any custom additions have a defined ordering according to the config so that users can depend on a specified ordering.

Yeah, OK, this makes sense. You might want to comment on this somewhere if not already.

htuch · 2017-09-28T22:29:46Z

test/integration/server.h

@@ -168,6 +177,7 @@ class TestIsolatedStoreImpl : public StoreRoot {

  // Stats::StoreRoot
  void addSink(Sink&) override {}
+  void addTagExtractor(const TagExtractor&) override {}


Wouldn't the list of tag extractors just be a const as well and so we could share that directly?

Signed-off-by: Matt Rice <mattrice@google.com>

mattklein123

This is very neat. Thanks. Some high level questions from a quick read through. Need to go more in depth.

mattklein123 · 2017-09-29T16:51:46Z

source/common/stats/stats_impl.h

+  bool used() const override { return data_.flags_ & RawStatData::Flags::Used; }
+  uint64_t value() const override { return data_.value_; }
+
+  // Stats::Metric


You can probably avoid duplication here by having a MetricImpl and then doing virtual inheritance. I don't feel that strongly either way.

Yeah, I think that makes sense.

mattklein123 · 2017-09-29T16:55:26Z

include/envoy/stats/stats.h

 };

 typedef std::shared_ptr<Timer> TimerSharedPtr;

 /**
+ * A histogram that captures values one at a time.
+ */
+class Histogram : public Metric {


Question: Is it worth creating a full on histogram interface vs. just having Timer take some kind of flag as an indication to the stats sink? They are basically the same thing.

htuch

Another way to approach this might be to order the matching and match the most specific first. TBH, I find it hard to parse these regexes with the use of lookahead and non-greedy matching like this, but maybe this is just a subjective take and others don't mind. I'm cool either way, but I feel that there should be lots of tests covering the corner cases to make sure you get these right (I haven't looked through all tests yet).

This is a look forward. Because it may be cluster.cluster_name.grpc or cluster.grpc depending on the order in which these are processed (or not enabled, etc), this ensures that there is a . after cluster, but does not "consume" the . in case there is just a single . between cluster and grpc.

htuch · 2017-10-02T02:10:50Z

include/envoy/stats/stats.h

+
+  /**
+   * Updates the tag extracted name and the set of tags by extracting the tag represented by this
+   * TagExtractor. If the tag is not represented in the current tag_extracted_name, nothing will be


The literal tag_extracted_name now no longer is a thing in this comment, so all references to it should be updated. Also, it's not really an "update" operation anymore.

htuch · 2017-10-02T02:16:16Z

source/common/config/well_known_names.cc

+  std::unordered_map<std::string, std::string> regex_mapping;
+
+  // cluster.<cluster_name>.
+  regex_mapping[CLUSTER_NAME] = "^cluster\\.(([^.]*)\\.)";


Using the non-greedy operator here might be fine as well, I wasn't aware of it until you pointed it out. If you have sufficient unit tests for the tag extraction, it should be pretty clear how it's behaving. It's probably clearer to write as a non-greedy regex if it works in this scenario.

htuch · 2017-10-02T02:28:27Z

include/envoy/stats/stats.h

+   * modified.
+   * @param tag_extracted_name name from which the tag will be removed if found to be
+   * represented in the name.
+   * @param tags list of tags updated with the tag name and value if found in the


You might want to be extra clear here that you are concatenating and preserving existing list.

htuch · 2017-10-02T02:29:26Z

source/common/stats/thread_local_store.cc

+std::string ThreadLocalStoreImpl::getTagsForName(const std::string& name, std::vector<Tag>& tags) {
+  std::string tag_extracted_name = name;
+  for (const TagExtractor& tag_extractor : tag_extractors_) {
+    tag_extracted_name = tag_extractor.updateTags(tag_extracted_name, tags);


Yeah, OK, this makes sense. You might want to comment on this somewhere if not already.

htuch · 2017-10-02T02:33:55Z

source/server/config_validation/server.cc

+  // Add custom tags.
+  for (const envoy::api::v2::TagSpecifier& tag_specifier : bootstrap.stats_config().stats_tags()) {
+    add_tag(tag_specifier.tag_name(), tag_specifier.regex());
+  }


Do we need this duplication in the config validation implementation? I'm hoping we can avoid rewriting logic like this where possible. Can it be refactored if it is needed?

htuch · 2017-10-02T02:36:01Z

test/common/dynamo/dynamo_filter_test.cc

+          Property(&Stats::Metric::name, "prefix.dynamodb.operation.Get.upstream_rq_time_2xx"), _));
+  EXPECT_CALL(
+      stats_,
+      deliverTimingToSinks(


It might help make the review easier if you can split out the timer/histogram changes from the tag extraction. LMK if that sounds (un)reasonable (I can share some git tips for doing surgery like this quickly).

+1 to splitting that part out. Also, related to my other question as to whether we want to both with a full class hierarchy for histograms and timers.

I think it makes sense to split out the changes not directly related to tags (Metric, class hierarchy, and removing support for flushing histograms and timers without an explicit object) as the "first stage". Once that gets merged, we can review the regex and tags portion that relies on the first stage. Does that plan seem reasonable to y'all?

And in reference to your comment @mattklein123, I agree that we should condense them into a single class in the interest of code bloat. Still thinking about how to do this. Conceptually it seems like histogram is really the more general of the two types. Either way, it makes sense that a simple flag or enum could differentiate between them.

Yeah I was just thinking about this a bit and it gets ugly across a few different dimensions:

should histograms/timers be in different stat namespaces (What if you have "foo" timer and "foo" histogram).

the existence of the timespan class and allocateSpan() method

It might be better to just treat them as different things and do what you did. :(

The only other thing I can think of would be to:

Rename Timer -> Histogram

Remove Timespan class / allocateSpan() (this could be made into a helper class to snap a time and deliver it to a histogram when complete() is called).

Have some method on a Histogram which tells what the type is (time, raw value, etc.)

I kind of like ^ but it would be a bunch of work.

Yeah, I was running into a little bit of both of those. I was going in the direction of your suggestion. I think it's the cleanest implementation. It might not be too bad. Timespan actually wouldn't have to change much (or at all). However, users would just have to directly construct a TimespanImpl, which adds the dependence on stats_impl for them instead of just using the stats interface classes. Not sure if that matters too much, though.

As for the namespace issue, I think that's messier. On one hand, we won't be sparing ourselves as much code if we still have to pipe them around separately in all of the stats management classes because they're considered to be distinct. On the other hand, if they're in the same namespace, we would probably still make separate macros for them as to set the interpretation flag correctly. This could definitely cause confusion in that that they would be the only two macros that can collide, and a collision would, AFAIU, happen silently and create a totally nonsensical combined output.

I'll give it a little more thought.

However, users would just have to directly construct a TimespanImpl, which adds the dependence on stats_impl for them instead of just using the stats interface classes. Not sure if that matters too much, though.

I think you could stick this in its own file, or even put it at the interface level since it's so tiny and just depends on the interface.

re: namespacing, IMO it's fine to have them in the same namespace. I agree it's confusing. I think you could likely treat a mismatch as a fatal error as its a programming error IMO that should be caught during tests.

Sounds good.

mrice32 · 2017-10-02T23:08:08Z

Closing. Will introduce as two separate PRs as @htuch and @mattklein123 have suggested.

Tags will require all stats have non-trivial state associated with them. This enforces that all stats are represented as objects before being exported. As a part of this addition, there are a few structural changes to the way stats work in Envoy: * The Metric calss has been added as a base class for all stats objects. This will contain identifying information. * The Histogram class has replaced the Timer class. This object represents histograms of any data that can be represented by uint64_t. It also contains a method type() to identify the type of data represented (this includes Duration). * Timespan has been moved into an independent class, which will be created with a reference Histogram. * HISTOGRAM macros have been added to create Integer histogram objects. This is the first stage of what was previously #1751 with some modifications based on the discussion in that PR. Signed-off-by: Matt Rice <mattrice@google.com>

* Optimize Referenced::Signature function * Update per comment * Add a string map sub key test * rename CheckAbsenceKeys to CheckAbsentKeys

…1751) Description: enable tests ensuring that the Connection header is returned with the response header Risk Level: N/A Testing: enables tests Docs Changes: N/A Release Notes: N/A Signed-off-by: Charles Le Borgne <cleborgne@google.com> Signed-off-by: JP Simard <jp@jpsim.com>

mrice32 force-pushed the stats_tags branch 2 times, most recently from e4fac1f to a926107 Compare September 26, 2017 22:12

htuch reviewed Sep 28, 2017

View reviewed changes

mrice32 added 5 commits September 28, 2017 15:13

Add tags for stats extracted by configurable regexes.

0892986

Signed-off-by: Matt Rice <mattrice@google.com>

add comments

488bcea

Signed-off-by: Matt Rice <mattrice@google.com>

fix bug in TagExtractorImpl

f95b506

Signed-off-by: Matt Rice <mattrice@google.com>

added constness to TagExtractor

3a365dc

Signed-off-by: Matt Rice <mattrice@google.com>

return and pass const string

6491456

Signed-off-by: Matt Rice <mattrice@google.com>

mrice32 force-pushed the stats_tags branch from 94a4152 to 052a1be Compare September 28, 2017 19:13

Merge branch 'master' into stats_tags

7259085

Signed-off-by: Matt Rice <mattrice@google.com>

mrice32 force-pushed the stats_tags branch from 052a1be to 7259085 Compare September 28, 2017 19:17

fix formatting

c58a15c

Signed-off-by: Matt Rice <mattrice@google.com>

mrice32 force-pushed the stats_tags branch from a3b2c98 to c58a15c Compare September 28, 2017 20:59

htuch reviewed Sep 28, 2017

View reviewed changes

mrice32 added 2 commits September 29, 2017 12:38

responded to non-regex comments

86f0c31

Signed-off-by: Matt Rice <mattrice@google.com>

Modified regex as htuch suggested

d1a9e13

Signed-off-by: Matt Rice <mattrice@google.com>

mattklein123 reviewed Sep 29, 2017

View reviewed changes

htuch reviewed Oct 2, 2017

View reviewed changes

mrice32 closed this Oct 2, 2017

mrice32 mentioned this pull request Oct 3, 2017

Update stats interface in preparation for tags #1803

Merged

mrice32 mentioned this pull request Oct 13, 2017

Add configurable stats tags #1852

Merged

mrice32 deleted the stats_tags branch May 23, 2018 01:30

		@@ -70,6 +70,7 @@ void DynamoFilter::onEncodeComplete(const Buffer::Instance& data) {

		std::string body = buildBody(encoder_callbacks_->encodingBuffer(), data);
		if (!body.empty()) {

Add tags for stats extracted by configurable regexes. #1751

Add tags for stats extracted by configurable regexes. #1751

Uh oh!

Conversation

mrice32 commented Sep 26, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

htuch left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mrice32 Sep 28, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mrice32 Sep 28, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mrice32 Sep 28, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mattklein123 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

htuch left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mrice32 commented Sep 26, 2017 •

edited

Loading

mrice32 Sep 28, 2017 •

edited

Loading

mrice32 Sep 28, 2017 •

edited

Loading

mrice32 Sep 28, 2017 •

edited

Loading