Collaborator

@tobert tobert commented Jan 6, 2023

#109 reports OTEL_CLI_ATTRIBUTES and OTEL_CLI_SERVICE_NAME aren't working. This PR will fix that.

tobert added 15 commits January 6, 2023 15:04
This never should have worked. Next commit will fix the span data
checker.

Working on a test for #109 it became clear there were some faults in how
checkSpanData worked. For one it wasn't validating that an expected
'is_sampled' was actually in the received span. It turns out that field
was never available for comparison. That check was added.

Also only the regex fields were being checked because of looping over
that map so I moved it up to a global map and now loop over received
fields and compare against instead of the other way around. This should
be more exhaustive.

In order to test service name setting, this needs to come through in the
tests. This patch does that.

Refactored attribute flattening code to its own function.

This test should pass once the bug is fixed.

also test multiple attributes are handled correctly

viper seems to need these. the service name envvar works now.
OTEL_CLI_ATTRIBUTES still does not work.

test fails in a more interesting way now

always check your errors, kids

The one in pflags is buried deep so we need our own to make Viper do the
right thing.

This isn't ideal since the type information is there somewhere and it
could be converted cleanly. That said, I dug through the grpc and otel
code trying to figure it out and didn't make much progress. Since this
server isn't super serious, it should be ok to clean up with string
manipulation and get all the tests to work as expected. This seems to
work for now so proper type conversion can happen another day. Sorry.

By default viper will parse JSON in envvars but not the comma-separated
k=v style that cobra/pflag does. This implements a hook to intercept
those values and decode them manually, which makes OTEL_CLI_ATTRIBUTES
finally work as advertised.

For now, let's make sure folks can build widely. Probably should make a
decision soon about minimum Go compiler though.
Contributor

@emattiza emattiza left a comment

LGTM. If roundtrip on the env vars and the trace output is good, then it's doing its thing!

@tobert tobert mentioned this pull request Jan 10, 2023

@chuson chuson left a comment

Up to you if you want to make changes where I commented, LGTM otherwise. 👍

// TODO: this isn't great, there are ways it can cause mayhem, but
// should be fine for known otlpserver use cases
val := attr.Value.String()
parts := strings.SplitN(val, "_value:", 2)

Would it be safer to split on the colon and then later truncate "_value"?

Collaborator Author

I think they're about the same. I want this to be fragile and break tests if some new representation shows up. Also, I originally used strings.Cut but switched to SplitN to defer raising the minimum Go version for otel-cli. It looked better with Cut, and I thought about changing it, but chose fewer lines of code in the end.

tobert added a commit that referenced this pull request Jan 11, 2023
It seems like some documented environment variables never worked, or haven't in a while.
When I wrote the first pass, I included Viper because it was recommended with Cobra, and
maybe a config file will be more useful than I'm imagining right now. A couple things were
broken such that config loading and environment variables via Viper were not working at
all. I think the critical ones like `OTEL_EXPORTER_OTLP_ENDPOINT` worked because the
OpenTelemetry SDK picked it up directly.

So really this is a story about testing. We weren't testing environment variable loading.
There is still work to do but now there are some tests. The test harness already had most
everything I needed to build the tests so I did that first. At least now we won't be
embarrassed again.

I'll mention the fixes to Viper for posterity as they were tricky to figure out and should
become a post of their own. The first is that viper.Unmarshal uses mapstructure under
the covers so it's the `mapstructure:"tag"` struct tag that it needs to map pflag names
to the struct fields. The second is that Viper expects JSON maps in envvars, not the
k=v,k=v format otel-cli uses elsewhere and documents. So I had to add this, which
also took a minute to figure out:
https://github.com/tobert/otel-cli/blob/109-fix-envvar-settings/otelcli/root.go#L158

After getting tests passing with Viper in a PR, I went looking at issues and read #83
and decided to take a little detour and see what removing Viper looked like. I always
wanted to try doing struct tags by hand too, so I gave that a go and that's what we're
going with for now. So this PR / squash merge is a derivative of the Viper PR in #117
that completely replaces Viper with custom code, and ends up shrinking the binary on
x86_64 by 1.2 megabytes. That was enough for me to polish up the removal and land
us here.

Normally I would avoid such a big behavioral change, knowing folks are using otel-cli
on their systems, but since the functionality has been broken so long, now's the time
to make the cut and clean up a little bit.

-Amy

Fixes: #83 #109

Commit history follows:

* is_sampled was never available in span data

This never should have worked. Next commit will fix the span data
checker.

* rework checkSpanData for correctness

Working on a test for #109 it became clear there were some faults in how
checkSpanData worked. For one it wasn't validating that an expected
'is_sampled' was actually in the received span. It turns out that field
was never available for comparison. That check was added.

Also only the regex fields were being checked because of looping over
that map so I moved it up to a global map and now loop over received
fields and compare against instead of the other way around. This should
be more exhaustive.
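
A rough sketch of the comparison direction described above, not the actual checkSpanData code. The function name `checkFields`, the field names, and the patterns are all assumptions made for illustration, and treating an unexpected received field as a problem is a guess at what "more exhaustive" means here.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// fieldPatterns is a package-level map of fields whose values change on
// every run, so they are matched by regex rather than exact equality.
// These names and patterns are illustrative assumptions.
var fieldPatterns = map[string]*regexp.Regexp{
	"trace_id": regexp.MustCompile(`^[0-9a-f]{32}$`),
	"span_id":  regexp.MustCompile(`^[0-9a-f]{16}$`),
}

// checkFields loops over the received fields and compares each one against
// the expectation, then reports any expected field that never arrived.
func checkFields(expected, received map[string]string) []string {
	var problems []string
	for name, got := range received {
		want, ok := expected[name]
		if !ok {
			problems = append(problems, fmt.Sprintf("unexpected field %q", name))
			continue
		}
		if re, isPattern := fieldPatterns[name]; isPattern {
			if !re.MatchString(got) {
				problems = append(problems, fmt.Sprintf("field %q value %q does not match pattern", name, got))
			}
			continue
		}
		if got != want {
			problems = append(problems, fmt.Sprintf("field %q: got %q, want %q", name, got, want))
		}
	}
	// An expected field such as is_sampled that is absent from the received
	// span must be reported instead of silently passing.
	for name := range expected {
		if _, ok := received[name]; !ok {
			problems = append(problems, fmt.Sprintf("missing field %q", name))
		}
	}
	sort.Strings(problems) // map iteration order is random; sort for stable output
	return problems
}

func main() {
	expected := map[string]string{"span_id": "*", "is_sampled": "true"}
	received := map[string]string{"span_id": "0123456789abcdef"}
	for _, p := range checkFields(expected, received) {
		fmt.Println(p)
	}
}
```

Looping over received data first means a field cannot escape comparison just because it is absent from the regex map.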

* add failing test for #109

* plumb service attributes into CliEvent

In order to test service name setting, this needs to come through in the
tests. This patch does that.

Refactored attribute flattening code to its own function.

* make a failing test for service name

This test should pass once the bug is fixed.

* make attr map flattening more ergonomic

* also test OTEL_RESOURCE_ATTRIBUTES

also test multiple attributes are handled correctly

* add mapstructure struct tags

viper seems to need these. the service name envvar works now.
OTEL_CLI_ATTRIBUTES still does not work.

* check error on viper.Unmarshal 🤦

test fails in a more interesting way now

always check your errors, kids

* add a comment

* add parseCkvStringMap helper & test

The one in pflags is buried deep so we need our own to make Viper do the
right thing.
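
For readers who want a picture of what such a helper does, here is a minimal sketch of a comma-separated k=v parser. Only the name parseCkvStringMap comes from the commit; the signature and body are assumptions. Unlike pflag's own parser, this version does not handle quoted values that contain commas.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCkvStringMap turns "k=v,k=v" into a map. Sketch only: the real
// helper's signature and edge-case handling may differ.
func parseCkvStringMap(in string) (map[string]string, error) {
	out := make(map[string]string)
	if strings.TrimSpace(in) == "" {
		return out, nil
	}
	for _, pair := range strings.Split(in, ",") {
		// Split on the first "=" only so values may contain "=" themselves.
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 || strings.TrimSpace(kv[0]) == "" {
			return nil, fmt.Errorf("invalid key=value pair %q", pair)
		}
		out[strings.TrimSpace(kv[0])] = kv[1]
	}
	return out, nil
}

func main() {
	attrs, err := parseCkvStringMap("env=prod,query=a=b")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(attrs["env"], attrs["query"])
}
```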

* hack: strip types off stringified span attributes

This isn't ideal since the type information is there somewhere and it
could be converted cleanly. That said, I dug through the grpc and otel
code trying to figure it out and didn't make much progress. Since this
server isn't super serious, it should be ok to clean up with string
manipulation and get all the tests to work as expected. This seems to
work for now so proper type conversion can happen another day. Sorry.
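
The string cleanup can be pictured roughly as below. This is a sketch built around the SplitN call shown in the review thread; the helper name `stripType`, the assumed input shapes such as `string_value:"hello"`, and the quote trimming are assumptions, not the code that landed.

```go
package main

import (
	"fmt"
	"strings"
)

// stripType removes a leading "<type>_value:" prefix from a stringified
// attribute value. Input shapes are assumed for illustration.
func stripType(val string) string {
	// Split at the first "_value:" only, so a value that itself contains
	// "_value:" is left intact.
	parts := strings.SplitN(val, "_value:", 2)
	if len(parts) != 2 {
		// Unknown representation: return it unchanged so that tests
		// comparing against fixtures notice the new format.
		return val
	}
	// Assumption: string values arrive wrapped in double quotes.
	return strings.Trim(parts[1], `"`)
}

func main() {
	fmt.Println(stripType(`string_value:"hello"`))
	fmt.Println(stripType("int_value:42"))
}
```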

* implement & use mapstructure decode hook

By default viper will parse JSON in envvars but not the comma-separated
k=v style that cobra/pflag does. This implements a hook to intercept
those values and decode them manually, which makes OTEL_CLI_ATTRIBUTES
finally work as advertised.

* use strings.SplitN instead of Cut for older go

For now, let's make sure folks can build widely. Probably should make a
decision soon about minimum Go compiler though.
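
As an aside, strings.Cut arrived in Go 1.18, and the same behaviour can be had on older compilers with SplitN. The `cut` helper below is a hypothetical shim for illustration, not something this PR adds, and it assumes a non-empty separator (the two functions differ when the separator is empty).

```go
package main

import (
	"fmt"
	"strings"
)

// cut mimics strings.Cut for non-empty separators using strings.SplitN,
// which is available in much older Go releases.
func cut(s, sep string) (before, after string, found bool) {
	parts := strings.SplitN(s, sep, 2)
	if len(parts) == 2 {
		return parts[0], parts[1], true
	}
	return s, "", false
}

func main() {
	before, after, found := cut("k=v=w", "=")
	fmt.Println(before, after, found)
}
```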

* replace viper with custom environment loading

We discussed this in #83. It seems to be less code overall, even before
removing the dependency.
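
To give a feel for what "struct tags by hand" can look like, here is a minimal sketch using reflect. The `Config` fields, the `env` tag key, and the `loadEnv` function are hypothetical; the real otel-cli code may use different names and support more types.

```go
package main

import (
	"fmt"
	"os"
	"reflect"
	"strings"
)

// Config is a hypothetical config struct; the "env" tag key is an assumption.
type Config struct {
	ServiceName string            `env:"OTEL_CLI_SERVICE_NAME"`
	Attributes  map[string]string `env:"OTEL_CLI_ATTRIBUTES"`
	Endpoint    string            // no tag, so never loaded from the environment
}

// loadEnv fills tagged fields of cfg from getenv. Only string and
// map[string]string fields are handled in this sketch.
func loadEnv(cfg *Config, getenv func(string) (string, bool)) error {
	v := reflect.ValueOf(cfg).Elem()
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		name, ok := t.Field(i).Tag.Lookup("env")
		if !ok {
			continue
		}
		raw, set := getenv(name)
		if !set || raw == "" {
			continue
		}
		field := v.Field(i)
		switch field.Kind() {
		case reflect.String:
			field.SetString(raw)
		case reflect.Map:
			// Decode the k=v,k=v format by hand.
			m := make(map[string]string)
			for _, pair := range strings.Split(raw, ",") {
				kv := strings.SplitN(pair, "=", 2)
				if len(kv) != 2 {
					return fmt.Errorf("%s: invalid key=value pair %q", name, pair)
				}
				m[kv[0]] = kv[1]
			}
			field.Set(reflect.ValueOf(m))
		default:
			return fmt.Errorf("%s: unsupported field kind %s", name, field.Kind())
		}
	}
	return nil
}

func main() {
	os.Setenv("OTEL_CLI_SERVICE_NAME", "my-service")
	os.Setenv("OTEL_CLI_ATTRIBUTES", "env=prod,team=infra")
	var cfg Config
	if err := loadEnv(&cfg, os.LookupEnv); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.ServiceName, cfg.Attributes["team"])
}
```

Passing the lookup function as a parameter keeps the loader testable without touching the process environment.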

* clean up TODOs, wrap errors

* make env parse failures a soft failure

I'm not super sure this is the right move but feels like it follows
policy.

* add envvar failure test, add Results.CliOutputRe

Wrote a test to verify parsing fails on an invalid environment variable
value.

Because softFail uses log.Fatal, it includes a date stamp which is the
right thing to do. To make testing easier, this called for a way to
modify the output before comparing to the fixture data, thus I added
Results.CliOutputRe which takes a regex and always deletes the match.
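
The idea can be sketched as follows. The helper name `scrubOutput`, the sample message, and the exact pattern are assumptions; the pattern targets the `YYYY/MM/DD HH:MM:SS ` prefix that Go's standard logger writes with its default flags.

```go
package main

import (
	"fmt"
	"regexp"
)

// timestampRe matches the date and time prefix of the default log format.
var timestampRe = regexp.MustCompile(`(?m)^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} `)

// scrubOutput deletes every match of re from output so that the remainder
// can be compared against fixture data. A nil regex leaves output as-is.
func scrubOutput(output string, re *regexp.Regexp) string {
	if re == nil {
		return output
	}
	return re.ReplaceAllString(output, "")
}

func main() {
	// The message text here is made up for the example.
	raw := "2023/01/11 15:04:05 something failed\n"
	fmt.Print(scrubOutput(raw, timestampRe))
}
```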

* remove accidental comment

* re-implement config file loading

It's JSON-only now, and an example is added. I'm not thrilled about the
ordering of Defaults -> CLI -> environment -> file but it works and is
consistent.
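
One way to picture layered loading with a JSON file is sketched below. It assumes the arrows describe application order with each later source overwriting earlier values, which the commit message does not spell out. The `Config` fields, JSON keys, and default values are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a hypothetical config struct with made-up JSON keys.
type Config struct {
	Endpoint    string `json:"endpoint"`
	ServiceName string `json:"service_name"`
}

// applyJSON overlays a JSON document onto cfg. json.Unmarshal only touches
// fields whose keys are present, so earlier values survive for absent keys.
func applyJSON(cfg *Config, data []byte) error {
	return json.Unmarshal(data, cfg)
}

func main() {
	// Defaults first (values invented for the example).
	cfg := Config{Endpoint: "localhost:4317", ServiceName: "otel-cli"}

	// Later layers would be applied in turn; only the file layer is shown.
	file := []byte(`{"service_name": "from-file"}`)
	if err := applyJSON(&cfg, file); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.Endpoint, cfg.ServiceName)
}
```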

* update README docs for configuration

* fix up config test, add needed infra for it

In order to fix the config test, a "*" option needed to be added to the
test harness so I did that. Cleaned up the README and other things
after getting the test right, so there are a few odds & ends in this
commit that are related.

* correct comment
Collaborator Author

tobert commented Jan 11, 2023

Thanks for the reviews. We need to get both of you on the approver list.

Superseded by #120.

@tobert tobert closed this Jan 11, 2023
@tobert tobert deleted the 109-fix-envvar-settings branch May 10, 2023 17:42