-
Notifications
You must be signed in to change notification settings - Fork 2.6k
replace strings.Split(N) for strings.Cut() or alternatives #3769
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Oh! Interesting failure; either I broke something, or we previously didn't validate properly
|
if prior := r.Header.Get("X-Forwarded-For"); prior != "" { | ||
proxies := strings.Split(prior, ",") | ||
if len(proxies) > 0 { | ||
remoteAddr := strings.Trim(proxies[0], " ") | ||
if parseIP(remoteAddr) != nil { | ||
return remoteAddr | ||
} | ||
remoteAddr, _, _ := strings.Cut(prior, ",") | ||
remoteAddr = strings.Trim(remoteAddr, " ") | ||
if parseIP(remoteAddr) != nil { | ||
return remoteAddr | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My first suspect for the failure may be related to this change, and I was actually looking at that;
- Old code would see if
X-Forwarded-For
header is set (and non-empty) - But if there's no "comma" in the header, it would be just ignored
- If there was a comma, we would pick the first value, but ignore the following ones
This felt a bit odd (but maybe there was a specific reason for this); why ignore the Header if there's only 1 "X-Forwarded-For"?
edit: format is: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For
X-Forwarded-For: <client>, <proxy1>, <proxy2>
If a request goes through multiple proxies, the IP addresses of each successive proxy is listed. This means that, given well-behaved client and proxies, the rightmost IP address is the IP address of the most recent proxy and the leftmost IP address is the IP address of the originating client.
And the examples on that page (obviously, MDN is not the canonical source of truth) shows examples with only the <client>
IP, so this should be valid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if there's no "comma" in the header, it would be just ignored
Are you sure about this? Maybe I'm missing something but https://pkg.go.dev/strings#Split
if s does not contain sep and sep is not empty, Split returns a slice of length 1 whose only element is s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, 🙈 you're right; guess I had a mental "off by one" moment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the length check was at least redundant
🤦 that's not the failure; those error messages are "expected", and the test is actually passing 😂
Actual failure is this one;
|
So, the assumption there is that the official images namespace ( While that's the case currently, (Docker Hub doesn't use nested namespaces), I wonder if that assumption is correct. I don't think there's plans (currently) to do nested namespaces under I can fix the code, but perhaps I should add a |
1561720
to
4a4d112
Compare
@milosgajdos @corhere ptal (I realized this one also touches the |
- use strings.HasPrefix() to check for the prefix we're interested in instead of doing a strings.Split() without limits. This makes the code both easier to read, and prevents potential situations where we end up with a long slice. - use consts for defaults; these should never be modified, so better to use consts for them to indicate they're fixed values. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
4a4d112
to
2b34b7e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL: strings.Cut
. Never used it before, but seems using cleans up some code real nice 👍
@thaJeztah can you squash the commits? |
Yes, it's a new feature and I like it as well; cleaner (in many cases), and more performant, which is a good combination.
I tried to split the commits into logical units. The "reference" one I split separate (with in mind that we may be splitting that package to a separate repo, so would give a cleaner commit history after splitting) |
context/http.go
Outdated
} | ||
|
||
switch parts[2] { | ||
switch strings.TrimPrefix(keyStr, "http.request.") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we already know (from line 194) that keyStr
has "http.request."
as its prefix, unconditionally slicing the string by a constant would be more optimal.
switch strings.TrimPrefix(keyStr, "http.request.") { | |
switch keyStr[len("http.request."):] { |
The compiler is smart enough to evaluate len(constant)
at compile time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the thing about slicing the string
s like this always makes me feel uneasy because Go slices string
s by bytes
not by rune
s, so I'm happy with Trim
ming in this case because it avoids "bad habits"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I must admit I went for "readability" here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we can completely drop the strings.HasPrefix()
. strings.TrimPrefix()
already handles that, and if there's no match in the switch, we'll continue with the fallback.
Changing that also makes the code easier to read, and we don't have to use the ugly goto fallback
diff --git a/context/http.go b/context/http.go
index 7471e572..7fb75790 100644
--- a/context/http.go
+++ b/context/http.go
@@ -191,10 +191,6 @@ func (ctx *httpRequestContext) Value(key interface{}) interface{} {
return ctx.r
}
- if !strings.HasPrefix(keyStr, "http.request.") {
- goto fallback
- }
-
switch strings.TrimPrefix(keyStr, "http.request.") {
case "uri":
return ctx.r.RequestURI
@@ -220,10 +216,11 @@ func (ctx *httpRequestContext) Value(key interface{}) interface{} {
if ct != "" {
return ct
}
+ default:
+ // no match; fall back to standard behavior below
}
}
-fallback:
return ctx.Context.Value(key)
}
@@ -296,10 +293,6 @@ func (irw *instrumentedResponseWriter) Value(key interface{}) interface{} {
return irw
}
- if !strings.HasPrefix(keyStr, "http.response.") {
- goto fallback
- }
-
irw.mu.Lock()
defer irw.mu.Unlock()
@@ -313,9 +306,10 @@ func (irw *instrumentedResponseWriter) Value(key interface{}) interface{} {
if contentType != "" {
return contentType
}
+ default:
+ // no match; fall back to standard behavior below
}
}
-fallback:
return irw.Context.Value(key)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we can completely drop the
strings.HasPrefix()
.strings.TrimPrefix()
already handles that, and if there's no match in the switch, we'll continue with the fallback.
Not really: strings.TrimPrefix
returns the input string unchanged if there is no match, so the switch would match both keyStr == "uri"
and "http.request.uri"
if we drop the strings.HasPrefix()
check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 yup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to use the full string in the switch (as we discussed) 👍
hosts := strings.SplitN(forwardedHost, ",", 2) | ||
host = strings.TrimSpace(hosts[0]) | ||
host, _, _ = strings.Cut(forwardedHost, ",") | ||
host = strings.TrimSpace(host) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside: this should probably be changed to textproto.TrimString
at some point, and the rest of the module audited for inappropriate uses of unicode functions in HTTP-related code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good one; TIL about that function (or at least, I don't recall I've used it before). Probably needs to be checked with punicode domains etc 🤔
minorPart := strings.Split(string(version), ".")[1] | ||
_, minorPart, _ := strings.Cut(string(version), ".") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is technically a functional change: Version("").minor()
will return strconv.ErrSyntax
instead of panicking. I just want to make sure you are aware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me check on that on in context (checking out the branch locally)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, caught up; yes, indeed, previously it would panic. I'm assuming "panic" is not very desirable, so would consider it a bug fix? 😅
2b34b7e
to
7179b4e
Compare
@milosgajdos @corhere updated; PTAL (but pay attention to #3769 (comment) |
7179b4e
to
d23ed8a
Compare
Go 1.18 and up now provides a strings.Cut() which is better suited for splitting key/value pairs (and similar constructs), and performs better: ```go func BenchmarkSplit(b *testing.B) { b.ReportAllocs() data := []string{"12hello=world", "12hello=", "12=hello", "12hello"} for i := 0; i < b.N; i++ { for _, s := range data { _ = strings.SplitN(s, "=", 2)[0] } } } func BenchmarkCut(b *testing.B) { b.ReportAllocs() data := []string{"12hello=world", "12hello=", "12=hello", "12hello"} for i := 0; i < b.N; i++ { for _, s := range data { _, _, _ = strings.Cut(s, "=") } } } ``` BenchmarkSplit BenchmarkSplit-10 8244206 128.0 ns/op 128 B/op 4 allocs/op BenchmarkCut BenchmarkCut-10 54411998 21.80 ns/op 0 B/op 0 allocs/op While looking at occurrences of `strings.Split()`, I also updated some for alternatives, or added some constraints; - for cases where an specific number of items is expected, I used `strings.SplitN()` with a suitable limit. This prevents (theoretical) unlimited splits. - in some cases it we were using `strings.Split()`, but _actually_ were trying to match a prefix; for those I replaced the code to just match (and/or strip) the prefix. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Minor optimization :) Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
d23ed8a
to
842d4c0
Compare
// TODO(thaJeztah): this considers "vars.FOO" and "FOO" to be equal. | ||
// We need to check if that's intentional (could be a bug). | ||
if v, ok := ctx.vars[strings.TrimPrefix(keyStr, "vars.")]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I suspect this may be a bug as well here (noticed this after your last comment); i.e. vars.FOO
and FOO
would be considered equal. This was already in place, and I'd have to check history, so I left this as a TODO for now.
irw.mu.Lock() | ||
defer irw.mu.Unlock() | ||
return irw.status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked a bit at the code, and the Mutex appeared to be to protect the status
and written
fields, so I moved the locks to those specific cases. This allowed us to have a simple switch with all options, and prevents acquiring a lock where not needed (which could always be a risk).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@milosgajdos still LGTY? |
Go 1.18 and up now provides a strings.Cut() which is better suited for
splitting key/value pairs (and similar constructs), and performs better:
While looking at occurrences of
strings.Split()
, I also updated some for alternatives,or added some constraints;
strings.SplitN()
with a suitable limit. This prevents (theoretical) unlimited splits.
strings.Split()
, but actually were trying to matcha prefix; for those I replaced the code to just match (and/or strip) the prefix.