-
Notifications
You must be signed in to change notification settings - Fork 829
fix: allows all non-'}' characters as format values #2065
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: allows all non-'}' characters as format values #2065
Conversation
The existing implementation used .IsPunctuation + IsLetterOrDigit with an override for '+' and ' '. That left some conspicuous holes and the spec itself is just "^\}" so basically anything goes in a format string. This allows for them and this does seem to have a reasonably large impact on the DefaultConsoleOutputTemplate parsing method. Everything else was within error bounds and no impact at all on memory allocs.
Thanks for checking this out! I think we might end up just wanting Console.WriteLine(char.IsLetterOrDigit('む'));
-> True (So previously |
That seems reasonable. Though I'm assuming control characters/non-printables/linebreaks? Should it be maybe anything not in these character classes?
EDIT: |
Yeah, I don't think it's something that requires policing by Serilog, since the rest of the message template (outside the holes) also supports arbitrary escapes/control characters. .NET accepts just about anything in format strings, and it seems like C#'s string interpolation follows suit. Console.WriteLine($"Hello {new object():\0}");
-> Hello System.Object |
Interesting - is there a place where this is doc'd in dotnet or is it just "experiment and find out" - I went hunting for what was allowed but got lazy 😄 I'll make that swap now though! |
New benches, interestingly the conversion to bytes vs as chars saves about 1/8 the comparison. Since we don't really care about char specific comparisons to Edit: After a few more attempts seems like the perf diffs with the byte cast may be smaller and this might just be noisy data - I'm not running this on a dedicated benching box I'm just running it on my desktop while I'm running other things so might not be the best dataset. That being said this does end up being about 55ns faster than baseline New benches: BenchmarkDotNet=v0.13.5, OS=Windows 11 (10.0.22631.3593)
AMD Ryzen 9 5900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK=8.0.205
[Host] : .NET 8.0.5 (8.0.524.21615), X64 RyuJIT AVX2
DefaultJob : .NET 8.0.5 (8.0.524.21615), X64 RyuJIT AVX2
BenchmarkDotNet=v0.13.5, OS=Windows 11 (10.0.22631.3593)
AMD Ryzen 9 5900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK=8.0.205
[Host] : .NET 8.0.5 (8.0.524.21615), X64 RyuJIT AVX2
DefaultJob : .NET 8.0.5 (8.0.524.21615), X64 RyuJIT AVX2
|
char.IsPunctuation(c) || | ||
c is ' ' or '+'); | ||
} | ||
static bool IsValidInFormat(char c) => c != (byte)'}'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The byte conversion difference here is just noise; both versions end up emitting the same assembly (using godbolt.org for a quick check).
I'll drop out the cast and merge this so it's all ready for v4 :-) 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wicked, on 500ish ns noise ends up so impactful :P
Hey it's me again!
The existing implementation used .IsPunctuation + IsLetterOrDigit with an override for '+' and ' '. That left some conspicuous holes and the spec itself is just "^}" so basically anything goes in a format string. This allows for them and this does seem to have a reasonably large impact on the DefaultConsoleOutputTemplate parsing method. Everything else was within error bounds and no impact at all on memory allocs.
Some examples of oddities was that
@
was allowed but$
wasn't and+
was allowed but>, <, =
weren'tfor posterity this is based on our conversation here: messagetemplates/grammar#6
BEFORE - Allocs
BEFORE - Parsing
AFTER - Allocs
AFTER - PARSING
Change in allowed printables before and after: