-
Notifications
You must be signed in to change notification settings - Fork 860
Improve binary exporter perf #5630
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you post the absolute benchmark results as well? Just to know what the 10% translate into in the larger scheme of things etc.
src/Npgsql/NpgsqlBinaryExporter.cs
Outdated
@@ -127,7 +127,48 @@ async Task ReadHeader(bool async) | |||
/// The number of columns in the row. -1 if there are no further rows. | |||
/// Note: This will currently be the same value for all rows, but this may change in the future. | |||
/// </returns> | |||
public int StartRow() => StartRow(false).GetAwaiter().GetResult(); | |||
public int StartRow() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a per-row thing, so a bit less convincing than per-column in terms of needing to split sync/async for micro-optimization, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Essentially, as you can see in the results, this is anything but a micro-optimization. It's really just quite expensive to run an async method for a small bit of work. Additionally it thwarts all inlining from the JIT due to the try finally present in AsyncMethodBuilder.Start (which is i.a. responsible for getting and resetting the sync ctx, thread static ops that are also not free)
I agree the frequency is lower but IMO it's still important, we try hard in the reader to stay sync as well.
With PR:
Without:
|
Just for testing I locally rebased this PR onto #5631 (which is based on #5480 and #5476) to get the following results:
Without this PR things get really ridiculous for the exporter:
|
Out of curiosity, how many rows/columns are being exported here? Maybe post the benchmark source? (not pushing back, just interested in more details) |
Benchmark is a bit hacked but it's basically #4944 (comment) So that's 1 million rows being read The actual calls into Npgsql are mostly here https://github.com/SoftStoneDevelop/NpgsqlBinaryExporter/blob/main/Src/NpgsqlBenchmark/Extensions/Extension.cs |
Will revert the split on StartRow for now. It constitutes about 5% of the total perf improvement while it increases maintainability costs. Can always reevaluate at a later moment. |
b461159
to
ac6b38d
Compare
Backported to 8.0.3 via 66969a5 |
Fixes #4944
After this PR, sync exporting ends up being 10% faster on my machine (M1 Max) than using sync NpgsqlDataReader + SequentialAccess. Testing was done based on the project that was linked to here #4944 (comment).
PR mostly consists of splitting apart sync and async paths. Having sync column reads run at the highest perf is as important here as it is in NpgsqlDataReader with GetFieldValue.