Refactor/micro opts #255
Conversation
faster: vector<bool> is space-efficient but not fast, since single bits need to be extracted from memory. Indexing with size_t removes the need to zero-extend before indexing.
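A minimal sketch of what such a change might look like; the container element type, function names, and call sites below are hypothetical illustrations, not vroom's actual code:

```cpp
#include <cstddef>
#include <vector>

// Before: std::vector<bool> packs flags into bits, so each access has to
// extract a single bit, and an int index must be zero/sign-extended to
// pointer width before being used on a 64-bit target.
bool any_marked_before(const std::vector<bool>& marked, int count) {
  for (int i = 0; i < count; ++i) {
    if (marked[i]) {
      return true;
    }
  }
  return false;
}

// After: one byte per flag gives plain byte loads, and std::size_t indices
// are already pointer-sized, so no extension is needed when indexing.
bool any_marked_after(const std::vector<unsigned char>& marked,
                      std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    if (marked[i] != 0) {
      return true;
    }
  }
  return false;
}
```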
speeds up constructors of all operator implementations
get_matrix is called in some pretty hot functions. Getting rid of the call overhead shaves off up to 5% of the runtime for Solomon VRPTW instances.
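A rough sketch of the kind of change this describes, assuming get_matrix returns a reference to the cost matrix; the Input/Matrix types and names here are stand-ins rather than vroom's actual API:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cost matrix with row-major storage.
struct Matrix {
  std::size_t n;
  std::vector<std::uint32_t> data;
  std::uint32_t operator()(std::size_t i, std::size_t j) const {
    return data[i * n + j];
  }
};

// Hypothetical problem input exposing the matrix through an accessor.
struct Input {
  Matrix m;
  const Matrix& get_matrix() const { return m; }
};

// Before: the accessor is called on every iteration of a hot loop, paying
// call overhead each time if it is not inlined.
std::uint64_t route_cost_before(const Input& input,
                                const std::vector<std::size_t>& route) {
  std::uint64_t cost = 0;
  for (std::size_t i = 0; i + 1 < route.size(); ++i) {
    cost += input.get_matrix()(route[i], route[i + 1]);
  }
  return cost;
}

// After: fetch a reference to the matrix once, then index it directly.
std::uint64_t route_cost_after(const Input& input,
                               const std::vector<std::size_t>& route) {
  const Matrix& m = input.get_matrix();
  std::uint64_t cost = 0;
  for (std::size_t i = 0; i + 1 < route.size(); ++i) {
    cost += m(route[i], route[i + 1]);
  }
  return cost;
}
```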
@krypt-n thanks for the PR, this looks great, especially the worst-case computing time improvement. I'll run some experiments on my side with various VRPTW/CVRP benchmarks and report. Only TSP won't be affected because it has its own logic. I'm fine with all the changes, but I'd like to understand the … Also, is there a reason for not applying the same logic to the …?
https://godbolt.org/z/LiwMb1 shows the difference between the two variants.
vroom reads the values far more often than it writes them, and as can be seen in the link above, the cast just results in a comparison with 0.
I measured with a single thread, by the way; multiple threads may show less speed gain.
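For reference, the kind of read-side comparison the godbolt link makes could look like the following pair of functions (hypothetical, in the spirit of the linked example): the vector<bool> read has to extract one bit, while the byte-backed read compiles to a single load whose conversion to bool is just a comparison with 0.

```cpp
#include <cstddef>
#include <vector>

// vector<bool>: locate the containing word, then shift and mask out one bit.
bool read_bit(const std::vector<bool>& v, std::size_t i) {
  return v[i];
}

// vector<unsigned char>: a plain byte load; converting to bool is only a
// compare-with-zero (typically test/setne on x86-64).
bool read_byte(const std::vector<unsigned char>& v, std::size_t i) {
  return v[i] != 0;
}
```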
Thanks for the explanation. I'll report when I'm able to run some tests on my side.
I did compare current master at 4486868 with this PR using 8c49bdb, running on my usual test machine with
This is great! @krypt-n I think we should mention this refactor in the changelog.
That's great! I added a changelog entry with the last commit. Apologies for the delay.
Issue
#254
Tasks
I had some old commits lying around that apply some micro-optimizations with the goal of improving the performance of vroom somewhat. I think the slight loss in maintainability and modularity is justified by the speedup gained.
Benchmark results on the Solomon VRPTW instances:
The computed solutions are exactly the same. I do expect similar computing time gains on other benchmarks, but I have not measured them yet.