Skip to content

Conversation

cqc-alec
Copy link
Contributor

When symengine is statically linked to a library, especially with link-time optimization, it can result in segfaults due to this issue: https://en.cppreference.com/w/cpp/language/siof

This PR tries to fix the instances I have found. It's possible I've missed some, but at least this is enough to avoid the issue in my use case involving pybind11 (527 valgrind errors from importing the python module reduced to 0).

Typically the fix is some form of the initialize-on-first-use idiom. For the constants, I introduced lambdas for this purpose to avoid having to add () to every reference to them in the code base.

@cqc-alec cqc-alec marked this pull request as ready for review September 28, 2021 13:00
@cqc-alec
Copy link
Contributor Author

Coverage test failed but this report seems OK: https://app.codecov.io/gh/symengine/symengine/compare/master...fiasco/changes
Can I ignore the failure or should I try to add some test code to make it green?

This exercises the function `_sqrt_mod_tonelli_shanks` which was not previously
tested.
@rikardn
Copy link
Contributor

rikardn commented Sep 28, 2021

Interesting! I am trying to understand some things:

  1. Will this affect performance? Will the lambda be called each time a constant is used?
  2. Why did this only affect static compilation?
  3. Could it be solved in another way. I have seen proposals to use static in functions for constants. Could some situations use constexpr?

@lkeegan
Copy link
Member

lkeegan commented Sep 28, 2021

thanks @cqc-alec !

re performance, I found ./benchmarks/parsing_google was ~2x slower with this pr, but this was just due to copies being made of the static maps, with that fixed (f850139) the parsing benchmarks are not significantly different to the master branch.

I also looked at a few other benchmarks (but not all), only /benchmarks/eval_double1 was noticeably slower (~3.8ms for this pr vs ~3.6ms for master)

@cqc-alec
Copy link
Contributor Author

Thanks @lkeegan , yes passing the maps by reference is a good idea. I wonder why eval_double got slower.
@rikardn 1. The lambda won't be called each time, just once. 2. Actually I think it would be an issue with dynamic linkage too. 3. Some cases could perhaps be solved with constexpr but only for literal types, and many of these are vectors or maps so not sure it could work (at least with pre-C++20).

@cqc-alec
Copy link
Contributor Author

I tried making init_eval_double inline -- perhaps this helps a bit with performance. (I see some variation when running the benchmark making it hard to be sure if it has really made a difference, but judging from a few runs on my laptop it is usually < 5ms after compared with usually > 5ms before.)

@cqc-alec
Copy link
Contributor Author

The test failure looks like a transient network issue.

@lkeegan
Copy link
Member

lkeegan commented Sep 29, 2021

Here are the relative changes in timings for the benchmarks/bench_eval_double benchmark for this pr vs master (negative means faster than master, positive means slower). Looking at these I would say the performance looks fine, none are significantly different from zero. (using compare.py U test with 20 repetitions)

eval_double/2_mean                                      -0.0128
eval_double/8_mean                                      -0.0048
eval_double/64_mean                                     +0.0006
eval_double/512_mean                                    +0.0069
eval_double/4096_mean                                   +0.0031
eval_double/16384_mean                                  +0.0151
eval_double_visitor_pattern/2_mean                      -0.0113
eval_double_visitor_pattern/8_mean                      -0.0200
eval_double_visitor_pattern/64_mean                     -0.0222
eval_double_visitor_pattern/512_mean                    -0.0085
eval_double_visitor_pattern/4096_mean                   -0.0190
eval_double_visitor_pattern/16384_mean                  -0.0409
eval_double_single_dispatch/2_mean                      +0.0065
eval_double_single_dispatch/8_mean                      +0.0037
eval_double_single_dispatch/64_mean                     +0.0001
eval_double_single_dispatch/512_mean                    -0.0015
eval_double_single_dispatch/4096_mean                   -0.0203
eval_double_single_dispatch/16384_mean                  -0.0249

@rikardn
Copy link
Contributor

rikardn commented Oct 1, 2021

Thanks @cqc-alec for the explanation. I have been giving this some thought and I believe your solution is the best we can get for now.

Ideally we would like to have static construction for the constants (i.e. constexpr), but it is not straight forward to make the RCP be statically constructed. I did some experimentation with smart pointers and I encountered various problems. I tested a nop destructor, but the compiler still does not want to allow constexpr (‘std::unique_ptr<int, nop>’ has a non-trivial destructor). Also having a non-standard destructor makes it into a different type than smart pointers with the same destructor so overloaded assignment operators would be needed. To sum up: it could be possible, but significant work would be needed.

@rikardn
Copy link
Contributor

rikardn commented Oct 6, 2021

Thanks again @cqc-alec

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants