-
-
Notifications
You must be signed in to change notification settings - Fork 655
Description
A refactor in #27946 introduced "unprotected" (not surrounded by GAP_Enter
/GAP_Leave
) GAP_ValueGlobalVariable
calls. I believe this might be a GC hazard, because after updating to GAP 4.12.1 I started seeing aarch64 crashes on NixOS infrastructure such as:
#0 0x0000fffff79740e8 in wait4 ()
#1 0x0000fffff5dc6b78 in print_enhanced_backtrace ()
#2 0x0000fffff5dc8190 in sigdie ()
#3 0x0000fffff5dcb1c0 in cysigs_signal_handler ()
#4 0x0000fffff7ffb7cc in __kernel_rt_sigreturn ()
#5 0x0000ffff99a0bf28 in ConvString ()
#6 0x0000000000000000 in ?? ()
#7 0x0000000000000000 in ?? ()
#8 0x0000000000000000 in ?? ()
#9 0x0000ffff99989930 in Pr ()
#10 0x0000ffff9998aa18 in CloseOutput ()
#11 0x0000ffff99884828 in capture_stdout () at /build/sage-src-9.7/pkgs/sagemath-standard/sage/libs/gap/element.pyx:154
...
I also see cases where capture_stdout
throws errors such as sage.libs.gap.util.GAPError: Error, Length: <list> must be a list (not the integer 255)
and then crashes. Both types of errors are fixed by this ticket.
Note that I am nesting GAP_Enter
/GAP_Leave
calls because I didn't remove the preexisting calls inside capture_stdout
. That's because I feared removing the innermost calls might create a new footgun (and I believe nested GAP_Enter
/GAP_Leave
calls are explicitly supported), but removing them should cause no problem. Removing them might even be preferable for performance reasons, I don't know.
This ticket's branch is based on #34391.
(Edit: I previously thought that holding onto a char*
past the GAP_Enter
/GAP_Leave
blocks, as is done in GapElement's __str__
and _repr_
methods, could also cause GC hazards. It doesn't seem to be a problem in practice, though, and I am not qualified to tell if it's a problem in theory.)
Depends on #34391
Component: interfaces
Author: Mauricio Collares
Issue created by migration from https://trac.sagemath.org/ticket/34701
Edit: Removed branch. The code changes are now at #35114.